Scatterplot

A scatterplot (scatter plot, scatter chart, or scatter graph) is a two-dimensional data visualization using dots to represent two variables—one on the x-axis, one on the y-axis—to reveal relationships like linear, parabolic, or hyperbolic trends.

Quick details:

What:

Discover Change, Distribution

Why:

Determine whether two variables are related, as a first check for possible cause and effect, and predict future trends.

History of Scatterplot

Statistician Edward Tufte estimates that scatterplots make up over 70% of the charts in scientific publications. The first scatterplot is credited to John Frederick W. Herschel, who in 1833 plotted the position angle of a double star against the year of measurement in order to uncover the orbital relationship between the stars, not merely a trend over time.

Herschel’s data on the orbits of γ Virginis, together with his eye-smoothed, interpolated curve (solid line, hollow circles) and a less-smoothed curve (gray, dashed). The circle around each data point is sized in proportion to the weight of that observation.

When to Use a Scatterplot?

1. To discover trends in data distribution

Scatterplots show how data points are spread and where they trend. Points that cluster tightly around a line or curve indicate a strong relationship, and a regression line makes that relationship easier to see. Color, shape, or size can encode a third variable.

Two scatterplots: The Atlantic Cities (2012) plots each city’s “Metro Health Index” (a factor measuring the share of people who smoke or are obese) against the city’s median income.

2. For inferential statistics and trend prediction

Validate hypotheses by fitting a regression line or curve (linear, quadratic, etc.) and quantifying how well it fits. The fitted model enables interpolation (predicting within the range of the data) and extrapolation (predicting beyond it).
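To make the idea of fitting and predicting concrete, here is a minimal sketch of an ordinary least-squares line fit in plain Python. The data values and the helper names (fit_line, r_squared, predict) are illustrative assumptions, not taken from any source cited in this article.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys that the fitted line explains."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical paired observations (made up for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
fit = r_squared(xs, ys, slope, intercept)


def predict(x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


inside = predict(2.5)  # interpolation: 2.5 lies within the observed x range
outside = predict(8)   # extrapolation: 8 lies beyond the observed x range

print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={fit:.3f}")
print(f"interpolated y(2.5)={inside:.3f}, extrapolated y(8)={outside:.2f}")
```

Extrapolated values deserve more caution than interpolated ones, because nothing in the data confirms that the trend continues outside the observed range.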

Scatterplots of reported versus extrapolated annual number of drinks from (a), 1-month (b), 3-month

3. To measure correlation strength

Calculate a correlation coefficient to quantify the relationship: positive (both variables increase together), negative (one increases as the other decreases), and high (the points fit tightly to a line or curve). The closer the points cluster around the best-fit line, the stronger the relationship.
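As an illustration, the sketch below computes Pearson's correlation coefficient, one common choice of coefficient, in plain Python. Pearson's r only measures linear association, so a tight fit to a curve would need a different measure. The function name pearson_r and all data values are assumptions made up for this example.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data sets (made up for illustration).
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # increases with hours: perfect positive correlation
falling = [10, 8, 6, 4, 2]  # decreases with hours: perfect negative correlation
noisy = [3, 1, 4, 1, 5]     # loosely related: weak correlation

r_pos = pearson_r(hours, rising)
r_neg = pearson_r(hours, falling)
r_weak = pearson_r(hours, noisy)

print(f"positive: {r_pos:.2f}, negative: {r_neg:.2f}, weak: {r_weak:.2f}")
```

Values near +1 or -1 correspond to points that hug the best-fit line, while values near 0 correspond to a diffuse cloud.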

Scatterplot Variations: Visualizing 5 correlation types.
Regression coefficients expressing the different types of correlation found in a scatterplot

Types of Scatterplots

1. Bubble Chart

Adds a third dimension via bubble size/area for multivariate analysis.
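A short sketch of one way to size the bubbles, assuming the third variable is encoded as bubble area, not radius: each radius then scales with the square root of its value. The function name bubble_radii, the max_radius parameter, and the population figures are hypothetical.

```python
from math import pi, sqrt


def bubble_radii(values, max_radius=20.0):
    """Map values to radii so that bubble AREA is proportional to value.

    The largest value gets max_radius; the others scale with the square
    root of their share of the largest value.
    """
    largest = max(values)
    return [max_radius * sqrt(v / largest) for v in values]


# Hypothetical third variable (made up for illustration).
populations = [1_000_000, 4_000_000, 9_000_000]

radii = bubble_radii(populations)
areas = [pi * r ** 2 for r in radii]

for population, radius, area in zip(populations, radii, areas):
    print(f"population={population:>9,} radius={radius:5.2f} area={area:8.2f}")
```

Scaling the radius directly would make a value four times larger look sixteen times larger by area, which is why the square root is used here.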

2. Rug Plot

One-dimensional marks along an axis to show univariate distribution—like a histogram with zero-width bins.

3. Line Chart

Connects markers with lines to show sequential trends, which distinguishes it from a scatterplot, where the points are left unconnected.

When Not to Use a Scatterplot?

1. With non-numeric or categorical data

Use bar graphs for categories (e.g., departments vs. revenue) or line charts for ordinal data (e.g., scores). A scatterplot requires paired variables that are both continuous and measured on an interval scale.

2. To show rate of change between points

Line graphs better convey slopes between sequential points; scatterplots emphasize overall trends without direct connections.
