In the previous post, we covered the main visualization tools for time series: time plots, seasonal plots, subseries plots, scatterplots, and lag plots. This post goes one level deeper into two of them: lag plots and the ACF (Autocorrelation Function), because they carry more information than they first appear to, and because reading them correctly matters a lot once you start building models.
Starting Point: What Is a Lag Plot?
Before the ACF makes sense, the lag plot needs to make sense.
A lag plot asks a simple question: how much does a time series resemble a shifted version of itself?
To build one, you take every observation in the series and pair it with the observation that came one step before it. Then you plot those pairs: the x-axis shows the earlier value y(t), and the y-axis shows the value that came after it, y(t+1). Every dot on the chart is one such pair. A tight diagonal means the series remembers where it was. A cloud means it doesn’t.
This is Lag 1, comparing each observation to the one immediately before it. But you can do this for any lag. Lag 2 compares each observation to the one two steps back. Lag 12 compares to 12 steps back, which for monthly data means the same month last year. A tight diagonal at lag 12 is the visual signature of annual seasonality.
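To make the construction concrete, here is a minimal sketch in Python (NumPy and pandas assumed) that builds the lag pairs for a synthetic monthly series with a trend and an annual cycle, and measures how “diagonal” each lag plot would be via the correlation of the pairs. The series and its coefficients are invented for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
t = np.arange(120)  # ten years of hypothetical monthly data
# Synthetic series: upward trend + annual cycle + noise (illustration only)
y = pd.Series(0.05 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.2, 120))

def lag_pairs(series: pd.Series, lag: int) -> pd.DataFrame:
    """Pair each value y(t) with the value lag steps later, y(t+lag).
    Scatter-plotting the two columns gives the lag plot."""
    return pd.DataFrame({"y_t": series[:-lag].to_numpy(),
                         "y_t_plus_lag": series[lag:].to_numpy()})

# "How diagonal is the cloud?" = correlation between the two columns.
for lag in (1, 12):
    pairs = lag_pairs(y, lag)
    r = pairs["y_t"].corr(pairs["y_t_plus_lag"])
    print(f"lag {lag:2d}: r = {r:.2f}")
```

For the actual scatter, pandas ships a ready-made `pd.plotting.lag_plot(series, lag=12)` that draws the same pairs directly.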
The ACF: One Number Per Lag
Drawing twelve separate lag plots to spot twelve relationships is tedious. The ACF condenses all of them into a single chart.
For each lag, the ACF computes one number: the correlation coefficient. That summarises how diagonal the corresponding lag plot is. It then lines all those numbers up in a bar chart: one bar per lag. The coefficient runs from −1 (perfect negative relationship) through 0 (no relationship; cloud) to +1 (perfect positive relationship).
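As a sketch of what “one number per lag” means, the standard sample estimator can be written in a few lines of NumPy. (It uses a common mean and variance for the whole series, which is why it differs slightly from a plain Pearson correlation of the lag pairs.) The series below is synthetic, invented for illustration:

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelation function: one coefficient per lag,
    computed with a common mean and variance for the whole series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([1.0] + [np.dot(x[:-k], x[k:]) / denom
                             for k in range(1, nlags + 1)])

rng = np.random.default_rng(1)
t = np.arange(120)
# Synthetic trending, seasonal monthly series (illustration only)
y = 0.05 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.2, 120)

rho = acf(y, nlags=24)
print(f"lag 1: {rho[1]:.2f}, lag 6: {rho[6]:.2f}, lag 12: {rho[12]:.2f}")
```

In practice you would use a library implementation such as statsmodels’ `acf` and `plot_acf` rather than rolling your own; the sketch just shows what they compute.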
In the ACF of a typical monthly business series, the bars at lags 1 through 5 are high and positive, and they decline slowly: the signature of a trend. In a trending series, today resembles yesterday, yesterday resembles the day before, and the chain of similarity stretches back many steps. The peaks at lags 12 and 24 are the signature of annual seasonality: this month correlates with the same month one and two years ago. Together, slow decline and periodic bumps produce what’s sometimes called a “scalloped” shape, the hallmark of a series with both trend and seasonality.
The Confidence Bands
The blue shaded region on an ACF plot is the confidence band. Bars inside it are likely noise. Bars outside it are likely a real signal. But where does that threshold come from?
Even if a series is completely random, chance alone will produce small non-zero autocorrelation values when computed from finite data. The question is: how large can a chance fluctuation get before it’s no longer plausible as mere chance?
When you measure the autocorrelation of a truly random series, the results follow a normal distribution centred on zero, and 95% of values fall within ±1.96 standard deviations. That’s where 1.96 comes from: it’s the cutoff for 95% confidence. The standard deviation of an autocorrelation estimate from n observations is approximately 1/√n: the more data you have, the smaller the uncertainty. Putting them together, ±1.96/√n is the range within which 95% of autocorrelation measurements would fall if the series were truly random. For n = 100, that’s roughly ±0.20; for n = 400, it narrows to ±0.10.
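The arithmetic is a one-liner; a tiny sketch, with the function name chosen for illustration:

```python
import math

def acf_band(n: int, z: float = 1.96) -> float:
    """Half-width of the 95% ACF confidence band for n observations."""
    return z / math.sqrt(n)

print(round(acf_band(100), 2))  # 0.2
print(round(acf_band(400), 2))  # 0.1
```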
What a bar outside the band means: the probability of seeing a correlation that large by chance alone is less than 5%. There is likely a real pattern here.
One practical nuance: if you plot 40 lags, you’d expect about 2 of them to fall outside the bands purely by chance (5% × 40 = 2). One bar barely crossing the line is less convincing than several bars clearly and consistently outside it.
Reading the ACF: A Pattern Guide
Trend only: The ACF starts high and declines slowly; bars stay positive across many lags. The series is long-memoried: even observations far apart in time are still related, because the trend carries them in the same direction.
Seasonality only: Clear peaks at lags 12, 24, and 36; lags in between are close to zero. The series doesn’t remember its recent past in general, but it does remember the same point in the previous cycle.
Trend + seasonality: Slowly declining bars overall, with bumps at the seasonal lags. This is the scalloped shape, the most common pattern in real business and economic data.
White noise: All bars inside the confidence band. No structure, no memory. There is nothing for a model to learn.
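The white-noise case is easy to simulate, and the simulation also illustrates the false-positive arithmetic above: even with no structure at all, a bar or two out of 40 can poke outside the band by chance. A sketch with arbitrary seed and sample size:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=400)            # pure white noise: no memory at all
x = x - x.mean()
denom = np.dot(x, x)
rho = np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, 41)])

band = 1.96 / np.sqrt(len(x))       # about 0.098 for n = 400
outside = int(np.sum(np.abs(rho) > band))
print(f"{outside} of 40 bars outside the band")  # expect roughly 2 by chance
```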
What Individual Lags Tell You
Beyond the big picture patterns, individual lag values carry specific interpretations:
Lag 1 outside the band: Consecutive observations are related. Yesterday’s value is a meaningful predictor of today’s. This could reflect a trend, a slow-moving process, or short-term momentum.
Lags 1 and 2 both outside, then inside from lag 3: The series has a short memory; only the last two steps matter. Lags further back don’t add information.
Lags 1, 2, 3… all outside, slowly declining: Long memory. The influence of the past fades slowly, which is characteristic of a strong trend.
Only lag 12 outside (monthly data): Pure seasonality. The same month last year is a good predictor of this month, but the months in between add nothing.
Lag 1 and lag 12 both outside: Both short-term dependency and annual seasonality are present. A good model needs to account for both.
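The “short memory” case is easy to manufacture: an MA(2) process, where each value mixes only the current random shock with the previous two, has a theoretical ACF of exactly zero beyond lag 2. The coefficients below are arbitrary, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
e = rng.normal(size=2000)
# MA(2): each value depends on the current shock and the two before it,
# so the theoretical ACF is zero beyond lag 2 (coefficients are arbitrary).
y = e[2:] + 0.6 * e[1:-1] + 0.3 * e[:-2]

y = y - y.mean()
denom = np.dot(y, y)
rho = [np.dot(y[:-k], y[k:]) / denom for k in range(1, 6)]
band = 1.96 / np.sqrt(len(y))
for k, r in enumerate(rho, start=1):
    status = "outside" if abs(r) > band else "inside"
    print(f"lag {k}: {r:+.2f}  ({status} the band)")
```

Lags 1 and 2 land clearly outside the band; the later lags should hover near zero, apart from the occasional chance excursion.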
The Residuals ACF: An X-Ray for Your Model
Once you’ve built a forecasting model, the ACF has a second important job: diagnosing whether the model actually worked.
After fitting a model, you compute the residuals: the gap between what the model predicted and what actually happened. Ideally, residuals are white noise: the model has extracted everything predictable, and what’s left is genuinely unknowable randomness. If there’s still structure in the residuals, the model missed something, and the residuals ACF tells you exactly what.
Lags 1–2 broken means short-term dependency wasn’t captured. Lag 12 broken means seasonality wasn’t captured. All clear means the residuals are what they should be: unpredictable noise.
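A sketch of that diagnostic on a synthetic series, comparing two deliberately different “models”: predicting the overall mean (which leaves all the structure in the residuals) versus an OLS fit on a trend term and an annual harmonic (which should soak most of it up). All names, coefficients, and the modelling choice itself are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.arange(240)
# Synthetic monthly series: trend + annual cycle + noise
y = 0.02 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.3, 240)

def resid_acf(res, lags=(1, 12)):
    """Residual autocorrelation at the lags we care about."""
    res = res - res.mean()
    denom = np.dot(res, res)
    return {k: np.dot(res[:-k], res[k:]) / denom for k in lags}

def band(n):
    return 1.96 / np.sqrt(n)

# "Model" A: predict the overall mean -> residuals keep all the structure.
res_mean = y - y.mean()

# Model B: OLS on intercept, trend, and the annual harmonic.
X = np.column_stack([np.ones_like(t, dtype=float), t,
                     np.sin(2 * np.pi * t / 12),
                     np.cos(2 * np.pi * t / 12)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
res_fit = y - X @ beta

diag_mean = resid_acf(res_mean)
diag_fit = resid_acf(res_fit)
print("mean model  :", diag_mean, "band:", round(band(len(y)), 3))
print("harmonic fit:", diag_fit)
```

For the mean “model”, lag 12 of the residual ACF sits far outside the band (seasonality was missed); for the harmonic fit, both lags should drop toward zero.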
The ACF is one of those tools that looks like a technicality until you understand what it’s actually doing. It’s not a black box; it’s a direct visual summary of the question “how much does this series remember about itself, and at what timescales?” Once that question is clear, the chart reads itself.