When we look at a time series, the raw data often hides the underlying pattern. Short-term fluctuations make it hard to see whether the series is actually rising, falling, or simply noisy.
A moving average is one of the simplest tools we have to reveal that structure. By averaging nearby observations, it smooths out random variation and makes the long-run movement of the series easier to see.
If you’ve seen moving averages introduced as a simple forecasting tool (trailing averages, weighted averages, window-size trade-offs), this earlier post covers that angle. Here the focus shifts: we’re using moving averages not to forecast directly, but to extract the trend-cycle component from a series before decomposition.
The Basic Idea
A moving average of order m is written:

$$\hat{T}_t = \frac{1}{m} \sum_{j=-k}^{k} y_{t+j}$$

where m = 2k + 1. At each time point t, you average the k observations before it, the observation itself, and the k observations after it. The result is the estimated trend-cycle at that point.
For example, with a 5-MA (m = 5, k = 2), the smoothed value at time t is the average of $y_{t-2}, y_{t-1}, y_t, y_{t+1}, y_{t+2}$.
The intuition is simple: nearby observations in time tend to have similar values. Averaging them cancels out short-term fluctuations, leaving the slower-moving trend. The larger the window, the more aggressively noise gets suppressed, but also the more the curve lags, and the more end-points get lost (you need k observations on each side to compute the average).
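To make the definition concrete, here is a minimal sketch of a centred moving average of odd order in plain Python. The function name `centred_ma` and the sample values are assumptions made for illustration, not anything from a particular library.

```python
def centred_ma(y, m):
    """Centred moving average of odd order m.

    Returns a list the same length as y, with None where the
    window would run past either end of the series.
    """
    if m < 1 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    k = (m - 1) // 2
    out = [None] * len(y)
    for t in range(k, len(y) - k):
        # k observations before t, the observation at t, k after t.
        out[t] = sum(y[t - k : t + k + 1]) / m
    return out


y = [2.0, 4.0, 6.0, 5.0, 3.0, 7.0, 9.0]  # made-up values
smoothed = centred_ma(y, 5)
print(smoothed)  # [None, None, 4.0, 5.0, 6.0, None, None]
```

The two `None` values at each end are the k = 2 end-points that a 5-MA cannot estimate.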
Order Matters: Odd vs Even
Simple moving averages are almost always of odd order: 3, 5, 7, 9. The reason is symmetry. With an odd window of size m = 2k + 1, the central observation sits exactly in the middle, with k points on each side. The average is centred on t.
If you used an even window, say m = 4, there’s no natural centre. A 4-MA over $y_{t-1}, y_t, y_{t+1}, y_{t+2}$ is centred between t and t+1, not on either of them. The result is a smoothed value that doesn’t align cleanly with the original time axis. That asymmetry causes problems when you try to subtract the trend from the original data.
Moving Averages of Moving Averages
What if your seasonal period is even, as with quarterly data (period 4) or monthly data (period 12)?
The solution is to apply two moving averages in sequence. Take a 4-MA first, then apply a 2-MA to the result. This is called a 2×4-MA.
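Why does this work? Write out the two adjacent 4-MAs and average them:

$$\hat{T}_t = \frac{1}{2}\left[\frac{1}{4}(y_{t-2} + y_{t-1} + y_t + y_{t+1}) + \frac{1}{4}(y_{t-1} + y_t + y_{t+1} + y_{t+2})\right] = \frac{1}{8}y_{t-2} + \frac{1}{4}y_{t-1} + \frac{1}{4}y_t + \frac{1}{4}y_{t+1} + \frac{1}{8}y_{t+2}$$

The result is a weighted average, symmetric around t, where the central observations carry more weight than the edge ones. Symmetry is restored, and the estimate is properly centred.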
The general rule: to make a centred moving average out of an even-order MA, follow it with a 2-MA. The combination is written 2×m-MA.
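The two-pass construction can be sketched in a few lines of plain Python. The function name `two_by_m_ma` and the data are assumptions made for illustration. For m = 4, expanding the two averages by hand gives weights 1/8, 1/4, 1/4, 1/4, 1/8, so the sketch checks one point against those weights.

```python
def two_by_m_ma(y, m):
    """2×m-MA for even m: an m-MA followed by a 2-MA, aligned to t."""
    if m < 2 or m % 2 != 0:
        raise ValueError("m must be a positive even integer")
    k = m // 2
    # First pass: one plain average per full window of m observations.
    first = [sum(y[s : s + m]) / m for s in range(len(y) - m + 1)]
    out = [None] * len(y)
    # Second pass: average adjacent windows. Together they span
    # y[s] .. y[s + m], whose midpoint is index s + k.
    for s in range(len(first) - 1):
        out[s + k] = (first[s] + first[s + 1]) / 2
    return out


y = [4.0, 8.0, 6.0, 2.0, 10.0, 12.0, 8.0, 6.0]  # made-up quarterly values
two_pass = two_by_m_ma(y, 4)

# The same estimate at t = 2 from the 1/8, 1/4, 1/4, 1/4, 1/8 weights.
by_weights = y[0] / 8 + (y[1] + y[2] + y[3]) / 4 + y[4] / 8

print(two_pass[2], by_weights)  # 5.75 5.75
```

Both routes give the same number, which is the point: the second 2-MA pass is what moves the estimate from between two time points onto t itself.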
Why Seasonal Period Determines Window Size
When you compute a moving average to estimate the trend, you want to average out the seasonality completely. If your window covers exactly one full seasonal cycle, and all seasons appear equally, the seasonal effects cancel and what remains is the trend-cycle (plus any noise the averaging has not removed).
For monthly data with annual seasonality (period 12), a 2×12-MA achieves this. Each month of the year gets equal weight: the first and last terms (which are the same month in adjacent years) each get weight $\tfrac{1}{24}$, and every other month gets $\tfrac{1}{12}$, so the doubled-up month also totals $\tfrac{1}{12}$. Over a full year, all seasonal variation is averaged out.
For quarterly data (period 4), a 2×4-MA does the same thing: each quarter gets equal weight across the window.
If the window does not match the seasonal cycle, the average will mix seasons unevenly. For example, an 11-month moving average on monthly data always leaves out one calendar month, and the month it leaves out changes as the window slides: some windows contain a December and some do not. The seasonal effects will therefore not cancel, and the smoothed series will still contain seasonal patterns.
The rule: use a 2×m-MA where m is the seasonal period (for even periods). For odd periods, a simple m-MA works directly.
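A small experiment makes the cancellation visible. The sketch below builds a synthetic monthly series (a linear trend plus a seasonal pattern that sums to zero over the year; both are assumptions made up for this illustration), then compares a 2×12-MA against an 11-MA by measuring how far each estimate is from the known trend.

```python
def weighted_ma(y, weights):
    """Centred weighted average; None where the window does not fit."""
    k = (len(weights) - 1) // 2
    out = [None] * len(y)
    for t in range(k, len(y) - k):
        out[t] = sum(w * y[t - k + i] for i, w in enumerate(weights))
    return out


def max_error(estimate, truth):
    """Largest absolute gap between an estimate and the known trend."""
    return max(abs(e - truth[t]) for t, e in enumerate(estimate) if e is not None)


# Synthetic data: linear trend plus a monthly pattern summing to zero.
seasonal = [-3.0, -2.0, 0.0, 1.0, 2.0, 4.0, 5.0, 3.0, 0.0, -2.0, -3.0, -5.0]
trend = [100 + 0.5 * t for t in range(48)]
y = [trend[t] + seasonal[t % 12] for t in range(48)]

w_2x12 = [1 / 24] + [1 / 12] * 11 + [1 / 24]  # 13 weights, one full year
w_11 = [1 / 11] * 11                           # one month short of a year

err_2x12 = max_error(weighted_ma(y, w_2x12), trend)
err_11 = max_error(weighted_ma(y, w_11), trend)
print(err_2x12, err_11)
```

On this data the 2×12-MA error should be zero up to floating-point rounding, while the 11-MA error should be roughly 0.45, which is the seasonal effect of whichever month the window happens to leave out, divided by 11. Real series have noise and non-linear trends, so the recovery will not be this clean.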
Weighted Moving Averages
The 2×m-MA above is a special case of a weighted moving average:

$$\hat{T}_t = \sum_{j=-k}^{k} a_j \, y_{t+j}$$

where the weights $a_j$ need not all be equal. Two conditions must hold: the weights must sum to 1 (so the average is on the right scale), and they must be symmetric ($a_j = a_{-j}$, so the estimate is centred).
The simple m-MA is the special case where all weights are $1/m$. The 2×4-MA uses weights $\left[\tfrac{1}{8}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{8}\right]$.
The advantage of unequal weights is smoothness. In a simple moving average, observations enter and leave the window abruptly. A weighted moving average softens this transition by gradually reducing the importance of observations as they move away from the centre.
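The two conditions on the weights are easy to check in code. This is a minimal sketch: the names `check_weights` and `weighted_ma` are hypothetical, and it uses the standard-library `Fraction` type so that "sums to 1" can be tested exactly rather than up to floating-point error.

```python
from fractions import Fraction


def check_weights(weights):
    """True if the weights sum to 1 and are symmetric about the centre."""
    return sum(weights) == 1 and list(weights) == list(reversed(weights))


def weighted_ma(y, weights):
    """Centred weighted moving average; None where the window does not fit."""
    if len(weights) % 2 == 0 or not check_weights(weights):
        raise ValueError("need an odd number of symmetric weights summing to 1")
    k = (len(weights) - 1) // 2
    out = [None] * len(y)
    for t in range(k, len(y) - k):
        out[t] = sum(w * y[t - k + i] for i, w in enumerate(weights))
    return out


y = [4, 8, 6, 2, 10, 12, 8, 6]  # made-up values

simple_5 = [Fraction(1, 5)] * 5  # simple 5-MA: all weights 1/m
w_2x4 = [Fraction(1, 8), Fraction(1, 4), Fraction(1, 4),
         Fraction(1, 4), Fraction(1, 8)]  # 2×4-MA written as weights

print(weighted_ma(y, simple_5)[2])  # 6
print(weighted_ma(y, w_2x4)[2])     # 23/4
```

Writing both smoothers as weight vectors makes the comparison direct: the simple 5-MA gives every observation in the window the same say, while the 2×4-MA halves the influence of the two edge observations.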
What Moving Averages Can’t Do
A moving average is a powerful smoother, but it has real limitations:
End-point problem: You lose k observations at each end. For a 2×12-MA, that’s six months at the start and six at the end, so potentially a year of trend estimates gone. This is why more sophisticated methods (like STL) are preferred for real applications.
Lag: A moving average reacts slowly to genuine changes in direction. If the series takes a sharp turn, the MA will still be averaging across the old trend for several periods. It tells you where the series was, not necessarily where it is.
Fixed window: A simple MA treats all observations in the window equally regardless of how far they are from the centre. More sophisticated methods give more weight to closer observations.
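The first two limitations show up clearly on a series with a sudden level shift. The sketch below uses a made-up step series and a simple 5-MA; the function name `centred_ma` is an assumption for illustration.

```python
def centred_ma(y, m):
    """Centred moving average of odd order m; None at the lost end-points."""
    k = (m - 1) // 2
    out = [None] * len(y)
    for t in range(k, len(y) - k):
        out[t] = sum(y[t - k : t + k + 1]) / m
    return out


# Made-up series: flat at 0 for ten periods, then an abrupt jump to 10.
y = [0.0] * 10 + [10.0] * 10
smoothed = centred_ma(y, 5)

print(smoothed[7:13])        # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(smoothed.count(None))  # 4
```

The jump happens in a single step in the data, but the smoothed series spreads it across several periods, and the four `None` values are the two estimates lost at each end.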
Moving averages are the foundation of classical decomposition, which we cover in the next post. Understanding them here means the mechanics of decomposition will make immediate sense, because classical decomposition is essentially this: extract trend with a moving average, then work out seasonality from what’s left.
Hyndman, R.J. & Athanasopoulos, G. (2021). Forecasting: Principles and Practice, 3rd ed., Chapter 3.1. OTexts. https://otexts.com/fpp3/
