Glossary term
Autoregressive Integrated Moving Average (ARIMA)
An autoregressive integrated moving average model is a time-series forecasting model that combines autoregression, differencing, and moving-average error terms.
What Is an Autoregressive Integrated Moving Average (ARIMA) Model?
An autoregressive integrated moving average model, or ARIMA model, is a time-series forecasting model that combines autoregression, differencing, and moving-average error terms. It is used to model patterns in data measured over time.
ARIMA is common in economics, finance, operations, demand forecasting, and risk analysis. It can be useful when a series has persistence, trend, or autocorrelation, but it must be specified carefully and judged on how well it forecasts data it was not fitted to.
Key Takeaways
- ARIMA models time-series data using autoregressive, integrated, and moving-average components.
- The model is commonly written as ARIMA(p, d, q).
- p represents autoregressive lags, d represents differencing, and q represents moving-average terms.
- Differencing helps bring a nonstationary series closer to stationarity, typically by removing trend, before modeling.
- ARIMA is useful for forecasting, but it is not a substitute for economic judgment or data diagnostics.
How ARIMA Works
The autoregressive component uses past values of the series to explain the current value. The integrated component differences the data, often by subtracting the prior period’s value, to reduce trend or nonstationarity. The moving-average component expresses the current value in terms of past forecast errors, so that recent shocks the autoregressive terms did not capture still inform the next prediction.
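The three components can be written together in one equation. This is the common textbook form, not notation taken from this article: the symbols are assumptions introduced here, with B as the backshift operator (B y_t = y_{t-1}), phi_i as autoregressive coefficients, theta_j as moving-average coefficients, c as a constant, and epsilon_t as the forecast error. Some references flip the sign on the theta terms.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d} \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
```

Reading left to right: (1 - B)^d differences the series d times, the first bracket applies p autoregressive lags to the differenced series, and the right-hand bracket adds q lagged error terms.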
The notation ARIMA(p, d, q) summarizes those choices. A model with p equal to 2 uses two autoregressive lags. A model with d equal to 1 differences the series once. A model with q equal to 1 includes one moving-average error term.
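As a rough illustration of what ARIMA(2, 1, 1) means in practice, the sketch below differences a series once, applies two autoregressive lags and one moving-average term to the differenced values, and then adds the result back to the last observed level. The function names and the coefficients passed in are hypothetical. A real model would estimate the coefficients from data, and the pre-sample error is simply assumed to be zero here.

```python
def difference(series, d=1):
    """Apply first differencing d times (the 'I' in ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def arima_211_one_step(series, phi1, phi2, theta1, const=0.0):
    """One-step-ahead forecast from an ARIMA(2, 1, 1) with GIVEN coefficients.

    Assumption: coefficients are supplied by the caller, not estimated,
    and the error before the first usable observation is taken as zero.
    """
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    w = difference(series, d=1)  # d = 1: model the period-to-period changes
    err = 0.0                    # assumed pre-sample forecast error
    # Walk through history to recover the most recent forecast error.
    for t in range(2, len(w)):
        pred = const + phi1 * w[t - 1] + phi2 * w[t - 2] + theta1 * err
        err = w[t] - pred
    # p = 2 lags of the differenced series, q = 1 lagged error.
    next_change = const + phi1 * w[-1] + phi2 * w[-2] + theta1 * err
    # Undo the differencing by adding the change to the last level.
    return series[-1] + next_change


levels = [10, 12, 15, 19, 24]
changes = difference(levels, d=1)        # [2, 3, 4, 5]
second_changes = difference(levels, d=2)  # [1, 1, 1]
forecast = arima_211_one_step(levels, phi1=1.0, phi2=0.0, theta1=0.0)
```

With phi1 set to 1 and the other coefficients at 0, the forecast simply repeats the last change (5) on top of the last level (24), which makes the role of each term easy to check by hand.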
Where It Shows Up
ARIMA models are used to forecast sales, inflation, interest rates, call volume, energy demand, inventory needs, web traffic, and other time-indexed series. In finance, the method may be used as a benchmark forecast or as part of a larger risk, planning, or anomaly-detection process.
The model works best when the statistical structure of the past carries into the future. It can struggle when a series is affected by regime changes, policy shocks, product launches, pandemics, accounting changes, or other structural breaks.
ARIMA Versus ARMA and ARIMAX
| Model | What changes |
|---|---|
| ARMA | Uses autoregressive and moving-average terms for a stationary series |
| ARIMA | Adds differencing to handle nonstationary behavior |
| ARIMAX | Adds external explanatory variables |
That distinction matters for business forecasting. A pure ARIMA model uses the history of the target series. If outside drivers such as prices, promotions, interest rates, or weather matter, an ARIMAX or dynamic regression approach may be more appropriate.
Model Selection
Choosing p, d, and q is not just a mechanical exercise. Analysts often inspect plots, autocorrelation, partial autocorrelation, residual behavior, and out-of-sample forecast errors. Information criteria can help compare models, but the final model should also make practical sense.
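The autocorrelation inspection mentioned above can be sketched in a few lines. This is a minimal, standard-library version of the sample autocorrelation function, with hypothetical function names. The band of plus or minus 1.96 divided by the square root of n is the usual approximate 95% reference for white noise, offered here as a rule of thumb rather than a formal test.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (mean-adjusted)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def white_noise_band(n):
    """Approximate 95% band for autocorrelations of white noise."""
    return 1.96 / math.sqrt(n)


trending = [1, 2, 3, 4, 5]
acf_values = sample_acf(trending, max_lag=2)
band = white_noise_band(len(trending))
```

A sample autocorrelation that decays slowly across many lags is often read as a hint that the series needs differencing, while a sharp cutoff after a few lags can suggest a small q. These are heuristics, and candidate models should still be compared on out-of-sample error.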
Overfitting is a common risk. A model with too many parameters can match historical noise and perform poorly on new data. A simpler model that forecasts reliably is often more useful than a complex model that only explains the past.
Forecasting Discipline
A useful ARIMA forecast should be tested against simple alternatives such as a naive forecast, seasonal naive forecast, moving average, or exponential smoothing model. If ARIMA cannot improve on a simpler benchmark, the model may be adding complexity without adding reliability.
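One way to run that comparison is to score each forecast on the same holdout period. The sketch below uses mean absolute error and made-up numbers. The `candidate_forecast` values stand in for the output of whatever ARIMA model is being evaluated and are purely hypothetical, as are the function names.

```python
def mae(actual, predicted):
    """Mean absolute error over paired observations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(history, horizon):
    """Repeat the last observed value for every future period."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the value from the same position in the last full season."""
    if len(history) < season_length:
        raise ValueError("history is shorter than one season")
    return [history[-season_length + (h % season_length)] for h in range(horizon)]


# Illustrative data: two 'years' of quarterly history, one year held out.
history = [10, 20, 30, 40, 11, 21, 31, 41]
holdout = [12, 22, 32, 42]
candidate_forecast = [12.5, 21.5, 32.5, 41.5]  # hypothetical ARIMA output

naive_mae = mae(holdout, naive_forecast(history, len(holdout)))
seasonal_mae = mae(holdout, seasonal_naive_forecast(history, len(holdout), 4))
candidate_mae = mae(holdout, candidate_forecast)

# The candidate earns its complexity only if it beats the best benchmark.
beats_benchmarks = candidate_mae < min(naive_mae, seasonal_mae)
```

In this toy data the seasonal naive forecast is already very accurate, so the bar for the candidate model is the seasonal benchmark, not the plain naive one. A single holdout window is a weak test, and repeating the comparison over several forecast origins gives a more reliable picture.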
Residual checks are also important. If the residuals still show autocorrelation, seasonality, changing variance, or obvious structure, the model has not captured key information in the series. The forecast may look precise while still missing the pattern that matters. The forecasting process also benefits from clear documentation of how missing values, outliers, and seasonal patterns were handled before the forecast was trusted.
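A common numeric check for leftover autocorrelation is the Ljung-Box Q statistic, which pools the squared residual autocorrelations over several lags. The sketch below computes only the statistic, using standard-library Python and hypothetical function names. Turning Q into a p-value requires a chi-square distribution, with degrees of freedom usually reduced by the number of fitted ARMA parameters, and that step is left out here.

```python
def residual_autocorrelations(residuals, max_lag):
    """Mean-adjusted sample autocorrelations for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("residuals are constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q = n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(residuals)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the number of residuals")
    r = residual_autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(
        r_k ** 2 / (n - k) for k, r_k in enumerate(r, start=1)
    )


# Illustrative residuals with an obvious upward drift left in them.
q_stat = ljung_box_q([1, 2, 3, 4, 5], max_lag=1)
```

A large Q relative to the chi-square reference suggests the residuals are not white noise and that the model is leaving predictable structure on the table. The statistic does not detect changing variance, so it complements a residual plot and does not replace one.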
How to Read It
ARIMA is best understood as a disciplined forecasting tool for time-ordered data, not a causal explanation by itself. It can describe momentum, persistence, and error patterns, but it does not automatically explain why the series moved. Strong use combines statistical diagnostics with business or economic context.