Residual (Statistics)
What Is a Residual in Statistics?
A residual is the difference between an observed value and the value predicted by a statistical model. In regression analysis, it shows how much the model missed for a specific observation.
Residuals matter because they reveal what the model does not explain. A model may have a strong summary statistic, such as a high R-squared, but residuals can still show bias, outliers, changing volatility, nonlinear patterns, or missing variables.
Key Takeaways
- A residual is observed value minus predicted value.
- Residuals show model errors at the observation level.
- Residual analysis helps test whether a regression model is appropriate.
- Patterns in residuals can signal omitted variables, nonlinearity, outliers, or changing variance.
- In finance, residuals are often interpreted as unexplained return or idiosyncratic movement.
The Basic Formula
A residual is commonly written as:
$$e_i = y_i - \hat{y}_i$$
In this expression, $e_i$ is the residual for observation $i$, $y_i$ is the observed value, and $\hat{y}_i$ is the model's predicted value.
For example, if a model predicts a fund return of 7% and the fund actually returns 5%, the residual is 5% - 7% = -2 percentage points. The negative sign means the model overpredicted. That gap is the part of the return this model did not explain.
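The same arithmetic can be sketched in a few lines of Python. The function name `residuals` and the return figures are illustrative assumptions, except for the first pair, which repeats the 7% predicted and 5% observed example.

```python
def residuals(observed, predicted):
    """Return observed minus predicted for each paired observation."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Hypothetical fund returns in percent. The first pair is the example
# from the text: predicted 7%, observed 5%.
observed_returns = [5.0, 8.0, 3.5]
predicted_returns = [7.0, 7.5, 3.5]

fund_residuals = residuals(observed_returns, predicted_returns)
print(fund_residuals)  # [-2.0, 0.5, 0.0]
```

A negative value means the model predicted too much, a positive value means it predicted too little, and zero means it matched the observation.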
What Residuals Can Reveal
| Residual pattern | Possible signal | Why it matters |
|---|---|---|
| Random scatter | Model may fit the basic structure reasonably. | Errors do not show an obvious pattern. |
| Curved pattern | Relationship may be nonlinear. | A linear model may be too simple. |
| Large outliers | Unusual observations or data issues. | Can distort estimates. |
| Changing spread | Variance may not be constant. | Risk estimates and standard errors may be unreliable. |
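One rough way to look for the "changing spread" pattern is to compare how widely residuals scatter at low fitted values versus high fitted values. The sketch below is a crude heuristic, not a formal statistical test. The function name `spread_ratio`, the half-and-half split, and the data are all assumptions made for illustration.

```python
from statistics import pstdev


def spread_ratio(fitted, residuals):
    """Residual spread in the upper half of fitted values divided by
    the spread in the lower half. A ratio far from 1 hints that the
    variance may not be constant."""
    if len(fitted) != len(residuals) or len(fitted) < 4:
        raise ValueError("need at least four paired observations")
    # Order residuals by their fitted value, then split in the middle.
    ordered = [r for _, r in sorted(zip(fitted, residuals))]
    mid = len(ordered) // 2
    low_spread = pstdev(ordered[:mid])
    high_spread = pstdev(ordered[mid:])
    if low_spread == 0:
        raise ValueError("lower-half residuals have zero spread")
    return high_spread / low_spread


# Hypothetical data: residuals grow as fitted values grow.
fitted_values = [1, 2, 3, 4, 5, 6, 7, 8]
example_residuals = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]

ratio = spread_ratio(fitted_values, example_residuals)
print(round(ratio, 2))
```

A residual plot remains the more informative check. A single ratio like this only flags that a closer look may be worthwhile.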
Financial Interpretation
In investing, residuals often represent the part of return not explained by the model. In a factor model, the residual is the return not explained by market exposure or selected factors. That unexplained portion may reflect security selection, company-specific news, model error, or omitted risk factors.
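As a minimal sketch of the factor-model case, the code below fits a single-factor market model by ordinary least squares and takes the residuals as the unexplained return. The single-factor setup, the function name `fit_market_model`, and the return series are assumptions for illustration. A real factor model would typically use more factors and far more data.

```python
from statistics import mean


def fit_market_model(market, fund):
    """Least-squares fit of fund = alpha + beta * market.
    Returns the pair (alpha, beta)."""
    m_bar, f_bar = mean(market), mean(fund)
    sxx = sum((m - m_bar) ** 2 for m in market)
    sxy = sum((m - m_bar) * (f - f_bar) for m, f in zip(market, fund))
    beta = sxy / sxx
    alpha = f_bar - beta * m_bar
    return alpha, beta


# Hypothetical periodic returns in percent.
market_returns = [1.0, -2.0, 3.0, 0.5, -1.5]
fund_returns = [1.4, -2.1, 3.9, 0.2, -1.4]

alpha, beta = fit_market_model(market_returns, fund_returns)

# Residual = actual fund return minus the return implied by market exposure.
model_residuals = [
    f - (alpha + beta * m) for m, f in zip(market_returns, fund_returns)
]
print([round(r, 3) for r in model_residuals])
```

With an intercept in the fit, these residuals sum to zero and are uncorrelated with the market series in-sample. That is a property of least squares, not evidence that the model is right.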
Residuals are also used in risk models, valuation models, and economic forecasting. If residuals are consistently positive or negative for certain observations, the model may be biased. If residuals become larger during stress, the model may understate risk when it matters most.
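Both checks in this paragraph can be sketched by grouping residuals and summarizing each group. A mean residual far from zero suggests bias for that group, and a larger mean absolute residual suggests bigger misses. The function name, the regime labels, and the numbers below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def residual_summary_by_group(groups, residuals):
    """Mean and mean absolute residual for each group label."""
    buckets = defaultdict(list)
    for label, r in zip(groups, residuals):
        buckets[label].append(r)
    return {
        label: {
            "mean": mean(values),
            "mean_abs": mean([abs(v) for v in values]),
        }
        for label, values in buckets.items()
    }


# Hypothetical residuals tagged by market regime.
regimes = ["calm", "calm", "stress", "stress"]
regime_residuals = [0.5, -0.5, -2.0, -4.0]

summary = residual_summary_by_group(regimes, regime_residuals)
print(summary["calm"])    # {'mean': 0.0, 'mean_abs': 0.5}
print(summary["stress"])  # {'mean': -3.0, 'mean_abs': 3.0}
```

In this made-up data the calm-period residuals average out to zero, while the stress-period residuals are both larger and consistently negative.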
Residuals can be more informative than the fitted line. A model can look acceptable on average while repeatedly missing the same type of company, time period, or market regime. Those misses are often where model risk lives.
Where Residuals Can Mislead
A residual is not automatically skill, alpha, fraud, or noise. It is simply unexplained by the model being used. A different model might explain it. A data error might create it. A structural change might make old relationships unreliable.
Residual analysis is strongest when paired with economic reasoning, diagnostics, and out-of-sample testing. It asks whether the model's errors are tolerable for the decision being made, not just whether the model looks tidy on paper.
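Out-of-sample testing can be sketched by fitting on one part of the data and measuring residuals on the part held back. The example assumes a simple one-variable least-squares line. The helper names, the split point, and the data are all hypothetical.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - b * x_bar, b


def rmse(x, y, a, b):
    """Root mean squared residual of the line a + b * x on (x, y)."""
    return sqrt(mean([(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]))


# Hypothetical data: the relationship drifts in the last three points.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 4, 6, 8, 10, 13, 15, 18]

# Fit on the first five observations and hold out the rest.
a, b = fit_line(x[:5], y[:5])
in_sample_rmse = rmse(x[:5], y[:5], a, b)
out_of_sample_rmse = rmse(x[5:], y[5:], a, b)
print(in_sample_rmse, out_of_sample_rmse)
```

Here the in-sample residuals are essentially zero while the held-out residuals are not. That is the kind of gap a tidy in-sample fit can hide.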
The Bottom Line
A residual is the gap between what happened and what a model predicted. It is one of the most useful clues for judging whether a statistical model is explaining reality or merely fitting a summary number.