Chi-Square Statistic

Updated: May 25, 2026


What Is a Chi-Square Statistic?

A chi-square statistic measures how far observed counts differ from expected counts in categorical data. It is commonly used in chi-square tests for goodness of fit, independence, and association.

In finance and business analysis, chi-square testing can help evaluate whether observed patterns differ from what a model, historical distribution, or independence assumption would suggest. It is a statistical tool for counts and categories, not a measure of dollar value by itself.

Key Takeaways

  • A chi-square statistic compares observed counts with expected counts.
  • It is used with categorical data.
  • Larger values generally indicate larger differences between observed and expected counts.
  • The statistic is interpreted with degrees of freedom and a chi-square distribution.
  • It can identify unusual patterns, but it does not explain cause by itself.

Formula

A common chi-square statistic is:

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$$

In this expression, O_i is the observed count in category i and E_i is the expected count in that category. For each category, the squared difference between the observed and expected counts is divided by the expected count, and the resulting terms are summed across all categories.

If observed counts match expected counts closely, the statistic is small. If they differ sharply, the statistic grows.
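As a minimal sketch of the formula, the calculation can be written in a few lines of Python. The function name `chi_square_statistic` and the counts below are hypothetical and chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustrative only).
observed = [50, 30, 20]
expected = [40, 40, 20]

statistic = chi_square_statistic(observed, expected)
print(statistic)  # 100/40 + 100/40 + 0/20 = 5.0
```

The third category contributes nothing because its observed and expected counts agree. The whole statistic comes from the two categories that deviate.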

Where It Shows Up

A chi-square test can be used to test whether customer choices follow an expected distribution, whether default counts differ by borrower category, whether survey responses are independent of income group, or whether fraud flags are unusually concentrated in one channel.

In portfolio research, analysts may use categorical tests to examine whether events are evenly distributed across regimes, ratings, sectors, or time buckets. The test is not limited to finance, but it can be useful wherever data is grouped into categories.

Goodness of Fit Versus Independence

| Use | Question |
| --- | --- |
| Goodness of fit | Do observed counts match an expected distribution? |
| Independence | Are two categorical variables related? |
| Homogeneity | Do different groups share the same distribution? |

The formula is the same across these uses, but the way expected counts are derived and the degrees of freedom differ. Analysts should define the hypothesis before looking at results.
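The difference in setup can be sketched in code. In an independence test, the expected counts are usually derived from the table's own row and column totals, and the degrees of freedom are (rows - 1) × (columns - 1). In a goodness-of-fit test with a fully specified expected distribution, the degrees of freedom are the number of categories minus one. The function names and the 2 × 2 table below are hypothetical.

```python
def expected_under_independence(table):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    expected = expected_under_independence(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def goodness_of_fit_dof(num_categories):
    """Degrees of freedom when the expected distribution is fixed in advance."""
    return num_categories - 1


# Hypothetical table: rows are two income groups, columns are two survey answers.
table = [[30, 10], [20, 40]]
statistic, dof = chi_square_independence(table)
print(round(statistic, 3), dof)
```

Here the expected counts are not supplied by the analyst. They follow from the assumption that row and column membership are unrelated.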

Interpreting the Result

The result is strongest when the expected counts come from a clear model or defensible benchmark. If the expected distribution is arbitrary, the calculation may look precise while testing a weak assumption. In business settings, analysts should be able to explain where the expected values came from before relying on the statistic.

The chi-square statistic is usually compared with a chi-square distribution that has the appropriate degrees of freedom. The resulting p-value is the probability of seeing a statistic at least as large as the one observed if the baseline hypothesis were true.
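To make the comparison concrete, the sketch below computes the upper-tail probability of a chi-square distribution for whole-number degrees of freedom, using only Python's standard `math` module. The function name `chi_square_p_value` is hypothetical. In practice, a statistics library such as SciPy (`scipy.stats.chi2.sf`) would normally be used instead.

```python
import math


def chi_square_p_value(statistic, dof):
    """Upper-tail probability P(X >= statistic) for a chi-square variable
    with a positive integer number of degrees of freedom."""
    if dof < 1 or int(dof) != dof:
        raise ValueError("dof must be a positive integer")
    if statistic <= 0:
        return 1.0
    half = statistic / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!
        term = 1.0
        total = 1.0
        for j in range(1, dof // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd dof: start from the dof = 1 tail, then add one term per extra 2 dof.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)  # (x/2)^(1/2) / Gamma(3/2)
    total = 0.0
    for j in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= half / (j + 0.5)
    return tail + math.exp(-half) * total


# Standard 5% critical values for 1, 2, and 3 degrees of freedom.
for stat, dof in [(3.841, 1), (5.991, 2), (7.815, 3)]:
    print(dof, round(chi_square_p_value(stat, dof), 3))
```

The same statistic leads to different p-values under different degrees of freedom, which is why the statistic alone is not enough to judge significance.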

A statistically significant result means the observed pattern is unlikely under the stated assumption. It does not prove the reason for the pattern. Business context, data quality, sample design, and model assumptions still matter.

Practical Limits

Chi-square tests need count data and adequate expected counts. Very small expected counts can make the test unreliable. The observations should also be independent under the usual assumptions.
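A simple pre-check can flag categories whose expected counts may be too small. The sketch below assumes a minimum of 5, a widely cited rule of thumb rather than a threshold stated in this article. The helper name `small_expected_cells` and the counts are hypothetical.

```python
def small_expected_cells(expected, minimum=5.0):
    """Return (index, expected count) pairs that fall below the minimum."""
    return [(i, e) for i, e in enumerate(expected) if e < minimum]


# Hypothetical expected counts for five categories.
expected = [42.0, 31.5, 18.0, 6.2, 2.3]

flagged = small_expected_cells(expected)
print(flagged)  # [(4, 2.3)]
```

When cells are flagged, common responses include merging sparse categories or using an exact test, depending on the question being asked.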

The test can also detect differences that are statistically significant but economically small. A large dataset can make tiny deviations look important. Analysts should ask both statistical and financial questions: is the pattern real, and does it matter?
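The sample-size effect can be illustrated with hypothetical counts. In the sketch below, the observed split is 51/49 against an expected 50/50 in both cases, and only the sample size changes. The effect-size measure used here, Cohen's w (the square root of the statistic divided by the sample size), is one common choice and is not taken from this article.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def cohens_w(statistic, n):
    """Effect size that does not grow with sample size."""
    return math.sqrt(statistic / n)


CRITICAL_5PCT_DOF1 = 3.841  # standard 5% critical value for 1 degree of freedom

# Same 51% / 49% split against a 50% / 50% baseline, at two sample sizes.
small_stat = chi_square_statistic([510, 490], [500, 500])
large_stat = chi_square_statistic([51_000, 49_000], [50_000, 50_000])

small_w = cohens_w(small_stat, 1_000)
large_w = cohens_w(large_stat, 100_000)

print(small_stat > CRITICAL_5PCT_DOF1, round(small_w, 3))
print(large_stat > CRITICAL_5PCT_DOF1, round(large_w, 3))
```

The larger sample crosses the 5% threshold while the smaller one does not, even though the size of the deviation, as measured by the effect size, is identical.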

Example in Business Analysis

Suppose a lender expects defaults to be evenly distributed across four borrower segments, but the observed defaults are heavily concentrated in one segment. A chi-square test can help determine whether that concentration is unusually large relative to the expected pattern. The next step would still be economic investigation: underwriting, borrower mix, data errors, or changing conditions may explain the result.
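The lender scenario can be sketched with made-up numbers. The segment labels and default counts below are hypothetical, and the expected counts assume an even split across the four segments, as in the example above. The per-segment contributions show where the statistic comes from.

```python
segments = ["A", "B", "C", "D"]
observed_defaults = [20, 25, 22, 53]  # hypothetical counts, 120 defaults in total

total = sum(observed_defaults)
expected_defaults = [total / len(segments)] * len(segments)  # even split: 30 each

# Contribution of each segment: (O - E)^2 / E
contributions = {
    seg: (o - e) ** 2 / e
    for seg, o, e in zip(segments, observed_defaults, expected_defaults)
}
statistic = sum(contributions.values())
dof = len(segments) - 1

CRITICAL_5PCT_DOF3 = 7.815  # standard 5% critical value for 3 degrees of freedom

print(round(statistic, 2), dof, statistic > CRITICAL_5PCT_DOF3)
print(max(contributions, key=contributions.get))  # segment driving the result
```

With these counts, segment D accounts for most of the statistic. That points the investigation toward one segment, but it does not say why defaults are concentrated there.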

The Bottom Line

A chi-square statistic compares observed categorical counts with expected counts. It is useful for testing fit and association, but it should be read with degrees of freedom, data quality, sample size, and practical economic meaning.
