Goodness of fit tests are intended to inform the user of large deviations between the data and the selected, calibrated probability model. The tests do not indicate that the data necessarily come from the selected probability distribution, only that the data do not deviate significantly from it. The choice of a probability distribution and fitting method should not be based solely on goodness of fit results; there should always be a mathematical basis for selecting a probability distribution as a model for the data. The tests outlined below may help rule out one or more distributions when multiple are valid for the data being modeled.

Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov Test, or K-S Test, is a nonparametric method for checking whether two continuous probability distributions are equal. When the distribution of the data is approximated with an empirical distribution, the test can compare that empirical distribution against a proposed model for the data. The K-S Test works by finding the maximum absolute difference between the CDF of the proposed model and the empirical CDF of the data. The program reports this maximum difference as the test statistic. In practice, if the difference is large relative to the sample size, the null hypothesis that the data come from the proposed model is rejected.
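As a minimal sketch of the idea, the following uses SciPy's `kstest` to compare a simulated sample against a fitted normal model. The data, seed, and parameter names here are assumptions for illustration, not part of the text above.

```python
import numpy as np
from scipy import stats

# Simulated sample standing in for observed data (assumption for illustration).
rng = np.random.default_rng(42)
sample = rng.normal(loc=10.0, scale=2.0, size=200)

# Fit a normal model to the sample, then compare the empirical CDF of the
# sample against the CDF of that fitted model.
loc, scale = np.mean(sample), np.std(sample, ddof=1)
statistic, p_value = stats.kstest(sample, "norm", args=(loc, scale))

# The statistic is the maximum absolute difference between the two CDFs;
# a small p-value would lead to rejecting the proposed model.
print(f"K-S statistic = {statistic:.4f}, p-value = {p_value:.4f}")
```

Note that estimating the model's parameters from the same sample biases the standard K-S p-value toward acceptance; corrections such as Lilliefors' test address this case.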

Chi-Squared Test

The Chi-Squared Test (more specifically, Pearson's Chi-Squared Test) is a parametric goodness of fit test. The test works by partitioning the data into a number of discrete classes, or "bins," and comparing the observed proportion of the data in each bin to the proportion expected under the model. As with the K-S Test, the program reports the resulting test statistic. In practice, if the observed and expected proportions differ significantly, the null hypothesis that the data arise from the proposed model is rejected. The test is named for the distribution of its statistic, which asymptotically follows a Chi-Squared Distribution. The critical value for rejection can be computed from a Chi-Squared Distribution with k – 1 degrees of freedom, where k is the number of bins used in the test; one additional degree of freedom is subtracted for each model parameter estimated from the data.
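A sketch of the binning-and-comparison procedure, assuming an exponential model fitted to simulated data (the sample, bin count, and names are illustrative assumptions):

```python
import numpy as np
from scipy import stats

# Simulated sample standing in for observed data (assumption for illustration).
rng = np.random.default_rng(0)
sample = rng.exponential(scale=3.0, size=500)

# Create k bins from sample quantiles so each bin holds a similar count.
k = 8
edges = np.quantile(sample, np.linspace(0.0, 1.0, k + 1))
edges[0], edges[-1] = 0.0, np.inf
observed, _ = np.histogram(sample, bins=edges)

# Expected counts in each bin under an exponential model fitted by MLE.
scale_hat = sample.mean()
cdf = stats.expon(scale=scale_hat).cdf(edges)
expected = len(sample) * np.diff(cdf)

# Pearson's statistic: sum of (observed - expected)^2 / expected over bins.
chi2_stat = np.sum((observed - expected) ** 2 / expected)

# k - 1 degrees of freedom, minus one more for the fitted scale parameter.
dof = k - 1 - 1
p_value = stats.chi2.sf(chi2_stat, dof)
print(f"chi-squared = {chi2_stat:.3f}, dof = {dof}, p-value = {p_value:.4f}")
```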

Anderson-Darling Test

The Anderson-Darling Test, or A-D Test, is a refinement of the K-S Test that gives more weight to the tails of the distribution, making it more sensitive to deviations far from the center of the data.
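A brief sketch using SciPy's `anderson`, which tests a sample against a distribution family with parameters estimated from the data; unlike `kstest`, it returns critical values at fixed significance levels rather than a p-value. The sample here is an illustrative assumption.

```python
import numpy as np
from scipy import stats

# Simulated sample standing in for observed data (assumption for illustration).
rng = np.random.default_rng(7)
sample = rng.normal(loc=0.0, scale=1.0, size=300)

# Test the sample against the normal family.
result = stats.anderson(sample, dist="norm")
print(f"A-D statistic = {result.statistic:.4f}")

# Compare the statistic against each tabulated critical value.
for cv, sig in zip(result.critical_values, result.significance_level):
    decision = "reject" if result.statistic > cv else "fail to reject"
    print(f"  at {sig}% significance: critical value {cv:.3f} -> {decision}")
```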

Bayesian Information Criterion

The Bayesian Information Criterion, or BIC, is a measure of model parsimony. BIC evaluates the tradeoff between the goodness of fit of the model and the simplicity of the model. BIC uses the maximum of the likelihood function of the probability model. When fitting a distribution, the likelihood can be increased by adding parameters; however, this can result in overfitting. BIC penalizes overfitting by adding a penalty term for each parameter in the probability model. All else being equal, BIC therefore favors probability models with fewer parameters.
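The standard formula is BIC = k ln(n) − 2 ln(L̂), where k is the number of parameters, n the number of observations, and L̂ the maximized likelihood. A minimal sketch, assuming a normal model fitted by maximum likelihood to simulated data (names and data are illustrative assumptions):

```python
import numpy as np
from scipy import stats

def bic(log_likelihood, n_params, n_obs):
    """BIC = k * ln(n) - 2 * ln(L-hat); lower values indicate a better tradeoff."""
    return n_params * np.log(n_obs) - 2.0 * log_likelihood

# Simulated sample standing in for observed data (assumption for illustration).
rng = np.random.default_rng(1)
sample = rng.normal(loc=5.0, scale=1.5, size=400)

# Fit a 2-parameter normal model by maximum likelihood and score it.
loc, scale = stats.norm.fit(sample)
ll_norm = stats.norm(loc, scale).logpdf(sample).sum()
print(f"BIC (normal) = {bic(ll_norm, 2, len(sample)):.2f}")
```

When comparing candidate models fitted to the same data, the model with the lower BIC is preferred.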

Akaike Information Criterion

The Akaike Information Criterion, or AIC, is closely related to BIC. Similar to BIC, AIC penalizes overfitting by adding a penalty for each additional parameter. However, AIC's penalty per parameter is weaker than BIC's (except at very small sample sizes), so AIC tends to favor probability models with more parameters than BIC does.
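The formula is AIC = 2k − 2 ln(L̂). The difference from BIC is the penalty per parameter: a constant 2 for AIC versus ln(n) for BIC, so once n exceeds e² (about 8 observations) BIC penalizes each parameter more heavily. A small sketch of that comparison:

```python
import numpy as np

def aic(log_likelihood, n_params):
    """AIC = 2k - 2 * ln(L-hat); lower values indicate a better tradeoff."""
    return 2.0 * n_params - 2.0 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """BIC = k * ln(n) - 2 * ln(L-hat)."""
    return n_params * np.log(n_obs) - 2.0 * log_likelihood

# Per-parameter penalty: constant 2 for AIC versus ln(n) for BIC. For n
# beyond about 8 observations, BIC's penalty is larger, which is why BIC
# tends to prefer the model with fewer parameters.
for n in (10, 100, 1000):
    print(f"n = {n:4d}: AIC penalty/param = 2.00, BIC penalty/param = {np.log(n):.2f}")
```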