The first goodness-of-fit test you will perform is the Kolmogorov-Smirnov (K-S) test. The K-S test statistic is the maximum absolute difference between the empirical distribution function and the analytical cumulative distribution function (CDF) for the given probability model.
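Written as a formula, the definition above can be sketched as follows. The symbols D, F_N, F, and x_i are notation introduced here for illustration and do not come from the workshop materials; x_i is the i-th smallest value of the log10-transformed discharge and N is the number of data points.

```latex
% D: K-S test statistic
% F_N: empirical distribution function, evaluated as i/N at the i-th smallest value
% F: analytical CDF of the fitted probability model
D = \max_{1 \le i \le N} \left| F_N(x_i) - F(x_i) \right|,
\qquad F_N(x_i) = \frac{i}{N}
```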

  • Copy the sheet titled Data and name the new sheet K-S Test.
  • Compute the empirical CDF using the following equation: $\frac{i}{N}$, where i is the rank of the data point in ascending order (the smallest value has rank 1) and N is the number of data points.
  • Compute the CDF of the normal distribution using the Excel function NORM.DIST. The following arguments need to be provided to the function:
    • The base 10 logarithm of discharge
    • The mean of the base 10 logarithm of discharge
    • The standard deviation of the base 10 logarithm of discharge
    • Cumulative: TRUE
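For readers who want to cross-check the spreadsheet, the steps above can be sketched in Python. This is a minimal sketch, not part of the workshop: the discharge values are made up for illustration, the function name cdf_table is hypothetical, and the use of the sample standard deviation (the equivalent of Excel's STDEV.S) is an assumption.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical discharges for illustration only (NOT the tutorial data set)
discharges = [12000.0, 8500.0, 23000.0, 15500.0, 9800.0, 31000.0, 18000.0, 11000.0]

def cdf_table(q_values):
    """Return rows of (Q, log10(Q), empirical CDF, analytical CDF), sorted by Q ascending."""
    q_sorted = sorted(q_values)
    n = len(q_sorted)
    logs = [math.log10(q) for q in q_sorted]
    # Assumption: sample standard deviation, as Excel's STDEV.S would give
    model = NormalDist(mu=mean(logs), sigma=stdev(logs))
    rows = []
    for i, (q, lq) in enumerate(zip(q_sorted, logs), start=1):
        empirical = i / n            # rank divided by number of data points
        analytical = model.cdf(lq)   # counterpart of NORM.DIST(log10(Q), mean, sd, TRUE)
        rows.append((q, lq, empirical, analytical))
    return rows

for q, lq, emp, ana in cdf_table(discharges):
    print(f"{q:10.0f} {lq:8.4f} {emp:8.4f} {ana:8.4f}")
```

As in the spreadsheet, the normal CDF is evaluated at log10(Q) rather than at Q itself.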

The Excel sheet should look similar to the figure below. Make sure that you select log10(Q), rather than Q, as the first argument in the NORM.DIST function!

Computation of CDF values for log10-normally distributed discharges


Question: What is the computed K-S test statistic? 

Using the ABS function in Excel, compute the absolute value of the difference between the empirical and analytical CDF values for each discharge. Next, use the MAX function to find the maximum of these absolute differences. This is the K-S test statistic. The analytical and empirical cumulative distribution functions are shown below. The maximum difference between the two functions is indicated by black arrows.
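The ABS and MAX steps can also be sketched in Python as a cross-check. This is a minimal sketch under stated assumptions: the function name ks_statistic is hypothetical, the input values are made up (not the Point of Rocks data), and the sample standard deviation is assumed for the fitted normal distribution.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_statistic(q_values):
    """Maximum absolute difference between i/N and the fitted normal CDF of log10(Q)."""
    logs = sorted(math.log10(q) for q in q_values)
    n = len(logs)
    # Assumption: sample standard deviation, as Excel's STDEV.S would give
    model = NormalDist(mu=mean(logs), sigma=stdev(logs))
    # Equivalent of ABS(empirical - analytical) for each row
    diffs = [abs(i / n - model.cdf(x)) for i, x in enumerate(logs, start=1)]
    # Equivalent of MAX over the column of differences
    return max(diffs)

# Hypothetical discharges for illustration only
example = [12000.0, 8500.0, 23000.0, 15500.0, 9800.0, 31000.0, 18000.0, 11000.0]
print(f"K-S statistic: {ks_statistic(example):.3f}")
```

This sketch follows the worksheet steps and compares the analytical CDF only against i/N. Statistical libraries that also compare against (i-1)/N may return a slightly different value.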

Analytical and empirical CDFs for Point of Rocks discharge data fit to log10-normal distribution

The computed K-S test statistic value is 0.072. To compare with the value computed by HEC-SSP, double-click Distribution Fitting Test 20. Select the Analysis tab and click the Goodness of Fit Summary Statistics button.

Distribution Fitting Test 20 K-S test statistic

The computed K-S test statistic matches the value from Distribution Fitting Test 20 in the HEC-SSP Examples.

Continue to Task 3. Chi-Squared Test.