HEC-HMS version 4.11 beta 2 was used to create this tutorial.

You can download a copy of the dataset here - Bald_Eagle_Creek.zip

Introduction

Automated optimization can be a valuable tool, but it should not replace an understanding of the software, modeling methods and assumptions, impacts from individual model parameters, and knowledge of the watershed. A modeler should not rely solely on an automated optimization tool, otherwise unrealistic parameter values or parameter combinations are possible. Understanding the runoff response for a watershed, parameter impacts, and minimum/maximum parameter ranges are required to successfully set up an optimization trial and critically evaluate results. An optimization routine relies on an objective function to evaluate how well the simulated model results meet the modeling goal, which is in general to accurately re-produce a historical event. Objective functions reduce the judgement of the model's performance to a single number. Different objective functions summarize the model performance in different ways, and emphasize different things, but no objective function can re-produce the judgment of an expert modeler.

An HEC-HMS optimization trial is a simulation type that can be used to identify reasonable model parameters which improve the ability of a model application to predict the precipitation runoff response. An optimization trial includes a basin model, meteorologic model, and information about the simulation time window and time step. The trial also includes selections for the objective function, search method, and parameter adjustments. Optimization involves automated parameter adjustments that improves the objective function. The objective functions measure the goodness of fit between the simulated results and observed data. The search method is responsible making parameter adjustment as the trial searches for an optimal parameter set.

The Differential Evolution search method is new within Version 4.9. This new search method is more robust than the univariate gradient and simplex search methods. The Differential Evolution search has more parameter sets that span the parameter space and the search varies in a random method, instead of the deterministic approach the other search methods follow. It is more likely that the Differential Evolution search method will find a global optimum parameter set due to the increased number of parameter sets (referred to as samples within the population) and random search component. 

In this tutorial, you will see different options for applying the new Differential Evolution search method and how to evaluate results from the optimization trial.

Example Watershed

The Bald Eagle Creek watershed was used for this example. Foster Joseph Sayers Dam is located on Bald Eagle Creek in Centre County, PA (as shown below). Sayers dam is about 14 miles upstream of the confluence of Bald Eagle Creek and the West Branch Susquehanna River at Lock Haven, PA. 

Location of the watershed

You will see two basin models when you open the example project. The BaldEagleCk_Sept2004 basin model is the uncalibrated model and the BaldEagleCk_Sept_Calib is the calibrated basin model. The following figure shows how the modeling domain was delineated within HEC-HMS. Generally, subbasins were delineated at stream gage locations. The drainage area upstream of Sayers Dam (336.8 square miles) was divided into three subbasins, SpringCk_S10BaldEagleCk_S40, and BaldEagleCk_S30. There is observed flow at five locations in the basin model.  

Example HEC-HMS project

The September 2004 flood event was selected for this example. The figure below shows preliminary results and observed flow at the SpringCk_gage and SayersDam_Inflow junctions in the BaldEagleCk_Sept2004 basin model (Sept2004 simulation run). Results were generated by an uncalibrated model. Based on differences in the computed flow magnitude, time, and shape, the loss, transform and baseflow parameters needed to be adjusted from initial estimates. 

Uncalibrated model results

The figure below shows modeled and computed results from the BaldEagleCk_Sept2004_Calib basin model after manual calibration (Sept2004 Calibrate simulation run). Parameter values were adjusted within reason to improve the simulated peak flow, total runoff volume and time of peak flow. 

Calibrated model results

The figure below contains the calibration summaries for both the Sept2004 and Sept2004 Calibrate simulation runs. Model performance is in the very good range after the model was calibrated to the flood event.  

Calibration metrics for the uncalibrated and calibrated models

The following table contains the initial parameter estimates and final parameter values after manual calibration of the SpringCk_S10 subbasin. Notice the initial loss and initial baseflow values did not change. Initial states were determined by running a few iterations of the model and setting the values so model results matched initial baseflow and initiation of runoff. There are parameter interactions as some parameter impact similar aspects of the runoff response. For example, the constant loss rate, Clark storage coefficient, and some groundwater parameters impact the magnitude of the computed hydrograph. The model can be simplified by relating model parameter to one another. For this example, the GW 1 coefficient was set to be equal to the Clark storage coefficient. The GW 1 Reservoir parameter (number of linear reservoirs in series with the same GW 1 coefficient) was adjusted to control the magnitude and timing of the baseflow hydrograph. 


SpringCk_S10 - BaldEagleCk_Sept2004SpringCk_S10 - BaldEagleCk_Sept2004_Calib
Initial Loss (in)22
Constant Rate (in/hr)0.10.38
Impervious (%)2.32.3
Time of Concentration (hr)8.0915
Storage Coefficient (hr)8.0914
GW 1 Initial Flow (cfs/mi2)00
GW 1 Fraction0.10.05
GW 1 Coefficient (hr)8.0914
GW 1 Reservoirs55
GW 2 Initial Flow (cfs/mi2)22
GW 2 Fraction0.10.1
GW 2 Coefficient (hr)100100
GW 2 Reservoirs11

The Differential Evolution Search Method

The HEC-HMS optimization tool is meant to be an aid in the calibration process. The modeler should be aware of parameter impacts prior to using this tool. For example, you should understand what happens when the constant loss rate parameter is increased (peak flow should be reduced). The modeler should also have a general idea of reasonable parameter ranges, credible minimum and maximum values. Subbasin size, slope, land use, and soil information can be used to bound minimum and maximum parameter values. 

As shown in the figure below, you should see four Optimization Trials when you open the project. The first three trials (their name begins with DE) are configured to use the Differential Evolution search method. Different Objective Functions were selected for the three Optimization Trials configured to use the Differential Evolution search method. The DE_NS_Sept2004 Optimization Trial uses the Normalized Nash Sutcliffe objective function, the DE_PeakRMSE_Sept2004 Optimization Trial uses the Peak-Weighted RMSE objective function, and the DE_SSR_Sept2004 Optimization Trial uses the Sum of Squared Residuals objective function. 

Each objective function is unique and was designed to emphasize different aspects when comparing model results to observations. For example, some objective functions emphasize larger flow values while others evaluate model performance with respect to time and volume. The following summary provides additional information about the three objective functions chosen for the tutorial. 

  • Normalized Nash Sutcliffe - The Nash-Sutcliffe Efficiency (NSE) is a performance measure that compares the variance of the modeled residuals to the variance of the observed flows.  It can be thought of as a relative comparison of the noise of the modeled time series to the "signal" contained in the observed time series. The normalized version of the NSE metric re-scales it to be on the range [0,1] where 1 indicates perfect prediction and 0.5 indicates that using the mean of the observed flows is as good of a predictor of the observed time series as the modeled time series.  0 indicates there is no information about the observed time series in the modeled one.  The normalized statistic is preferred for optimization routines because the normal lower bound of -\infty for NSE can cause problems with the search.  Because higher values of NNSE correspond to better model performance, the NNSE statistic must be maximized when searching for optimal model parameters.
  • Peak-Weighted RMSE - The peak-weighted root mean square error is a modification of the root mean square error (RMSE) measure that gives more weight to errors that correspond to larger observed flows.  The RMSE measure computes the square-root of the average of the squared model residuals, where the residuals are the difference between the simulated and observed values for a time step.  The peak-weighted version multiplies each residual by a weight that increases proportional to the magnitude of the observed flow for that time step.  Smaller values of peak-weighted RMSE indicate better model performance, so it must be used with the minimization goal in an optimization trial.
  • Sum of Squared Residuals - The sum of squared residuals objective function (sometimes also called the sum of squared errors, similar to linear regression, and has the same interpretation) simply computes the model residual for each time step, squares them, and then adds them all up.  It does not have any special weight for any of the residuals.  It is related to the regular root mean square error (RMSE) objective function; if you computed the sum of squared residuals, divided by the number of time steps in the model simulation, and then took the square root, you would get the model RMSE.  For long-term hydrologic simulation the sum of squared errors can place too much emphasis on periods of low flow or baseflow.  Additionally, magnitudes of this objective function are proportional to the length of the simulation time window and can become very large.  Smaller values indicate better model performance and this objective function must be used with the minimization goal.
Objective FunctionFormula
Normalized Nash-Sutcliffe

NSE=1-\frac{\sum_{i=1}^{n}(q_{i}^{m}-q_{i}^{o})^{2}}{\sum_{i=1}^{n}(q_{i}^{o}-\overline{q^{o}})^{2}}NNSE = \frac{1}{2-NSE}

Peak-Weighted RMSE

\sqrt{\frac{1}{n}\sum_{i=1}^{n}(\frac{q_{i}^{o}+\overline{q^{o}}}{2\overline{q^{o}}})(q_{i}^{m}-q_{i}^{o})^{2}}

Sum of Squared Residuals

\sum_{i=1}^{n}(q_{i}^{m}-q_{i}^{o})^{2}

q_{i}^{m} is the modeled flow for timestep iq_{i}^{o} is the observed flow for timestep i, \overline{q^{o}} is the average observed flow for the objective function time window, and n is the number of model timesteps in the objective function window.

Optimization Trials in Watershed Explorer

The Differential Evolution search method performs a much more robust search than the Simplex option. The Differential Evolution search method includes an option for the modeler to specify the Population Size as shown below. The Population Size determines the number of parameter sets that traverse the parameter space. In the example shown below, the Population Size is set to 30 and the number of parameters adjusted within a parameter set is set to five. In this example, there are 30 parameter sets, each parameter set includes a unique combination of the five parameter values. The 30 parameter sets move around the parameter space while searching for the optimal parameter set. The variation in the value of the objective function for each of the 30 parameter sets in an iteration is used to assess whether or not the search has converged. The Differential Evolution search methods contains logic that determines how the parameter sets move around the parameter space and when the search stops based on the user defined Max Iterations and Tolerance. The Seed Value determines the initial state of the random number generator used in the DE algorithm used for generating new population members.  It is provided so that DE trials are repeatable.

The figure below shows one of the Parameter Component Editors within an Optimization Trial. The Initial Value can be either the default value from the Basin Model or the user can enter a different value. All Optimization Trials in the example project have the same initial values specified for similar parameters. The Initial Value has little to no meaning within a Differential Evolution search since parameter sets start with random initial parameter values that span the parameter space.

Initially, the parameter Minimum and Maximum values are set to the default allowable range in HEC-HMS. For example, the Clark Storage Coefficient has a minimum allowable value of 0.02 hours and a maximum allowable value of 1000 hours.  Basing the parameter search range on the allowable range may result in a model that takes prohibitively long to converge, or may produce unexpected results. It is highly recommended that the minimum and maximum values are edited based on physical characteristics in the watershed. As shown below, the Minimum Clark Storage Coefficient was set to 3 hours and the Maximum Clark Storage Coefficient was set to 25 hours. The same Clark Storage Coefficient Minimum and Maximum values were set for all Optimization Trials in the example project. 

Parameter 1 Minimum and Maximum values

Iterations are treated differently between the Simplex and Differential Evolution search methods. An iteration is one step in the solution process and an evaluation is a computation of the objective function. The evaluation occurs after the model is computed using a unique parameter set. For the Simplex search method, the first iteration has n+1 (n is the number of parameters being adjusted in the trial) evaluations and then iterations after that all have one evaluation. The figure below shows a simple Simplex, there are two parameters (Time of Concentration and Storage Coefficient) and the Simplex has three nodes. A separate simulation is computed for each parameter combination that makes up the Simplex. The objective function is computed for each of the three simulations. Then the simplex starts moving where only one parameter set is changed, a simulation is computed, and then the new objective function is computed. The new objective function is compared to the objective functions from the existing two nodes and the Simplex determines which node to move for the next iteration.  More information on this method can be found here.

Example Simplex for two parameters

The Population Size is critical for the Differential Evolution search method. It is recommended that a Population Size of 30 is used for most applications. A large enough Population Size is important for ensuring the search covers the full range of plausible parameters, and also for assessing whether or not the search has converged. For the Differential Evolution search method, every iteration has 30 parameter sets (assuming a Population Size of 30), and includes 30 model simulations and then 30 evaluations (computation of the objective function).  The Differential Evolution search methods will move the parameter sets around in order to reduce the average objective function using evaluations from all 30 simulations. The figure below shows how the 30 parameter sets might look at the very beginning of a Differential Evolution search (assuming only two parameters being adjusted). The parameter sets will eventually converge through the search and evaluation process.  More information on this method can be found here.

Example showing 30 Differential Evolution parameter sets

Run times will likely be longer for Optimization Trials that use the Differential Evolution search method. The table below contains the run time for each of the four optimization trials in the example project. Run times for trials that use the Differential Evolution search method will vary since part of the parameter search is a random process. Notice the run time for the trial that uses the Simplex search method is much shorter than those trials that use the Differential Evolution search method.

Optimization TrialRun Time (seconds)Number of Evaluations
DE_NS_Sept20046629 iterations x 30 population size = 870
DE_PeakRMSE_Sept200412958 x 30 = 1740
DE_SSR_Sept200410449 x 30 = 1470
Simplex_NS_Sept20042399

Results

The figure below shows available results for the DE_NS_Sept2004 Optimization Trial. There are summary tables with information about the trial and optimal parameters, plots showing the objective function value and parameter value for each iteration, and time-series results for basin model elements upstream of the location with observed flow. The time-series results are only for the very last iteration, which includes the final, or optimal, parameter values. 

Results for the Differential Evolution trial

The Optimized Parameters table, shown below, contains the final/optimal parameter values. The initial values are included but do not reflect the starting point for a Differential Evolution search. As mentioned above, the program will randomly identify parameter sets (based on the user defined population size) and then evolve the parameter sets. The Optimized Values in the summary table is the final parameter set that minimized or maximized the objective function. 

The figure below shows how the Objective Function varied for each iteration. The plot below shows each evaluation of the objective function for an iteration (for DE with a population size of 30 there are 30 evaluations per iteration.) A value of 1 for the NNSE metric identifies a perfect model, one that is able to exactly reproduce the observed results. In this example, the final parameter set generated an NNSE value of 0.987.

The figure below shows how the Time of Concentration parameter varied for the best parameter set across all evaluations. The search method spanned the user defined parameter space before converging on a Time of Concentration value of about 15 hours. 

The figure below is the summary plot for the SpringCk_gage junction element. You can see the computed (blue) and observed (black) hydrograph. Compared to results from the uncalibrated model, the Optimization Trial was able to calibrate the model for the single flood event. 

Simulated and Observed flow from the calibrated model

The following table contains initial parameter values, results from the three Differential Evolution Optimization Trials, and results from the Simplex Optimization Trial. There is a little variability in the model parameters across the Optimization Trials. 

Parameter Initial Parameter EstimateDE_NS_Sept2004DE_PeakRMSE_Sept2004DE_SSR_Sept2004Simplex_NS_Sept2004
Objective Function
Normalized Nash SutcliffePeak-Weighted RMSESum of Squared ResidualsNash Sutcliffe
Constant Rate (in/hr)0.10.500.500.500.50
Time of Concentration (hr)8.0915.815.8915.8516.11
Storage Coefficient (hr)8.0912.512.2712.4711.92
GW 1 Coefficient (hr)8.093.017.9417.7816.09
GW 1 Reservoirs54555
Final Nash Sutcliffe Score-5.230.9870.9880.9880.987

Example Application 

This section of the tutorial provides a step-by-step procedure for creating a new Optimization Trial within the example project. The uncalibrated basin model will be used for the example. An iterative approach is presented that attempts to simplify the model, due to parameter interactions. 

  1. Open the Create an Optimization Trial wizard by selecting the Compute | Create Compute | Optimization Trial menu options.
  2. Name the new Optimization Trial Example_Trial and click the Next button.
    Enter a name for the new Optimization Trial
  3. Select the BaldEagleCk_Sept2004 Basin Model and click the Next button. 
  4. Select the Sept2004 Meteorologic Model and click the Next button. 
  5. The new trial is added to the Optimization Trials folder on the Compute tab of the Watershed Explorer. 
    Trial is added to the Optimization Trials folder
  6. Select the new Optimization Trial to open the Component Editor. Enter a Start Date of 17Sep2004, Start Time of 06:00, End Date of 20Sep2004, and an End Time of 00:00. Select a Time Interval of 15 Minutes. 
    Configure the Trial's time window and time step
  7. Go to the Search tab and choose the Differential Evolution search method. Set the Max Iterations to 100, Tolerance to 0.001, and Population Size to 30. Utilizing an optimization trial is an iterative process. You might find that the search converges too quickly or not at all. If this does happen, the Max Iterations should be increased and/or the Tolerance decreased. The Tolerance is an absolute value, and it is applied differently for the selected search method and objective function. For the Simplex search method, the search will stop when the change in objective function from one iteration to the next is less than the user defined tolerance. For the Differential Evolution search method, the search will stop when the standard deviation of the objective functions from all parameter sets (usually 30) is less than the tolerance plus the tolerance times the mean objective function. The Population Size determines the number of parameter sets that are traversing the parameter space. A higher population size will result in longer compute times but should aid in the search finding an global optimum parameter set. 
  8. Go to the Objective tab and choose a Goal of Minimization. The Location should be the SpringCk_ gage element (this element has observed flow). Select the Discharge Time-Series. The Statistic is the objective function used to evaluate the model's performance (using the simulated and observed flow). Choose the Peak-Weighted RMSE objective function. Make sure the No Transform option is selected for Data Transformation. Then enter a Start Date of 17Sep2004, a Start Time of 06:00, an End Date of 20Sep2004, and an End Time of 00:00. All computed and observed flow values within this time window are used to compute the objective function. 
    Set the Trials objective function
  9. The next step is to add Parameters to the Optimization Trial. Place the mouse pointer on top of the Trial's name in the Watershed Explorer. Click the right mouse button and choose the Add Parameter option in the pop-up menu. Add five parameters to the trial. 
    Add a parameter to the trial
  10. The figure below shows the Component Editor for Parameter 1. The Element should be set to SpringCk_S10. Once you select this element, the program will build the list of parameters that can be chosen. Select the Clark Unit Hydrograph - Storage Coefficient parameter. The Initial Value is automatically populated using the value from the Basin Model. Make sure Locked is set to No. A value of No means the program will adjust the parameter value during the search. A value of Yes means the program does not vary the parameter during the search. Always modify the Minimum and Maximum values, do not run an Optimization Trial with the default values. Enter a Minimum value of 3 hours and a Maximum value of 25 hours. These limits are appropriate for the watershed. 
    Edit the minimum and maximum values
  11. Set Parameter 2 to the Clark Unit Hydrograph - Time of Concentration and enter a Minimum of 3 hours and a Maximum of 25 hours. 
  12. Set Parameter 3 to the Initial and Constant - Constant Rate and enter a Minimum of 0.05 inches/hr and a Maximum of 0.5 inches/hr. 
  13. Set Parameter 4 to the Linear Reservoir - GW 1 Coefficient (1) and enter a Minimum of 3 hours and a Maximum of 25 hours.
  14. Set Parameter 5 to the Linear Reservoir - GW 1 Reservoirs (1) and enter a Minimum of 2 and a Maximum of 5.
  15. Run the Optimization Trial, it should take a couple of minutes to complete. Due to the random nature of the Differential Evolution search method, the number of iterations and optimal parameter values will vary each time the trial is computed. 
  16. View results from the optimization trial by selecting the Results tab of the Watershed Explorer, expanding the Optimization Trials folder, and then clicking on the Example_Trial node in the Watershed Explorer. 
    Results for the optimization trial 
    Below is a plot of observed flow and the computed hydrograph from the optimized model. 

    The figure below shows the final parameters values from the optimized basin model. 

    The figure below shows how the Clark Storage Coefficient progressed across evaluations.
  17. Currently, there is no option for linking model parameters within an Optimization Trial. For example, linking the GW 1 Coefficient to the Clark Storage Coefficient so that the values are equal to one another is one way to simplify the model. The GW 1 Reservoirs parameter is another parameter within the baseflow method that can be use to change the baseflow response. It is not necessary to adjust both the GW 1 Coefficient and the GW 1 Reservoirs. Locking the GW Coefficient and only adjusting the number of linear reservoirs is a way to simplify the calibration process. Make the following modifications to the Example_Trial optimization trial. You will be locking three of the five parameters to those optimal parameter from the completed trial. The Optimization Trial will be run a second time while only the Constant Loss Rate and GW 1 Reservoirs are adjusted. 
    1. Clark Unit Hydrograph - Storage Coefficient - Set the Initial Value to 12.3 hours and set the Locked option to Yes.
      Using the Lock option
    2. Clark Unit Hydrograph - Time of Concentration - Set the Initial Value to 15.86 hours and set the Locked option to Yes.
    3. Linear Reservoir - GW 1 Coefficient (1) - Set the Initial Value to 12.3 hours and set the Locked option to Yes.
  18. Run the Example_Trial Optimization Trial a second time. The trial should take a couple of minutes to complete. The following figures show update optimal parameter values and the observed and final computed hydrographs. The optimal Constant Loss Rate is 0.5 inches per hour, which is at the upper end of the reasonable parameter range for this subbasin. Quality of the boundary condition data is always something to consider when modeling historical events and is one reason why it is a good practice to identify reasonable parameter ranges within the optimization trial.