Disclaimer: The United States Army Corps of Engineers has granted access to these data for instructional purposes only. Do not copy, forward, or release the information without United States Army Corps of Engineers approval.
Initial Workshop Files

In this workshop, you will develop a mapping-type regionalization regression for a precipitation-frequency analysis in the Pacific Northwest. Your goal is to use several physical parameters at 76 precipitation stations on the west side of the Cascade Mountains in Washington (see figure below) to predict a statistical parameter that represents the magnitude of the year-to-year variation in annual maximum rainfall. This parameter is an L-moment ratio called the coefficient of L-variation (L-CV), denoted by the symbol t.
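For reference, the L-CV and the other L-moment ratios used in this workshop follow the standard definitions in terms of the sample L-moments. The symbols l_2, l_3, and l_4 (the second, third, and fourth sample L-moments) are introduced here only to write the ratios; they are not assumed to appear in the workshop files.

```latex
t = \frac{l_2}{l_1}, \qquad t_3 = \frac{l_3}{l_2}, \qquad t_4 = \frac{l_4}{l_2}
```

The L-CV plays the same role as the conventional coefficient of variation (standard deviation divided by mean), with the L-scale l_2 in place of the standard deviation.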

Regional precipitation-frequency analyses commonly use multiple linear regression so that the statistical properties of extreme rainfall can be estimated anywhere, not just at the sites where rainfall is observed. For each of the 76 sites, an annual maximum series (AMS) of the maximum 72-hour wintertime precipitation has been developed and its statistical properties estimated. The regression links those statistical properties to physical ones, such as elevation and distance from a coastline, so that inferences can be made at locations without precipitation observations.
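To make the step from an AMS to its statistical properties concrete, here is a minimal R sketch that computes the sample L-location, L-scale, and L-CV with the unbiased probability-weighted-moment estimators. The function name `sample_lcv` and the example values are made up for illustration; the workshop data already contain these statistics, so this is not necessarily how they were produced.

```r
# Sample L-location, L-scale, and L-CV of an annual maximum series (AMS),
# computed from the unbiased probability-weighted moments b0 and b1.
sample_lcv <- function(ams) {
  x <- sort(ams)                 # order statistics, ascending
  n <- length(x)
  stopifnot(n >= 2)
  b0 <- mean(x)
  b1 <- sum((seq_len(n) - 1) / (n - 1) * x) / n
  l1 <- b0                       # L-location (the sample mean)
  l2 <- 2 * b1 - b0              # L-scale
  c(l_1 = l1, l_2 = l2, t = l2 / l1)
}

# Hypothetical 72-hour annual maxima in mm, for illustration only
ams_example <- c(112, 87, 140, 95, 103, 176, 91, 128)
lcv_example <- sample_lcv(ams_example)
print(lcv_example)
```

Because the estimator works on the sorted values, the order in which the annual maxima are supplied does not matter.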
You will perform this analysis in four phases:
- Exploration of the properties of the variable to be predicted, as well as the prospective predictors
- Selection of a limited number of predictors
- Construction of the model
- Checking assumptions
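The four phases can be sketched in R roughly as follows. This is only an outline under stated assumptions: the data frame is synthetic, the relationship between t and the predictors is invented, and the three predictors are picked from the workshop's field names purely for illustration. Backward stepwise selection by AIC is one of several possible selection methods, not necessarily the one the workshop uses.

```r
set.seed(1)  # reproducible made-up data; NOT the workshop data set

n_sites <- 76
stations <- data.frame(
  elevation     = runif(n_sites, 0, 1500),
  dist_to_coast = runif(n_sites, 0, 200),
  winter_prcp   = runif(n_sites, 300, 2500)
)
# Invented relationship, used only so the example has something to fit
stations$t <- 0.15 + 0.0002 * stations$dist_to_coast -
  0.00001 * stations$winter_prcp + rnorm(n_sites, sd = 0.01)

# Phase 1: explore the predictand and the prospective predictors
print(summary(stations))
print(round(cor(stations), 2))

# Phase 2: select a limited number of predictors
# (backward stepwise selection by AIC, chosen here as an assumption)
full_model <- lm(t ~ elevation + dist_to_coast + winter_prcp, data = stations)
reduced_model <- step(full_model, direction = "backward", trace = 0)

# Phase 3: construct the model and inspect the fitted coefficients
print(summary(reduced_model))

# Phase 4: check assumptions using the residuals
res <- residuals(reduced_model)
print(shapiro.test(res))  # Shapiro-Wilk test for normality of the residuals
if (interactive()) {
  plot(fitted(reduced_model), res)  # look for non-constant variance
  qqnorm(res); qqline(res)          # visual check of normality
}
```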
The data you receive will have several fields. The ones you will be using in your regression model are in bold.
- ID: the precipitation station ID code
- Name: the precipitation station name
- n: the record length (number of annual maxima) of the annual maximum series (AMS)
- l_1: the L-location of the AMS
- t: the L-CV of the AMS, which is the variable you will use as the predictand in the model
- t_3: the L-skewness of the AMS
- t_4: the L-kurtosis of the AMS
- latitude: the latitude of the site in decimal degrees (dd)
- longitude: the longitude of the site in dd
- winter_prcp: the normal wintertime (November, December, January, February) precipitation total, from PRISM, in mm
- winter_temp: the normal wintertime average temperature, from PRISM, in Kelvin
- elevation: the elevation of the site from the PRISM elevation model
- dist_to_coast: the distance from the site to the nearest coastline, from NASA, in km
The tools available to you for the workshop are:
Start the Workshop: Phase 1: Data Exploration Using R