The WBSusquehanna_Correlation example demonstrates how to use the Correlation Analysis to determine how closely annual maximum peak discharges at several stream gages are related to one another.

Input Data

In this example, time series representing annual maximum inflow to four multi-purpose reservoir projects within the West Branch Susquehanna River watershed are analyzed.  The first time series is from the West Branch Susquehanna River at Bower, PA stream gage, which is located just upstream of Curwensville Dam.  The second time series is from the First Fork Sinnemahoning Creek at Wharton, PA stream gage, which is located just upstream of George B. Stevenson Dam.  The third time series is from the Kettle Creek at Cross Fork, PA stream gage, which is located just upstream of Alvin R. Bush Dam.  The fourth time series is from the Bald Eagle Creek bl Spring Creek at Milesburg, PA stream gage, which is located just upstream of Foster Joseph Sayers Dam.  The locations of these stream gages are shown in Figure 1.  The time series are plotted in Figure 2 and tabulated in Table 1.

Figure 1. Stream Gage Locations

Figure 2. Input Time Series for WBSusquehanna_Correlation Example

Table 1. Input Time Series for WBSusquehanna_Correlation Example

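Flow values are in cfs. A blank cell means no value is listed for that gage on that date.

Date        | Bower | Wharton | Cross Fork | Milesburg
------------|-------|---------|------------|----------
31 May 1889 | 27000 |         |            |
09 Nov 1913 | 5900  |         |            |
07 Jan 1915 | 8640  |         |            |
03 Jun 1916 | 9600  |         |            |
22 Jan 1917 | 6800  |         |            |
20 Feb 1918 | 12200 |         |            |
31 Oct 1918 | 6680  |         |            |
12 Mar 1920 | 12200 |         |            |
08 Aug 1921 | 7300  |         |            |
29 Nov 1921 | 9320  |         |            |
13 May 1923 | 6850  |         |            |
29 Jun 1924 | 11200 |         |            |
09 Feb 1925 | 6960  |         |            |
05 Sep 1926 | 10500 |         |            |
22 Jan 1927 | 9810  |         |            |
30 Mar 1928 | 7940  |         |            |
26 Feb 1929 | 7440  |         |            |
26 Feb 1930 | 6550  |         |            |
04 Apr 1931 | 5500  |         |            |
01 Apr 1932 | 4160  |         |            |
15 Mar 1933 | 6930  |         |            |
30 Sep 1934 | 3540  |         |            |
25 Jul 1935 | 7120  |         |            |
18 Mar 1936 | 31500 |         | 20000      |
22 Jan 1937 | 7520  |         |            |
18 Dec 1937 | 9490  |         |            |
15 Feb 1939 | 4320  |         |            |
31 Mar 1940 | 12700 |         |            |
06 Apr 1941 |       |         | 2370       |
05 Jun 1941 | 4600  |         |            |
09 Mar 1942 | 6320  |         |            |
22 May 1942 |       |         | 6100       |
30 Dec 1942 | 14300 |         | 3660       |
17 Mar 1944 | 5500  |         |            |
08 May 1944 |       |         | 2580       |
07 Mar 1945 | 11200 |         |            |
18 Mar 1945 |       |         | 3520       |
28 May 1946 |       |         | 11000      |
13 Jun 1946 | 8000  |         |            |
06 Apr 1947 |       |         | 2470       |
26 Apr 1947 | 3660  |         |            |
14 Apr 1948 | 10900 |         |            |
15 Apr 1948 |       |         | 2800       |
27 Jan 1949 | 4600  |         |            |
16 Feb 1949 |       |         | 1260       |
28 Mar 1950 | 6850  |         |            |
05 Apr 1950 |       |         | 4180       |
05 Nov 1950 | 8410  |         |            |
25 Nov 1950 |       |         | 12400      |
27 Jan 1952 | 11400 |         |            |
06 Apr 1952 |       |         | 2850       |
24 Mar 1953 |       |         | 4050       |
31 May 1953 | 8000  |         |            |
02 Mar 1954 | 10200 |         | 3670       |
16 Oct 1954 | 9270  |         |            |
05 Mar 1955 |       |         | 2460       |
08 Mar 1956 |       |         | 5140       |
03 Jul 1956 | 9720  |         |            |
06 Aug 1956 |       |         |            | 4000
06 Apr 1957 | 7600  |         | 1620       |
09 Apr 1957 |       |         |            | 4000
21 Dec 1957 |       |         | 2850       |
26 Dec 1957 |       |         |            | 2810
08 May 1958 | 5190  |         |            |
22 Jan 1959 | 11900 |         | 3550       |
10 Feb 1959 |       |         |            | 4880
31 Mar 1960 | 9270  |         | 3920       |
23 May 1960 |       |         |            | 4770
25 Feb 1961 | 10200 |         |            |
26 Feb 1961 |       |         | 7600       | 6340
28 Feb 1962 | 4740  |         |            |
31 Mar 1962 |       |         | 3670       |
07 Apr 1962 |       |         |            | 4400
17 Mar 1963 |       |         |            | 2410
18 Mar 1963 | 7600  |         |            |
27 Mar 1963 |       |         | 7200       |
10 Mar 1964 | 15200 |         | 7600       | 8950
03 Jan 1965 | 5720  |         |            |
13 Feb 1965 |       |         | 1240       |
24 Mar 1965 |       |         |            | 1640
13 Feb 1966 |       |         |            | 5910
14 Feb 1966 | 10600 |         | 3400       |
06 Mar 1967 | 7800  |         |            |
29 Mar 1967 |       |         | 2080       |
29 Sep 1967 |       |         |            | 5620
25 Oct 1967 |       |         |            | 3630
31 Jan 1968 | 5250  |         |            |
23 Mar 1968 |       |         | 1960       |
30 Jan 1969 | 4590  |         |            |
20 May 1969 |       |         | 4770       |
23 Jul 1969 |       |         |            | 1990
02 Apr 1970 | 10100 |         | 2610       | 6530
16 Nov 1970 |       |         | 2600       |
21 Feb 1971 | 5920  |         |            |
27 Feb 1971 |       |         |            | 4330
23 Jun 1972 | 27500 |         | 14300      | 21300
02 Feb 1973 | 6270  |         | 4540       | 6290
19 Jan 1974 | 4950  |         |            |
04 Apr 1974 |       |         | 2090       | 4400
24 Feb 1975 | 12700 |         |            |
26 Sep 1975 |       |         | 13400      |
27 Sep 1975 |       |         |            | 10900
17 Feb 1976 | 9270  |         |            |
20 Jun 1976 |       |         | 4230       |
21 Jun 1976 |       |         |            | 5620
09 Oct 1976 |       |         |            | 6920
20 Jul 1977 | 19200 |         |            |
25 Sep 1977 |       |         | 2700       |
15 Mar 1978 | 11400 |         |            |
14 May 1978 |       |         | 3590       |
15 May 1978 |       |         |            | 6500
05 Mar 1979 | 12100 |         |            |
06 Mar 1979 |       |         | 4430       | 7760
26 Nov 1979 | 6020  |         | 3640       | 5780
20 Feb 1981 |       |         | 4360       |
21 Feb 1981 | 8820  |         |            |
23 Feb 1981 |       |         |            | 5500
28 Oct 1981 |       |         | 3210       |
13 Mar 1982 | 6820  |         |            |
05 Jun 1982 |       |         |            | 5180
21 Mar 1983 |       |         |            | 3310
22 Mar 1983 |       |         | 1990       |
21 Jun 1983 | 5790  |         |            |
14 Feb 1984 |       | 9810    | 9080       | 15000
15 Feb 1984 | 8790  |         |            |
25 Feb 1985 |       | 2790    | 1730       |
29 Mar 1985 | 8140  |         |            |
01 Apr 1985 |       |         |            | 2630
16 Nov 1985 | 9210  |         |            |
20 Jan 1986 |       | 4880    |            |
15 Mar 1986 |       |         | 3280       | 6730
27 Nov 1986 |       | 2770    | 3680       |
05 Apr 1987 | 6120  |         |            | 3210
02 Feb 1988 | 8620  |         |            | 3200
19 May 1988 |       | 3510    | 3220       |
31 Mar 1989 | 10100 |         |            |
21 Jun 1989 |       | 7190    | 3770       | 5590
11 Apr 1990 |       | 3620    | 2010       |
09 Jun 1990 |       |         |            | 3450
12 Jul 1990 | 7690  |         |            |
11 Oct 1990 |       |         | 3370       |
30 Dec 1990 | 7730  |         |            |
04 Mar 1991 |       | 5830    |            | 4740
03 Dec 1991 |       |         |            | 2790
14 Jul 1992 | 5920  |         |            |
15 Jul 1992 |       | 4390    |            |
17 Jul 1992 |       |         | 3240       |
28 Mar 1993 | 6020  |         |            |
01 Apr 1993 |       | 5160    | 4760       |
16 Apr 1993 |       |         |            | 11400
28 Nov 1993 |       | 6870    |            | 11900
25 Mar 1994 | 9150  |         |            |
18 Aug 1994 |       |         | 6530       |
28 Nov 1994 | 4320  |         |            |
20 Jan 1995 |       |         |            | 3880
21 Jan 1995 |       | 2860    | 2740       |
19 Jan 1996 |       | 15400   | 9820       | 16800
19 Jul 1996 | 22000 |         |            |
08 Nov 1996 | 6000  |         | 7960       | 7870
09 Nov 1996 |       | 5400    |            |
08 Nov 1997 | 15900 |         |            |
08 Jan 1998 |       | 5100    | 4280       |
09 Apr 1998 |       |         |            | 5380
24 Jan 1999 | 8960  | 4890    | 3000       | 3650
26 Nov 1999 | 4100  | 2420    |            |
04 Apr 2000 |       |         | 1730       |
17 Apr 2000 |       |         |            | 3060
18 Oct 2000 | 3730  |         |            |
22 Mar 2001 |       |         |            | 2000
10 Apr 2001 |       | 2240    | 1640       |
26 Mar 2002 | 5880  |         |            | 5840
13 May 2002 |       | 5340    | 2480       |
02 Jan 2003 | 6640  |         |            |
21 Mar 2003 |       |         | 2550       |
22 Mar 2003 |       | 3980    |            |
28 Sep 2003 |       |         |            | 6630
18 Sep 2004 | 17100 | 8590    | 8880       | 21100
06 Jan 2005 | 12300 |         |            |
14 Jan 2005 |       | 3500    |            |
29 Mar 2005 |       |         |            | 4490
03 Apr 2005 |       |         | 3280       |
29 Nov 2005 | 5990  |         |            |
30 Nov 2005 |       | 6080    | 4750       | 8380
16 Nov 2006 |       |         |            | 4370
15 Mar 2007 | 5480  | 5230    | 3070       |
06 Feb 2008 |       | 6330    | 4570       |
05 Mar 2008 | 6310  |         |            | 7070
25 Dec 2008 | 7260  |         |            |
09 Mar 2009 |       | 6020    | 4380       |
31 Jul 2009 |       |         |            | 3340
25 Jan 2010 |       | 10900   | 4940       | 6830
14 Mar 2010 | 7600  |         |            |
01 Dec 2010 |       | 13200   | 7410       | 11400
10 Sep 2011 | 11200 |         |            |
23 Nov 2011 |       |         |            | 4280
23 Dec 2011 |       | 1770    | 1530       |
27 Jan 2012 | 5540  |         |            |
31 Jan 2013 | 5770  | 4770    | 3120       | 5190
22 Dec 2013 |       | 4250    |            |
13 Mar 2014 | 6450  |         |            |
16 May 2014 |       |         |            | 5140
16 May 2014 |       |         | 3650       |
15 Mar 2015 | 6310  |         |            |
10 Apr 2015 |       |         | 1840       |
18 Jul 2015 |       |         |            | 5300
30 Sep 2015 |       | 6860    |            |
29 Oct 2015 |       | 2080    |            |
11 Nov 2015 |       |         | 1350       |
03 Feb 2016 |       |         |            | 3040
03 Feb 2016 | 2750  |         |            |
21 Oct 2016 |       |         |            | 11100
13 Jan 2017 |       | 4140    |            |
06 May 2017 |       |         | 1500       |
29 May 2017 | 9590  |         |            |
13 Jan 2018 |       | 4630    |            |
10 Sep 2018 | 17200 |         |            |
10 Sep 2018 |       |         |            | 11700
11 Sep 2018 |       |         | 2830       |
03 Oct 2018 |       |         |            | 4950
22 Dec 2018 |       |         | 3150       |
08 Feb 2019 | 6550  |         |            |
08 Feb 2019 |       | 4140    |            |
01 Nov 2019 |       | 7220    |            |
28 Mar 2020 | 8990  |         |            |
29 Mar 2020 |       |         |            | 4270
01 May 2020 |       |         | 2420       |
25 Dec 2020 |       |         |            | 6370
10 May 2021 | 7790  |         |            |
19 Aug 2021 |       | 12700   |            |
23 Sep 2021 |       |         | 5100       |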

General Tab

A Correlation Analysis has been developed for this example. To open the analysis, either double-click on the analysis labeled WBSusquehanna_Correlation in the study explorer or, from the Analysis menu, select Open and then select WBSusquehanna_Correlation from the list of available analyses. When WBSusquehanna_Correlation is opened, the General tab within the Correlation Analysis editor will appear as shown in Figure 3. For this analysis, the Time Series | Coincident Events computational method was selected and four locations were defined. The default Plotting Position formula (Weibull) and Output Frequency Ordinates were left unchanged. No modifications were made to the time window.

Figure 3. General Tab

Location Information Tab

The Location Information tab contains four sub-tabs, one for each of the previously defined locations.

Bower

On the Bower sub-tab, the Log and Exceedance Probability (p, Zp) transformations were selected, as shown in Figure 4.  1000 cfs was entered as the replacement value if/when zeroes or negatives were encountered.  A typical event length of 7 days was also defined.  The default Output Labeling was left unchanged.  Finally, a frequency curve that was fit to the Bower time series was defined using a Log Pearson Type III (LPIII) distribution and the flow-frequency curve was computed.

Figure 4. Bower Location Information Tab
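
The Log and Exceedance Probability (p, Zp) transformations selected on this sub-tab, and on the three sub-tabs that follow, can be pictured with a short sketch. This is an illustration under stated assumptions, not the software's implementation: it assumes p is assigned with the Weibull plotting position (rank divided by n + 1, the default formula named on the General tab) and that Zp is the standard normal deviate exceeded with probability p. The software may instead derive p from the fitted frequency curve. The function name `transform_series` and the zero value appended to the sample are hypothetical.

```python
import math
from statistics import NormalDist


def transform_series(flows, replacement=1000.0):
    """Return (log10 flows, exceedance probabilities, standard normal deviates)."""
    # Substitute the replacement value for zeroes and negatives so log10 is defined.
    cleaned = [q if q > 0 else replacement for q in flows]
    logs = [math.log10(q) for q in cleaned]

    # Rank 1 is the largest flow; Weibull plotting position p = rank / (n + 1).
    n = len(cleaned)
    order = sorted(range(n), key=lambda i: cleaned[i], reverse=True)
    probs = [0.0] * n
    for rank, i in enumerate(order, start=1):
        probs[i] = rank / (n + 1)

    # Zp is the standard normal value that is exceeded with probability p.
    deviates = [NormalDist().inv_cdf(1.0 - p) for p in probs]
    return logs, probs, deviates


# First three Bower peaks from Table 1, plus a hypothetical zero to show the replacement.
logs, probs, deviates = transform_series([27000, 5900, 8640, 0])
for row in zip(logs, probs, deviates):
    print(row)
```

Large flows receive small exceedance probabilities and positive deviates, so the three transformed series are ordered consistently with the original flows.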

Wharton

On the Wharton sub-tab, the Log and Exceedance Probability (p, Zp) transformations were selected, as shown in Figure 5.  1000 cfs was entered as the replacement value if/when zeroes or negatives were encountered.  A typical event length of 7 days was also defined.  The default Output Labeling was left unchanged.  Finally, a frequency curve that was fit to the Wharton time series was defined using a Log Pearson Type III (LPIII) distribution and the flow-frequency curve was computed.

Figure 5. Wharton Location Information Tab

Cross Fork

On the Cross Fork sub-tab, the Log and Exceedance Probability (p, Zp) transformations were selected, as shown in Figure 6.  1000 cfs was entered as the replacement value if/when zeroes or negatives were encountered.  A typical event length of 7 days was also defined.  The default Output Labeling was left unchanged.  Finally, a frequency curve that was fit to the Cross Fork time series was defined using a Log Pearson Type III (LPIII) distribution and the flow-frequency curve was computed.

Figure 6. Cross Fork Location Information Tab

Milesburg

On the Milesburg sub-tab, the Log and Exceedance Probability (p, Zp) transformations were selected, as shown in Figure 7.  1000 cfs was entered as the replacement value if/when zeroes or negatives were encountered.  A typical event length of 7 days was also defined.  The default Output Labeling was left unchanged.  Finally, a frequency curve that was fit to the Milesburg time series was defined using a Log Pearson Type III (LPIII) distribution and the flow-frequency curve was computed.

Figure 7. Milesburg Location Information Tab

Each of the previously mentioned time series and frequency curves can be plotted by pressing the Plot All Input Frequency Data button, as shown in Figure 8.

Figure 8. Plot All Input Frequency Data
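
For readers unfamiliar with the Log Pearson Type III curves mentioned on each sub-tab, the sketch below shows one simple way such a curve can be evaluated. It is an assumption-laden illustration, not the fitting procedure the software uses: it fits sample moments of the log10 flows and applies the Wilson-Hilferty approximation for the frequency factor, whereas a production fit (for example, one following Bulletin 17C) involves additional steps. The function names and the ten-value sample (the first ten Bower peaks in Table 1) are chosen only for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def frequency_factor(skew, z):
    """Wilson-Hilferty approximation of the Pearson Type III frequency factor."""
    if abs(skew) < 1e-9:
        return z  # zero skew reduces to the normal deviate
    return (2.0 / skew) * ((1.0 + skew * z / 6.0 - skew ** 2 / 36.0) ** 3 - 1.0)


def lp3_quantile(flows, exceedance_prob):
    """Estimate the flow exceeded with the given annual probability."""
    logs = [math.log10(q) for q in flows]
    n = len(logs)
    m, s = mean(logs), stdev(logs)
    # Sample skew coefficient of the log flows.
    g = n * sum((x - m) ** 3 for x in logs) / ((n - 1) * (n - 2) * s ** 3)
    z = NormalDist().inv_cdf(1.0 - exceedance_prob)
    return 10.0 ** (m + frequency_factor(g, z) * s)


# First ten Bower peaks from Table 1 (cfs); far too few for a real analysis.
bower_sample = [27000, 5900, 8640, 9600, 6800, 12200, 6680, 12200, 7300, 9320]
for p in (0.5, 0.1, 0.01):
    print(p, round(lp3_quantile(bower_sample, p)))
```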

Computing the Analysis

Once all of the General and Location Information details have been selected and/or defined, the user can press the Compute button to perform the analysis. A Compute Warnings message will be shown noting that some values for each pair of time series were removed from consideration within the correlation calculations.  For instance, when comparing the Bower and Wharton time series, 98 values were removed from consideration.  This was partly because the Bower time series begins in 1889 while the Wharton time series begins in 1984.  In addition, several values within the Bower time series had no counterpart in the Wharton time series within the Typical Event Length of 7 days, which was defined on the Location Information sub-tabs.  This is visualized in Figure 9: the values for the 1994 water year were removed from the analysis because they were not recorded within 7 days of one another.

Figure 9. Example of Events That Were Removed from Consideration

Only events like those shown in Figure 10, where both values fell within the Typical Event Length of 7 days, were used in the correlation computations.

Figure 10. Example of Events That Were Considered in the Correlation Computations
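
The screening described above can be sketched in a few lines. This is one plausible reading of the coincident-event rule, not the software's documented algorithm: it assumes each series holds one peak per water year (1 October through 30 September), compares the two peaks that share a water year, and keeps the pair only when their dates are no more than the Typical Event Length apart. The function names are hypothetical; the six values are Bower and Wharton entries from Table 1.

```python
from datetime import date


def water_year(d):
    """Water years run 1 October to 30 September and take the ending year's number."""
    return d.year + 1 if d.month >= 10 else d.year


def pair_coincident(series_a, series_b, event_length_days=7):
    """Split shared water years into kept pairs and removed years."""
    a_by_wy = {water_year(d): (d, q) for d, q in series_a}
    b_by_wy = {water_year(d): (d, q) for d, q in series_b}
    kept, removed = [], []
    for wy in sorted(a_by_wy.keys() & b_by_wy.keys()):
        (da, qa), (db, qb) = a_by_wy[wy], b_by_wy[wy]
        if abs((da - db).days) <= event_length_days:
            kept.append((wy, qa, qb))
        else:
            removed.append(wy)
    return kept, removed


bower = [(date(1984, 2, 15), 8790), (date(1994, 3, 25), 9150), (date(2019, 2, 8), 6550)]
wharton = [(date(1984, 2, 14), 9810), (date(1993, 11, 28), 6870), (date(2019, 2, 8), 4140)]

kept, removed = pair_coincident(bower, wharton)
print("kept:", kept)
print("removed water years:", removed)
```

In this sample, the 1984 and 2019 peaks fall within 7 days of each other and are kept, while the 1994 water year peaks (28 Nov 1993 at Wharton and 25 Mar 1994 at Bower) are months apart and are removed.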

Once the computations have been completed, a message window will open stating Compute Complete.

Results Tab

Upon a successful compute, the Results tab will become selectable.  The Results tab contains three sub-tabs, one for each of the previously defined transformations.  Within each transformation sub-tab, results will be presented consisting of:

  • A Correlation Matrix of computed correlation coefficients,
  • A plot of the selected pair(s),
  • A Statistics table of the entire time series (all values are considered within these statistics), and 
  • An Events table consisting of the overlapping date range and number of values that were considered within the correlation computations for the selected pair(s).

One or more pairs within the correlation matrix can be selected.  By default, the upper-left most cell will be selected upon navigating to these sub-tabs for the first time, as shown in Figure 11.  However, if more than one cell is selected, additional information will be shown within the plot as well as the Events table, as shown in Figure 12.
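
To make the Correlation Matrix concrete, the sketch below computes Pearson correlation coefficients for every pair of gages. It is illustrative only: it uses just the three dates in Table 1 on which all four gages list a peak (24 Jan 1999, 18 Sep 2004, and 31 Jan 2013), so the coefficients it prints will not match the matrix the software reports for the full record. The function names are hypothetical, and the use of the Pearson coefficient is an assumption about what the matrix contains.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(series):
    """Nested dict of coefficients keyed by gage name."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Peaks (cfs) on the three dates shared by all four gages in Table 1.
events = {
    "Bower": [8960, 17100, 5770],
    "Wharton": [4890, 8590, 4770],
    "Cross Fork": [3000, 8880, 3120],
    "Milesburg": [3650, 21100, 5190],
}
# Apply the Log transformation before correlating.
logs = {name: [math.log10(q) for q in flows] for name, flows in events.items()}
matrix = correlation_matrix(logs)

for name, row in matrix.items():
    print(name, [round(r, 3) for r in row.values()])
```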

Log

The results using the Linear or Log transformation will be shown on this sub-tab, as shown within Figure 11.

Exceedance Probability

The results using the Exceedance Probability transformation will be shown on this sub-tab, as shown within Figure 13.

Figure 13. Exceedance Probability Tab

Standard Normal Deviate

The results using the Standard Normal Deviate transformation will be shown on this sub-tab, as shown within Figure 14.

Figure 14. Standard Normal Deviate Tab

In addition to the tables and plots on each sub-tab, two buttons are available to create summary plots for all data and transformations: Plot all Pairs and Plot all Transformations.  When the Plot all Pairs button is clicked, a figure is created that shows the correlation matrix at the top with multiple plots below, one for every pair of time series, as shown in Figure 15.  The plot shows results for the currently selected sub-tab.

Figure 15. All Pairs Plot for Standard Normal Deviate Tab

When the Plot all Transformations button is pressed, a figure will be created that has the correlation matrix and one plot containing the data pairs for all the currently selected cells in the correlation matrix for every selected transform.  For example, when the Bower-Wharton and Milesburg-Wharton pairs are selected for this study, the Plot all Transformations button will generate a plot that resembles Figure 16.  

Figure 16. All Transformations Plot for Two Selected Pairs

The Plot all Transformations button will only be active when the Exceedance Probability transform has been selected for at least 2 time series on the Location Information tab and a compute has been successfully completed.  


The correlation coefficients computed for this example are very strong.  Per EM 1110-2-1415, datasets are often considered well correlated when the absolute value of the correlation coefficient is greater than 0.4.  Also, Bulletin 17C recommends using data with high cross correlation when performing record extension, an example of which can be found here.
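
The 0.4 guideline can be applied mechanically to any correlation matrix. The sketch below is hypothetical throughout: the function name `well_correlated_pairs`, the gage labels, and the coefficients are made up for illustration and are not results from this example.

```python
def well_correlated_pairs(matrix, threshold=0.4):
    """List each unique pair whose |r| exceeds the threshold."""
    names = list(matrix)
    return [
        (a, b, matrix[a][b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(matrix[a][b]) > threshold
    ]


# Made-up coefficients for three hypothetical gages A, B, and C.
example = {
    "A": {"A": 1.0, "B": 0.85, "C": -0.55},
    "B": {"A": 0.85, "B": 1.0, "C": 0.2},
    "C": {"A": -0.55, "B": 0.2, "C": 1.0},
}
pairs = well_correlated_pairs(example)
print(pairs)
```

The absolute value is used so that a strong negative relationship also counts as well correlated, matching the wording of the guideline.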

Report File

In addition to the tabular and graphical results, there is a report file that echoes the input data, selected computational options, and results. To review the report file, press the View Report button at the bottom of the analysis window. When this button is selected a text viewer will open the report file and display it on the screen, as shown in Figure 17. 

Different types and amounts of information will show up in the report file depending on the data and the options that have been selected for the analysis.

Figure 17. Report File