Validating Data Concepts
Data retrieved by the CWMS data capture system pass through an automated validation and transformation process. Arriving data, before any revisions are applied, are preserved in the CWMS database and are identified as raw data. Data that are automatically processed for validation and for transformation into additional data parameters are saved separately to the CWMS database and are identified as potentially revised data.
Manual editing and validation are done through the Data Validation Editor. You can edit or validate data from individual time series icons, through time series datasets, or through validation lists. A validation list is a way to organize your data in the CWMS database for editing or validation.
Selecting Appropriate Datasets for Validation Lists
When setting up a validation list for manual quality review and potential revision, it is important to select datasets identified as revised. Prior to 2018, the CWMS standard for identifying the two classes of raw and revised data was to add "-raw" or "-rev" to the end of the version component of the CWMS dataset description. For example, the raw and revised datasets for a sample time series of stage data are:
Sample Location.Stage.Inst.1Hour.0.GOES-raw
Sample Location.Stage.Inst.1Hour.0.GOES-rev
The raw dataset should never be changed from its original captured value. Therefore, dataset descriptions ending in "-raw" should not appear in validation lists. However, when you select dataset descriptions ending in "-rev" for your validation lists, the Data Validation Editor will automatically retrieve the "-raw" data as well, allowing you to view and compare both the raw and revised data for that dataset.
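The "-raw"/"-rev" pairing can be sketched in a few lines of Python. This is only an illustration of the naming rule described above, not part of CWMS; the function names `split_tsid`, `raw_counterpart`, and `validation_candidates` are hypothetical, and the sketch assumes the description always has six dot-separated components with the version last.

```python
def split_tsid(tsid):
    """Split a dataset description into its six dot-separated components."""
    # rsplit from the right so that any dots in the location name are kept.
    parts = tsid.rsplit(".", 5)
    if len(parts) != 6:
        raise ValueError(f"expected six components, got {len(parts)}: {tsid!r}")
    keys = ("location", "parameter", "type", "interval", "duration", "version")
    return dict(zip(keys, parts))


def raw_counterpart(tsid):
    """Return the "-raw" description that pairs with a "-rev" description."""
    if not split_tsid(tsid)["version"].endswith("-rev"):
        raise ValueError(f"not a revised dataset: {tsid!r}")
    return tsid[: -len("-rev")] + "-raw"


def validation_candidates(tsids):
    """Keep only "-rev" descriptions; "-raw" ones must not be edited."""
    return [t for t in tsids if split_tsid(t)["version"].endswith("-rev")]


if __name__ == "__main__":
    rev = "Sample Location.Stage.Inst.1Hour.0.GOES-rev"
    print(raw_counterpart(rev))
```

Filtering a candidate list with `validation_candidates` before building a validation list is one way to make sure no "-raw" description slips in.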
You should also never include transformed datasets in validation lists. For example, in the CWMS automated transformation process, the sample location stage from the previous example may be used with a rating table lookup to produce a new stream flow record:
Sample Location.Flow.Inst.1Hour.0.GOES-rev
If you edit the flow and store it to the CWMS database, it will probably remain out of sync with the stage dataset. This occurs because transformations are one-way only: the automated CWMS transformation process converts a revised stage into a new revised flow, but not vice versa. If transformed values are edited, the captured data must be manually revised as well to keep the two in agreement.
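The one-way nature of a transformation can be illustrated with a small Python sketch. The rating table values below are invented for illustration, and simple linear interpolation is an assumption; the actual CWMS rating lookup may work differently. The names `RATING`, `stage_to_flow`, and `out_of_sync` are hypothetical.

```python
from bisect import bisect_right
from math import isclose

# Hypothetical rating table of (stage, flow) pairs, sorted by stage.
RATING = [(0.0, 0.0), (2.0, 100.0), (4.0, 400.0), (6.0, 900.0)]


def stage_to_flow(stage):
    """Look up flow for a stage by linear interpolation in RATING."""
    stages = [s for s, _ in RATING]
    if not stages[0] <= stage <= stages[-1]:
        raise ValueError(f"stage {stage} is outside the rating table")
    # Index of the upper bracketing point, clamped so the top stage works.
    i = min(bisect_right(stages, stage), len(RATING) - 1)
    (s0, q0), (s1, q1) = RATING[i - 1], RATING[i]
    return q0 + (q1 - q0) * (stage - s0) / (s1 - s0)


def out_of_sync(stages, flows):
    """Return the indices where a stored flow no longer matches its stage."""
    return [
        i
        for i, (s, q) in enumerate(zip(stages, flows))
        if not isclose(stage_to_flow(s), q)
    ]


if __name__ == "__main__":
    revised_stage = [1.0, 3.0, 5.0]
    revised_flow = [stage_to_flow(s) for s in revised_stage]
    revised_flow[1] = 300.0  # manual edit of the transformed value only
    print(out_of_sync(revised_stage, revised_flow))
```

Nothing in the sketch maps a flow back to a stage, so the edited flow value disagrees with its stage until the stage is also revised by hand.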
Note: After development of CWMS 3.1.1, Districts began implementing the Standard Naming Convention for their CWMS Time-Series Identifiers. The new convention replaces the "-raw" or "-rev" label appended to the version component with one of six possible prefixes to the version component. For example:
Sample Location.Stage.Inst.1Hour.0.Raw-GOES
Sample Location.Stage.Inst.1Hour.0.Best-GOES
The Data Validation Editor in CWMS 3.1.1 does not automatically retrieve the "raw" data corresponding to the other Classification Indicators.
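Because both conventions may appear in the same database, it can help to recognize which one a version component follows. The following Python sketch is an assumption-laden illustration: it knows only the "-raw"/"-rev" suffixes and treats whatever precedes the first hyphen as the prefix, without listing the six Classification Indicators. The function names `describe_version` and `is_raw` are hypothetical.

```python
def describe_version(tsid):
    """Return (convention, label, remainder) for the version component."""
    version = tsid.rsplit(".", 1)[-1]
    # Older convention: label appended as a "-raw" or "-rev" suffix.
    head, sep, tail = version.rpartition("-")
    if sep and tail in ("raw", "rev"):
        return ("suffix", tail, head)
    # Newer convention: label placed in front, e.g. "Raw-GOES".
    head, sep, tail = version.partition("-")
    if sep:
        return ("prefix", head, tail)
    return ("unknown", None, version)


def is_raw(tsid):
    """True if the description is labeled raw under either convention."""
    _, label, _ = describe_version(tsid)
    return label is not None and label.lower() == "raw"


if __name__ == "__main__":
    for tsid in (
        "Sample Location.Stage.Inst.1Hour.0.GOES-raw",
        "Sample Location.Stage.Inst.1Hour.0.Raw-GOES",
        "Sample Location.Stage.Inst.1Hour.0.Best-GOES",
    ):
        print(tsid, describe_version(tsid), is_raw(tsid))
```

A check like `is_raw` could be used to keep raw datasets out of a validation list regardless of which convention a District has adopted.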
Creating Validation Lists – CWMS Database
After reviewing the quality of incoming data, you will have identified locations reporting questionable or invalid data. A validation list is a way to organize your data in the CWMS database for editing or validation. To create a validation list from time series icons in the map window:
- By default, CWMS will select the time series datasets associated with all of the defined time series icons in the map window.
- If you want only a selected set of time series datasets for your validation list, you must select the time series icons you want. From the Acquisition tab (Acquisition Module), from the Map Window, click [tool icon], hold down the SHIFT key, and then click the time series icons you want to include in your validation list.
- From the Edit menu, click Create Validation List. The Create Validation List dialog (Figure 1) will open. The table on the Create Validation List dialog will display a list of the selected datasets.
- In the Save As box (Figure 1), enter a name and click OK. The Create Validation List dialog (Figure 1) closes. By default, the validation list will be saved to the watershed shared directory with the extension .validationEditor.
Creating Validation Lists – Data Status Lists
A useful way to create a validation list is from a data status list:
- From the shared directory of a watershed, copy a data status list (.dataStatus) file to a validation list (.validationEditor) file (e.g., cp FlowGages.dataStatus FlowGages.validationEditor).
- As stated in Data Status List, validation lists should only contain "-rev" records. For the Data Validation Editor to be able to edit records from a validation list, each record needs to begin with PRIMARY=. So, make sure each record in your validation list file starts with PRIMARY=. Figure 2 shows an example of a validation list file (FlowGages.validationEditor) that has been edited.
- If you want this validation list to be a comparison dataset, you will need to add a record that does not include PRIMARY=. Remember that CWMS will automatically include the "-raw" record when you access a "-rev" record in the Data Validation Editor. Figure 3 is an example of a validation list file (FlowGages.validationEditor) that has been edited as a comparison dataset.
- Save the file. You now have a validation list that allows easy validation of the selected datasets.
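The copy-and-edit steps above could be scripted roughly as follows. This is a sketch under a significant assumption: that each non-blank line of the .dataStatus file holds exactly one record consisting of a dataset description. The real file layout (see Figure 2) may differ, so check a file from your own watershed before relying on it. The function names `to_validation_records` and `convert` are hypothetical.

```python
from pathlib import Path

PREFIX = "PRIMARY="


def to_validation_records(lines):
    """Build validation list records from data status list lines.

    Assumes one record per non-blank line (an assumption, not a
    documented format). Drops "-raw" records and makes every
    remaining record start with PRIMARY=.
    """
    records = []
    for line in lines:
        record = line.strip()
        if not record:
            continue
        # Remove an existing prefix so the description can be inspected.
        tsid = record[len(PREFIX):] if record.startswith(PREFIX) else record
        if tsid.endswith("-raw"):
            continue  # raw data must never be edited
        records.append(PREFIX + tsid)
    return records


def convert(status_path, validation_path):
    """Write a .validationEditor file derived from a .dataStatus file."""
    lines = Path(status_path).read_text().splitlines()
    records = to_validation_records(lines)
    Path(validation_path).write_text("\n".join(records) + "\n")


if __name__ == "__main__":
    print(to_validation_records([
        "Sample Location.Flow.Inst.1Hour.0.GOES-rev",
        "Sample Location.Flow.Inst.1Hour.0.GOES-raw",
    ]))
```

A call such as `convert("FlowGages.dataStatus", "FlowGages.validationEditor")` would stand in for the manual copy and edit. The sketch marks every record as PRIMARY=, so a comparison record (one without PRIMARY=) would still have to be added by hand.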