Regular-Interval Data
Regular-interval time series data is stored in "standard size" blocks whose length depends on the time interval of the data. The block sizes are one day, one month, one year, one decade, and one century. The block size is selected automatically; the user does not need to specify it.
For example, daily time interval data are stored in blocks of one year (365 or 366 values), while monthly values are stored in blocks of ten years (120 values). If data does not exist for a portion of a block, the missing values are set to the missing-data flag Heclib.UNDEFINED_DOUBLE, which is -Float.MAX_VALUE.
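A minimal sketch of testing for the missing-data flag when scanning a retrieved block. The constant mirrors the documented value of Heclib.UNDEFINED_DOUBLE; the isMissing helper is illustrative and not part of the Heclib API:

```java
public class MissingValues {
    // Mirrors the documented Heclib.UNDEFINED_DOUBLE value (-Float.MAX_VALUE).
    static final double UNDEFINED_DOUBLE = -Float.MAX_VALUE;

    // Hypothetical helper: true if a block position holds the missing-data flag.
    static boolean isMissing(double v) {
        return v == UNDEFINED_DOUBLE;
    }

    public static void main(String[] args) {
        double[] block = { 12.5, UNDEFINED_DOUBLE, 13.1 };
        for (int i = 0; i < block.length; i++) {
            System.out.println(i + ": " + (isMissing(block[i]) ? "missing" : block[i]));
        }
    }
}
```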
The starting and ending times of a block correspond to standard calendar conventions. Time stamps are not stored for regular-interval data; each value's time is computed from these conventions, using the block start date, the interval, and any time offset. For example, for period-average monthly data in the 1950s, part D (the date part) of the pathname would be 01JAN1950, regardless of when the first valid value occurred (it could start in 1958, for instance). The 1960s block starts on 01JAN1960.
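The block-start convention can be sketched as follows. The blockStart helper and its block-length strings are hypothetical (HEC-DSS computes this internally), but the calendar arithmetic follows the conventions described above:

```java
import java.time.LocalDate;

public class BlockStart {
    // Returns the D-part date for a value at 'date' stored in blocks of the given length.
    static LocalDate blockStart(LocalDate date, String blockLength) {
        switch (blockLength) {
            case "DAY":     return date;                                          // 12Minute .. 1Second data
            case "MONTH":   return date.withDayOfMonth(1);                        // 12Hour .. 15Minute data
            case "YEAR":    return LocalDate.of(date.getYear(), 1, 1);            // 1Day data
            case "DECADE":  return LocalDate.of(date.getYear() / 10 * 10, 1, 1);  // 1Month .. 1Week data
            case "CENTURY": return LocalDate.of(date.getYear() / 100 * 100, 1, 1);// 1Year data
            default: throw new IllegalArgumentException(blockLength);
        }
    }

    public static void main(String[] args) {
        // Monthly data from 1958 falls in the 1950s decade block: D part 01JAN1950.
        System.out.println(blockStart(LocalDate.of(1958, 6, 15), "DECADE")); // 1950-01-01
        System.out.println(blockStart(LocalDate.of(1963, 2, 3), "DECADE"));  // 1960-01-01
    }
}
```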
Period-average data values are stored at the end of the period over which the data is averaged. For example, daily average values are given a time label of 2400 hours on the appropriate day, and monthly average values are labeled 2400 hours on the last day of the month. If values occur at a time other than the end-of-period time, that time offset is stored in the header array.
For example, if daily average flow readings are recorded at 6:00 am (i.e., each value is the average flow from 6:01 am of the previous day to 6:00 am of the current day), then an offset of 360 minutes is stored in the header array, as in the sketch below.
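A sketch of how a time label can be reconstructed from the block start, the interval, and the stored offset. The label helper is illustrative, and treating a zero offset as the standard end-of-period label is an assumption based on the description above:

```java
import java.time.LocalDateTime;

public class OffsetTimes {
    // Hypothetical helper: time label of value 'i' in a block starting at 'blockStart',
    // for a regular interval of 'intervalMin' minutes and a stored header offset of
    // 'offsetMin' minutes (0 = standard end-of-period labeling).
    static LocalDateTime label(LocalDateTime blockStart, int i, int intervalMin, int offsetMin) {
        if (offsetMin == 0) {
            return blockStart.plusMinutes((long) (i + 1) * intervalMin); // e.g. 2400 hours
        }
        return blockStart.plusMinutes((long) i * intervalMin + offsetMin); // e.g. 0600 hours
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(1950, 1, 1, 0, 0); // D part 01JAN1950
        // 2400 hours on 01JAN1950 prints as 0000 on 02JAN1950:
        System.out.println(label(start, 0, 1440, 0));   // 1950-01-02T00:00
        System.out.println(label(start, 0, 1440, 360)); // 1950-01-01T06:00
    }
}
```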
Part E consists of an integer number and an alphanumeric time interval that together specify the regular data interval (a parsing sketch follows the table). The valid intervals and block lengths are:
Valid Data Intervals | Seconds in Interval | Block Length |
--- | --- | --- |
1Year | 31536000 (365 days) | One Century |
1Month | 2592000 (30 days) | One Decade |
Semi-Month | 1296000 (15 days) | One Decade |
Tri-Month | 864000 (10 days) | One Decade |
1Week | 604800 (7 days) | One Decade |
1Day | 86400 | One Year |
12Hour | 43200 | One Month |
8Hour | 28800 | One Month |
6Hour | 21600 | One Month |
4Hour | 14400 | One Month |
3Hour | 10800 | One Month |
2Hour | 7200 | One Month |
1Hour | 3600 | One Month |
30Minute | 1800 | One Month |
20Minute | 1200 | One Month |
15Minute | 900 | One Month |
12Minute | 720 | One Day |
10Minute | 600 | One Day |
6Minute | 360 | One Day |
5Minute | 300 | One Day |
4Minute | 240 | One Day |
3Minute | 180 | One Day |
2Minute | 120 | One Day |
1Minute | 60 | One Day |
30Second | 30 | One Day |
20Second | 20 | One Day |
15Second | 15 | One Day |
10Second | 10 | One Day |
6Second | 6 | One Day |
5Second | 5 | One Day |
4Second | 4 | One Day |
3Second | 3 | One Day |
2Second | 2 | One Day |
1Second | 1 | One Day |
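As a sketch of how an E part might be decoded, the following converts the interval strings from the table above into seconds. The parser and its unit map are illustrative assumptions, not part of the Heclib API:

```java
import java.util.Map;

public class EPart {
    static final Map<String, Integer> UNIT_SECONDS = Map.of(
            "Second", 1, "Minute", 60, "Hour", 3600,
            "Day", 86400, "Week", 604800,
            "Month", 2592000, "Year", 31536000);

    static int intervalSeconds(String ePart) {
        // The multi-day intervals carry no leading integer count.
        if (ePart.equals("Tri-Month")) return 864000;
        if (ePart.equals("Semi-Month")) return 1296000;
        int i = 0;
        while (i < ePart.length() && Character.isDigit(ePart.charAt(i))) i++;
        int count = Integer.parseInt(ePart.substring(0, i)); // integer number
        return count * UNIT_SECONDS.get(ePart.substring(i)); // alphanumeric interval
    }

    public static void main(String[] args) {
        System.out.println(intervalSeconds("15Minute"));  // 900
        System.out.println(intervalSeconds("1Day"));      // 86400
        System.out.println(intervalSeconds("Tri-Month")); // 864000
    }
}
```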
Data Compression
Data compression is automatic for regular-interval data. It is based on repeated values only and is lossless. When a regular-interval time series dataset is stored, the number of consecutive repeated values is counted to determine whether compression is worthwhile. If it is, each value is assigned a single bit indicating whether that position repeats the previous value.
Those bits are used to reconstruct the array when it is read, as in the sketch below. Data compression can reduce some datasets to 3% of their original size, with no loss of precision and typically faster read times (there is less to read).
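A minimal sketch of the repeat-value idea just described, assuming one flag bit per position (a set bit means "repeat the previous value", so only non-repeating values are kept). It illustrates the technique, not HEC-DSS's actual on-disk layout:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class RepeatCompression {
    // Rebuild the full array from the flag bits and the stored distinct values.
    static double[] decompress(BitSet repeats, List<Double> stored, int n) {
        double[] out = new double[n];
        int next = 0;
        for (int i = 0; i < n; i++) {
            out[i] = repeats.get(i) ? out[i - 1] : stored.get(next++);
        }
        return out;
    }

    public static void main(String[] args) {
        // A precipitation-like record: long runs of zeros compress well.
        double[] data = { 0, 0, 0, 0, 1.2, 1.2, 0, 0, 0, 0.4 };
        BitSet repeats = new BitSet(data.length);
        List<Double> stored = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            if (i > 0 && data[i] == data[i - 1]) {
                repeats.set(i);      // flag bit: repeat of the previous value
            } else {
                stored.add(data[i]); // only non-repeating values are kept
            }
        }
        System.out.println("stored " + stored.size() + " of " + data.length + " values");
        System.out.println(java.util.Arrays.toString(decompress(repeats, stored, data.length)));
    }
}
```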
It works best with precipitation values and missing-data flags. There is no data compression for irregular-interval data, so where it makes sense, saving data in regular-interval format gives you a much more efficient file.