Supplementary Information
Record Expansion
When a record is first written to disk, it may not be at its final size. For example, if hourly real-time data is stored in HEC-DSS, then at the start of the month (the block length for hourly data), only one or a few values will be written. The next value (or set of values) written will roughly double the disk space used for that record, because the expanded record no longer fits in its original space: HEC-DSS marks that space as "unused" and writes the expanded record at the end of the file. If this continued, the record would grow incrementally, requiring more disk space for each write.
DSS detects when a record expands and automatically allocates additional disk space in case the dataset continues to expand. The amount of additional space allocated is determined by the number of times the record has expanded before and by the record size. If DSS knows the final maximum size (such as with regular-interval time series data), it will allocate that much space on the third expansion. If it does not know the maximum size, it makes an estimate based on the number of expansions. (An irregular-interval equivalent of hourly data would have only four expansions at its maximum size.)
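The effect of pre-allocation can be illustrated with a toy model. The sketch below is not DSS code. It assumes that each write adds exactly one value, that a relocated record abandons its old space, and that expansions before the third allocate only the exact size needed. The function names are hypothetical and sizes are counted in values rather than bytes.

```c
#include <assert.h>

/* Toy model only; it does not reflect DSS internals. */

/* Total space consumed when every write relocates the record to the end
   of the file at exactly its new size (no pre-allocation). */
long total_space_no_prealloc(long final_values)
{
    assert(final_values >= 1);
    long total = 0;
    for (long size = 1; size <= final_values; size++) {
        total += size;  /* new copy appended; the previous copy becomes dead space */
    }
    return total;
}

/* Total space consumed when the known final size is allocated on the
   third expansion, as described for regular-interval data. */
long total_space_prealloc_on_third(long final_values)
{
    assert(final_values >= 1);
    long total = 0;
    long expansions = 0;
    for (long size = 1; size <= final_values; size++) {
        if (size > 1) {
            expansions++;           /* every write after the first expands the record */
        }
        if (expansions == 3) {
            total += final_values;  /* allocate the full final size once */
            break;                  /* all later writes fit in place */
        }
        total += size;              /* assumed: exact-size allocation until then */
    }
    return total;
}
```

Under these assumptions, an hourly record for a 31-day month (744 values) consumes 277,140 value slots without pre-allocation, of which only 744 hold live data, compared with 750 slots when the final size is allocated on the third expansion.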
Reclamation
When records expand, or records are deleted, unused space is left in the file. In DSS version 7, that space may be reused, or reclaimed. The amount and location of the space are saved in the internal "reclamation table" if the space meets a minimum size threshold.
When new writes occur, the reclamation table is examined to see if there is an unused segment that can be used for that write. If so, the size needed is subtracted from that segment and that space is used for the write. If not, new space is allocated at the end of the file for the write.
This procedure keeps DSS file sizes smaller and operations more efficient. However, when the space from a deleted record is reused, that record cannot be recovered or "undeleted". If it is anticipated that records may be deleted and then need to be recovered, space reclamation may be turned off. In previous versions of DSS, the only way to remove unused space was through a "squeeze" command, which rebuilt the file.
The following function is available to set the reclamation level:
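The lookup described above can be sketched with a small in-memory table. This is a minimal illustration, not the DSS implementation: the type and function names, the fixed table capacity, the threshold value, and the first-fit search order are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SEGMENTS     16  /* assumed capacity for the toy table */
#define MIN_RECLAIM_SIZE 8   /* assumed threshold; the real value is internal to DSS */

typedef struct {
    long address;  /* start of the unused region */
    long size;     /* remaining unused length */
} Segment;

typedef struct {
    Segment segments[MAX_SEGMENTS];
    size_t  count;
    long    file_end;  /* address where new space is appended */
} ToyFile;

/* Record a freed region if it meets the minimum threshold.
   Returns 1 if the region was saved in the table, 0 otherwise. */
int toy_release(ToyFile *f, long address, long size)
{
    if (size < MIN_RECLAIM_SIZE || f->count == MAX_SEGMENTS) {
        return 0;
    }
    f->segments[f->count].address = address;
    f->segments[f->count].size = size;
    f->count++;
    return 1;
}

/* Return the address for a write of `size`, reusing unused space if possible. */
long toy_allocate(ToyFile *f, long size)
{
    assert(size > 0);
    for (size_t i = 0; i < f->count; i++) {
        Segment *s = &f->segments[i];
        if (s->size >= size) {       /* first fit (an assumption) */
            long address = s->address;
            s->address += size;      /* subtract the used part from the segment */
            s->size -= size;         /* exhausted segments stay in the table with size 0 */
            return address;
        }
    }
    long address = f->file_end;      /* nothing fits: extend the file */
    f->file_end += size;
    return address;
}
```

For example, after releasing a 50-unit region at address 100, writes of 20 and 30 units are placed at addresses 100 and 120, and a following 10-unit write goes to the end of the file because the segment is exhausted.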
int zreclaimSet(long long *ifltab, int reclaimLevel);
Where reclaimLevel is:
RECLAIM_NONE 1 - Do not use space reclamation
RECLAIM_EXCESS 2 - Reclaim space left over from extending records, etc. (can recover records)
RECLAIM_ALL 3 - Reclaim all unused space, including deleted records (cannot recover)
Squeeze
A HEC-DSS squeeze rebuilds a DSS file. The rebuild removes any unused (dead) space created from renaming or deleting records, optimizes internal address tables (generally making record access faster), and attempts to repair any broken links or file errors. It is akin to defragmenting a hard drive. However, this operation takes time and requires exclusive access to the file, so it is recommended to perform it only when warranted, usually when about 10% dead space is shown. For active DSS files, this would typically be less than once a month, or not at all. Once a file has been squeezed, deleted records cannot be recovered.
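The 10% rule of thumb can be written as a simple check. The sketch below is only an illustration of that rule; it is not how the library itself decides, and the function name and the way a program obtains the dead-space and file-size figures are assumptions.

```c
#include <assert.h>

#define DEAD_SPACE_THRESHOLD_PERCENT 10  /* rule of thumb from the text */

/* Hypothetical helper: returns 1 when dead space makes up at least the
   threshold percentage of the file, 0 otherwise. */
int dead_space_suggests_squeeze(long long dead_bytes, long long file_bytes)
{
    assert(dead_bytes >= 0 && file_bytes >= dead_bytes);
    if (file_bytes == 0) {
        return 0;  /* empty file: nothing to squeeze */
    }
    /* Integer form of dead/file >= threshold/100, avoiding floating point. */
    return dead_bytes * 100 >= file_bytes * DEAD_SPACE_THRESHOLD_PERCENT;
}
```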
Because of space recycling in DSS version 7, the need to squeeze a file is less than in version 6. You can call the HecDSSUtilities function "squeezeNeeded()" to return a recommendation on whether a file should be squeezed. Alternatively, you can call squeeze with the argument "true" to squeeze only if recommended. However, a DSS file should function correctly whether or not it needs to be squeezed, just not as efficiently. Because of this, squeezes are typically done by the user from HEC-DSSVue, not from a regular program.
A squeeze must have complete exclusive access to the file, because it performs OS-level file renaming and similar operations. You must also have rename and delete privileges for a squeeze. As such, a squeeze operation should be run in a main thread where it can be guaranteed that no other access to the file will be attempted during the squeeze process. The file should be closed before the squeeze, and all other threads that could access it should be held until the squeeze is finished.
To determine if you should squeeze a DSS file, call zsqueezeNeeded, which will return 1 if it is needed:
int zsqueezeNeeded(long long *ifltab);
To squeeze a file, use the following function:
int zsqueeze(const char *dssFilename);
Note that the file must be closed and not accessed by any other process, and you must have permission to rename it.
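The ordering of these steps (check, close, then squeeze) can be sketched as a small wrapper. To keep the example self-contained, the library calls are represented by callbacks: everything named here (SqueezeOps, squeeze_if_needed, the fake_* functions) is hypothetical, and the return convention of 0 for success is an assumption. In a real program the callbacks would wrap zsqueezeNeeded, the program's own close call, and zsqueeze, with the file handle being the ifltab array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper, not part of the DSS library. */
typedef struct {
    int (*squeeze_needed)(void *file);     /* returns 1 if a squeeze is recommended */
    int (*close_file)(void *file);         /* assumed to return 0 on success */
    int (*squeeze)(const char *filename);  /* assumed to return 0 on success */
} SqueezeOps;

enum { SQUEEZE_FAILED = -1, SQUEEZE_SKIPPED = 0, SQUEEZE_DONE = 1 };

int squeeze_if_needed(const SqueezeOps *ops, void *file, const char *filename)
{
    assert(ops != NULL && filename != NULL);
    if (ops->squeeze_needed(file) != 1) {
        return SQUEEZE_SKIPPED;            /* file is left open and untouched */
    }
    if (ops->close_file(file) != 0) {
        return SQUEEZE_FAILED;             /* never squeeze a file that is still open */
    }
    return ops->squeeze(filename) == 0 ? SQUEEZE_DONE : SQUEEZE_FAILED;
}

/* Stand-in callbacks, for demonstration only. */
typedef struct { int needs_squeeze; int is_open; int squeezed; } FakeFile;
FakeFile *fake_current;  /* needed because the squeeze callback only gets a filename */

int fake_needed(void *file) { return ((FakeFile *)file)->needs_squeeze; }
int fake_close(void *file)  { ((FakeFile *)file)->is_open = 0; return 0; }
int fake_squeeze(const char *filename)
{
    (void)filename;
    if (fake_current->is_open) {
        return -1;                         /* models the closed-file requirement */
    }
    fake_current->squeezed = 1;
    return 0;
}
```

The caller remains responsible for holding off every other thread or process that could open the file until the wrapper returns; the sketch does not attempt to enforce that.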