Written March 2019

Overview

The 2019 version of the National Structure Inventory (NSI) represents a substantial improvement over previous iterations. It uses a variety of data inputs in order to improve structure locations, structure characteristics, and population at risk estimates. For an explanation of the NSI generation process, please refer to the official NSI documentation (dated March 2019).

Frequently Asked Questions

Users understandably have many questions about the data they are using. Below are some of the more frequently asked questions about the 2019 release of the National Structure Inventory. Additional detail can be found in the official documentation.

Who can use the NSI data?

The NSI uses CoreLogic data that has a somewhat restrictive use agreement. However, the data is available to Federal, State, Local, Tribal, and Territorial Mission Partners who have a signed Data Use Agreement with HIFLD; this includes USACE, FEMA, and other agencies.

What do the various field names and data values mean?

Please refer to the Technical Documentation. It provides definitions for each field and character values.

What year do the population and prices represent in the 2019 NSI?

Population was indexed to 2017 levels and prices were indexed to 2018. Depending on the analysis year and level of decision-making, it may not be possible or necessary to index the values to the present year.

Where is data available?

The NSI covers all 50 states and the District of Columbia. Many input data sources were not available for territories such as Puerto Rico; as such, no new data was created for these areas. Some counties and other jurisdictions within covered states may also have been missing data. The NSI generator attempts to fill data gaps whenever possible; at a minimum, 2014 HAZUS based inventories were used to fill such gaps.

Why does RES1 content value equal the structure value?

The EGM depth-damage curves expect content value to equal structure value. HEC consequences software, such as HEC-FIA and HEC-LifeSim, uses the EGM curves for RES1s by default. If you are not using the EGM curves, then you should use a more accurate content-to-structure value percentage, such as 50%.
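As a minimal sketch of that arithmetic, the snippet below derives a content value from a structure value using a user-chosen ratio. The function name, the sample dollar amount, and the 1.0 default are illustrative assumptions and are not part of any HEC tool.

```python
def content_value(structure_value: float, ratio: float = 1.0) -> float:
    """Return content value as a fraction of structure value.

    ratio=1.0 mirrors the NSI convention for RES1s (content equals structure);
    ratio=0.5 is the 50% alternative mentioned in the text.
    """
    if structure_value < 0 or ratio < 0:
        raise ValueError("structure_value and ratio must be non-negative")
    return structure_value * ratio


# Hypothetical $200,000 structure
default_content = content_value(200_000)    # content equals structure value
half_content = content_value(200_000, 0.5)  # 50% assumption
```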

How reliable are the provided foundation heights?

It depends on the level of decision-making and study area. The foundation types are randomly assigned to structures based on probabilities that vary by several different conditions. Users should confirm that the distribution of foundation heights is reasonable for their study area.

Common Issues

This section describes common issues that remain in the dataset, explains how they occurred, offers advice on how users may mitigate them if necessary, and briefly describes how future versions of the NSI Generator may handle similar issues.

Structures in the street and other odd places


Figure 1 – Misplaced Structures in roadways
One of the main improvements of the 2019 version of the NSI is the placement of structures within Census blocks. With the incorporation of parcel data and Microsoft building footprints, most structures are now located at the centroid of their own building footprint.
However, many Esri based structures are initially located based on street address; if the street address's XY location is outside of a parcel, then the NSI generator is unable to improve the structure's location. This is because the NSI generator attempts to pair structures with a building footprint that lies within the same parcel polygon; if they are not in a parcel polygon then they cannot be paired with a footprint.
This may not be a significant issue for many use cases, as the street address of a structure may be at a similar ground elevation to the true location of the structure. However, if exact structure placement is a concern, users may identify Esri based structures by examining the "Source" field; Esri based structures are marked "E." Users can then manually refine the placement of those structures. It is important to note that every unique business is given its own record/"dot," so multiple businesses can be located within the same footprint (malls, office buildings, etc.).
HAZUS based structures, marked "H" in the "Source" field, may also appear in strange locations. These structures are used when parcel or Esri data is unavailable. If there are no parcel polygons for that area, or if there are more HAZUS structures than there are footprints, HAZUS structures will be left in their initial locations (these structures were uniformly distributed across "developed" portions of the Census block).
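Selecting Esri based or HAZUS based records for manual review might look like the sketch below. It assumes the inventory has already been loaded as a list of dictionaries keyed by field name; the "id" field, the sample records, and the "X" code (standing in for any other source) are hypothetical.

```python
from typing import Dict, List


def structures_by_source(records: List[Dict], code: str) -> List[Dict]:
    """Return the records whose "Source" field matches the given code.

    The comparison ignores case and surrounding whitespace.
    """
    wanted = code.strip().upper()
    return [r for r in records if str(r.get("Source", "")).strip().upper() == wanted]


# Hypothetical sample records; "X" stands in for any other source code
sample = [
    {"id": 1, "Source": "E"},
    {"id": 2, "Source": "X"},
    {"id": 3, "Source": "H"},
    {"id": 4, "Source": "e "},
]

esri_based = structures_by_source(sample, "E")   # candidates for street-address placement
hazus_based = structures_by_source(sample, "H")  # candidates for uniform placement
```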
Future versions of the NSI generator will attempt to address these issues by allowing commercial structures to be paired with unoccupied footprints that lie within commercial parcels in the same Census block as their initial location.

Structures Reporting Unrealistic Number of Stories


Figure 2 – Estimated structure heights relative to Google Earth 3d Building representations of development
The NSI estimates each structure's number of stories individually. For RES1s this process is relatively simple: each structure is randomly assigned a number of stories depending on the Census region and construction year of the structure.
For other types of structures, the process is more complicated, but it is ultimately decided by two main factors: 1) the estimated square footage that a structure needs to accommodate the stated number of residential units or employees, and 2) the estimated footprint size of the structure. If the square footage estimate is too large, or if the footprint size is too small, then the NSI Generator will estimate that the structure is taller than it actually is. In the example above, the available Esri and LEHD data both suggested the building had ~10,000 employees; however, the building was actually a staffing office for nurses who work offsite.
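The interaction of those two factors can be illustrated with a deliberately simplified sketch. This is not the NSI Generator's actual algorithm, which the text describes as more complicated; the function name, the 200 square feet per employee, and the 20,000 square foot footprint are all hypothetical values chosen for illustration.

```python
import math


def estimate_stories(required_sqft: float, footprint_sqft: float) -> int:
    """Estimate stories as required floor area divided by footprint area.

    The result is rounded up and never drops below one story.
    """
    if footprint_sqft <= 0:
        raise ValueError("footprint_sqft must be positive")
    return max(1, math.ceil(required_sqft / footprint_sqft))


# Hypothetical inputs: an inflated employee count drives the height estimate
sqft_per_employee = 200                   # assumed, not an NSI parameter
inflated_sqft = 10_000 * sqft_per_employee
plausible_sqft = 50 * sqft_per_employee

inflated_stories = estimate_stories(inflated_sqft, 20_000)
plausible_stories = estimate_stories(plausible_sqft, 20_000)
```

With these assumed inputs, the inflated employee count yields a 100-story estimate, while a count of 50 employees yields a single story for the same footprint.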
For life safety purposes, a poor estimate of the number of stories might not always matter. Regardless of whether a structure is 5 stories or 50 stories, few floods are likely to submerge it. However, being two stories instead of one may matter a great deal for that particular structure's occupants. It is important to note that the algorithm is not perfect, and that if a user's study area contains a critical structure with a large PAR, they should manually correct the number of stories field if necessary. Furthermore, if the number of stories estimate is off by a substantial degree, it is likely that the square footage and structure value are also poorly estimated and may need to be corrected.
Future versions of the NSI Generator will attempt to address this issue by limiting outlier data points and refining assumptions around the square footage per employee for each occupancy type. Future versions may also attempt to estimate number of stories based on available LIDAR data.

Missing Structures


Figure 3 – Satellite imagery of a recently constructed neighborhood that has no matching NSI structures
There may be some structures that appear in satellite imagery but do not have matching NSI structures; this could be for any of several reasons. In the example figure, the development is more recent than the parcel data available (which only has structures built in 2017 or earlier). In this case, users would have to manually create new structures if they wish to account for the development.
In other cases, a commercial structure's XY location may be in a nearby street. In this case, users may simply move the structure manually to its matching footprint.
It is also possible that the input data did not correctly identify the number of structures. For example, a commercial complex may have a single record despite there being several distinct physical structures on the property. In this case, users may need to create new structures and values, or distribute the value and PAR of existing structures across more spatial locations.
In some parcel data, a large residential subdivision may simply be recorded as a single "Single Family Residence." In this case, the created structure may absorb the population of an entire census block; the NSI generator attempts to recognize and address this issue by splitting any RES1 that receives more than 15 PAR into multiple copies of the structure that each have no more than 10 PAR. These structures are marked "S" in the "Stacked" field.
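The splitting rule can be sketched as follows. The thresholds (more than 15 PAR triggers a split, and each copy receives no more than 10 PAR) come from the text above; the even distribution across copies and the whole-number PAR are assumptions, since the text does not say how the generator divides population among the copies.

```python
import math
from typing import List


def split_par(par: int, trigger: int = 15, max_per_copy: int = 10) -> List[int]:
    """Split a structure's PAR into copies holding at most max_per_copy each.

    Structures at or below the trigger are returned unchanged as one entry.
    """
    if par <= trigger:
        return [par]
    copies = math.ceil(par / max_per_copy)
    base, extra = divmod(par, copies)
    # The first `extra` copies take one additional person so the total is preserved.
    return [base + 1] * extra + [base] * (copies - extra)


unchanged = split_par(15)
small_split = split_par(16)
large_split = split_par(95)
```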

Structures underrepresenting population


Figure 4 – Data table focusing on RES5s and RES6s along with their population estimates
RES4s, RES5s, and RES6s are Hotels, Institutional Dormitories, and Nursing Homes, respectively. The NSI Generator pulls their data from Esri; as such, they are treated as commercial structures rather than residential structures. However, these structures are unique in that they likely have both commercial and residential PAR, and the 2019 version of the NSI only accounts for commercial workers.
This means that any semi-permanent residents of RES5s and RES6s are not assigned to these structures; instead, they are assigned to other residential structures within the Census block. While RES4s may have no permanent PAR that could be counted by the Census, hotel guests would likely outnumber workers. For large study areas, this may not matter: much of the PAR that should be located at these structures is instead located elsewhere within the study area, possibly even within the same Census block. However, if Census blocks are only partially flooded and a particular nursing home, hotel, prison, etc. makes up a significant part of the PAR, then it may be appropriate to manually adjust the attributes of these structures.
Future versions of the NSI will attempt to address this weakness by allowing RES5s and RES6s to pull population from both residential and commercial pools, by creating a new customer/guest pool that RES4s may pull from, and by assuming a significant portion of RES6 PAR is over 65 years of age.

Structures with no people


Figure 5 – Four HAZUS based structures with no population assigned and the attributes of one of these
Some NSI structures have no population. This should be rare for residential structures, because the population growth adjustment tries to ensure that there is a residential population pool for any Census block that saw residential unit growth even if that Census block had no population in the 2010 Census.
Zero population will be most common for commercial structures, which pull population from the commercial pool established by LEHD data. If there is a misalignment between the business location in the source data and the LEHD business location, then there may be no population for a commercial structure to pull from. This appears to be most common for HAZUS based structures, which are only used when Esri did not identify any businesses located within that Census block. Therefore, a zero population HAZUS structure suggests that neither Esri nor LEHD indicated that there were any businesses or employees located in that Census block. Nonetheless, the NSI Generator leaves these structures in the inventory so that users can examine the situation themselves if they so choose; if users identify that a commercial structure is present, then they can modify the structure location and attributes as necessary.
It is important to note that because square footage is partially estimated by number of employees, zero population structures will also usually have zero structure value. Therefore, if these structures do not actually exist, then the only thing that should be overestimated within a model is the structure count, not life loss or dollar damages. Users may simply delete these structures if necessary.
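A quick way to review these records is to separate zero population structures that also have zero value from those that still carry value. The sketch below assumes the inventory is a list of dictionaries; the field names "pop" and "val_struct" are hypothetical placeholders, so substitute the names given in the Technical Documentation.

```python
from typing import Dict, List, Tuple


def split_zero_population(
    records: List[Dict],
    pop_field: str = "pop",            # hypothetical field name
    value_field: str = "val_struct",   # hypothetical field name
) -> Tuple[List[Dict], List[Dict]]:
    """Partition zero-population records by whether they carry structure value.

    Returns (no_value, has_value). Records with nonzero population are skipped.
    """
    no_value, has_value = [], []
    for record in records:
        if record.get(pop_field, 0) == 0:
            if record.get(value_field, 0) == 0:
                no_value.append(record)
            else:
                has_value.append(record)
    return no_value, has_value


# Hypothetical sample records
sample = [
    {"id": 1, "pop": 0, "val_struct": 0},
    {"id": 2, "pop": 12, "val_struct": 350_000},
    {"id": 3, "pop": 0, "val_struct": 90_000},
]
no_value, has_value = split_zero_population(sample)
```

Records in the first group only inflate the structure count, while records in the second group may still contribute dollar damages and deserve a closer look.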

Structures with too many people


Figure 6 – Attributes of one structure that was assigned too many people
The NSI Generator algorithmically added county population growth between 2010 and 2017 to individual Census blocks. Census blocks that saw the biggest growth in housing units had population added before Census blocks that did not see an increase in housing units.
However, on rare occasions, the input data for calculating housing units has significant errors. In this example, a parcel indicated that it had 4,444 housing units; an inspection of satellite imagery and Google Street View indicates that this structure is far too small to possess that many housing units. Regardless, because the NSI generator assumed 4,444 housing units, the population growth algorithm added a substantial amount of population to the block that was later assigned to the structure itself (population assignment to residential structures is weighted by housing units).
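The effect of weighting by housing units can be seen in a small sketch. The proportional allocation below is an assumed simplification of the generator's assignment step, and the block population and the other structures' unit counts are hypothetical; only the 4,444 figure comes from the example.

```python
from typing import List


def allocate_population(block_population: float, housing_units: List[int]) -> List[float]:
    """Share a block's population among structures in proportion to housing units."""
    total_units = sum(housing_units)
    if total_units == 0:
        return [0.0 for _ in housing_units]
    return [block_population * units / total_units for units in housing_units]


# Three ordinary structures plus one record erroneously reporting 4,444 units
units = [1, 1, 2, 4444]
shares = allocate_population(1000, units)
erroneous_share = shares[-1] / sum(shares)  # fraction absorbed by the bad record
```

Under these assumptions, the erroneous record absorbs more than 99% of the block's population, leaving less than one person for each single-unit structure.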
Users should inspect such outliers and make adjustments as necessary. If the erroneously populated structure could affect life loss results, it may be appropriate to simply delete the structure. If the deleted population is significant and there are known new sources of development that are within the same county and also within the study area, then it may be appropriate to manually add the population to those structures.
Future versions of the NSI Generator will attempt to address such issues by looking for outliers and putting in limits for housing units and population increases.

Structures With Improper Foundation Heights


Figure 7 – Google Street View image of a structure compared with its assigned attributes
Foundation heights are often a driving assumption for economic damages. It can be critical not only to assume the correct mean and standard deviation within a study area, but also to stratify such distributions when appropriate. The NSI attempts to account for this by having different distributions for 1) different Census regions, 2) Riverine vs Coastal areas, 3) A and V zones within coastal areas, and 4) whether the structure was built before or after NFIP entry.
Nonetheless, 1) these groups are still very broad, 2) the distributions use potentially outdated data, 3) the data may not be generalizable to any particular study area, and 4) structures are randomly assigned a foundation type and height from within the distribution, meaning that even if the distribution is accurate overall, any particular structure may not match reality.
This is also true of the "Num_Story" field for RES1s: structures are randomly assigned a number of stories based on their Census region and year built.
Users should do due diligence to ensure that the assumed distributions are reasonable for their study area and level of decision-making. Furthermore, users should ensure that any individual structures that may have a significant effect on the overall results have accurate attributes.
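One way to perform that check is to summarize the assigned foundation heights by group and compare the results with local knowledge. The sketch below uses only the standard library; the field names "found_type" and "found_ht" and the sample heights are hypothetical placeholders.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def summarize_heights(
    records: List[Dict],
    group_field: str,
    height_field: str,
) -> Dict[str, Tuple[float, float, int]]:
    """Return {group: (mean height, population std. dev., count)}."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_field]].append(float(record[height_field]))
    return {g: (mean(v), pstdev(v), len(v)) for g, v in groups.items()}


# Hypothetical sample: heights in feet, grouped by foundation type
sample = [
    {"found_type": "Slab", "found_ht": 1.0},
    {"found_type": "Slab", "found_ht": 3.0},
    {"found_type": "Pier", "found_ht": 4.0},
]
summary = summarize_heights(sample, "found_type", "found_ht")
```

The same function can be pointed at other grouping fields, such as a flood zone or construction period, to check whether the stratification looks reasonable for the study area.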
Future versions of the NSI Generator will attempt to address this issue by incorporating results from a large nationwide survey of structure characteristics. This could provide more reasonable and up-to-date distributions of structure characteristics and, through machine learning or other methods, support predicting the attributes of individual structures.

Structures that were cloned


Figure 8 – Duplicated parcels
The NSI Generator is reliant on its data inputs. For the most part, these inputs are clean data with few inconsistencies or errors. However, during the course of development, it became evident that there are sometimes duplicate data entries in both the residential parcel data and the commercial Esri data.
The NSI Generator attempts to prevent double counting of structures and cascading issues by eliminating duplicates before adding population or structure values. To do so, it runs a variety of checks to look for duplicated attributes at similar locations. However, because parcel data comes from many different counties that can all be wrong in different ways (likewise with data reported from businesses), the generator is not always successful in identifying duplicates.
In the example image above, parcel data indicated that there were a number of units and buildings reported at this site. Unfortunately, there were 89 different copies of that parcel (one for each condominium unit at the site) and many of them had slightly different attribute entries, which caused the generator's duplicate filter to miss the problem and create many more units than it should have.
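Users who want to screen their own study area for leftover duplicates could start with something like the sketch below. It is not the generator's duplicate filter; it is an assumed, simplified check that groups records sharing a normalized name and a rounded location. The field names "x", "y", and "name" and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_possible_duplicates(records: List[Dict], precision: int = 4) -> List[List[Dict]]:
    """Group records with the same normalized name at the same rounded location.

    precision is the number of decimal places kept in the coordinates; four
    decimal degrees is on the order of ten meters in latitude.
    """
    groups = defaultdict(list)
    for record in records:
        # Lowercase the name and collapse repeated whitespace before comparing.
        name = " ".join(str(record["name"]).lower().split())
        key = (round(record["x"], precision), round(record["y"], precision), name)
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample records
sample = [
    {"id": 1, "x": -90.12341, "y": 30.56781, "name": "Oak Condos"},
    {"id": 2, "x": -90.12343, "y": 30.56779, "name": " oak  condos "},
    {"id": 3, "x": -90.20000, "y": 30.60000, "name": "Oak Condos"},
]
possible_duplicates = group_possible_duplicates(sample)
```

A check this simple shares the weakness described above: records whose attributes differ slightly, or whose coordinates fall on opposite sides of a rounding boundary, will not be grouped together.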
This underlying issue may manifest itself as a variant of previously discussed problems. Users should examine their inventories for outliers such as buildings that have an unreasonable number of units, employees, PAR, or number of stories.
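A simple screening pass for such outliers is sketched below. The limits are arbitrary, user-chosen values rather than NSI rules, and apart from "Num_Story" the field names are hypothetical placeholders.

```python
from typing import Dict, List, Tuple


def flag_outliers(records: List[Dict], limits: Dict[str, float]) -> List[Tuple[int, List[str]]]:
    """Return (record id, fields over their limit) for each record that exceeds a limit."""
    flagged = []
    for record in records:
        exceeded = [field for field, limit in limits.items() if record.get(field, 0) > limit]
        if exceeded:
            flagged.append((record["id"], exceeded))
    return flagged


# Arbitrary screening limits chosen for illustration only
limits = {"Num_Story": 60, "units": 500, "par": 2000}

# Hypothetical sample records
sample = [
    {"id": 1, "Num_Story": 2, "units": 1, "par": 3},
    {"id": 2, "Num_Story": 3, "units": 4444, "par": 9000},
    {"id": 3, "Num_Story": 110, "units": 0, "par": 150},
]
flagged = flag_outliers(sample, limits)
```

Flagged records are candidates for inspection against imagery, not automatic deletions; appropriate limits depend on the study area.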
Future versions of the NSI Generator will attempt to address this issue by building a more robust set of duplicate filter criteria.

Unique Structures


Figure 9 – Parcel polygon for a state penitentiary
Certain types of structures and developments will typically need special attention from users of the National Structure Inventory. For instance, in the above figure, a state penitentiary is represented by a single parcel. The parcel contains no information on the number of buildings or "units." Instead, the parcel is simply recorded as a "Public" land use type. LEHD information records the number of workers (guards, etc.), but data is lacking on prisoners.
In such cases where parcel and commercial data is unhelpful, the NSI Generator attempts to fill data gaps with previous versions of the NSI that were largely based on HAZUS/Census data. However, for unknown reasons, the prisoners of this facility were recorded at a different Census block that is located more than a mile away, far outside of the parcel polygon and the potential study area.
Many Microsoft building footprints do exist for the area and the NSI Generator moved what structures it did have data for to the largest footprints among them. However, this means that it moved a small group of RES1s to the higher security prison buildings.
In special cases such as this, when multiple data sources fail to provide useful information, there is little ability for algorithmic improvements. Users may need to completely remove the generated inventory or make significant modifications to the PAR and structure characteristics of the structures that were generated.