Reading Natality Data

In the United States, state laws require that birth certificates be completed for all births, and federal law mandates the national collection and publication of births and other vital statistics data. The National Vital Statistics System, the federal compilation of these data, results from cooperation between the National Center for Health Statistics (NCHS) and the states to provide access to statistical information from birth certificates.

The National Center for Health Statistics makes publicly available annual natality micro-data files covering all births occurring within the United States in a given calendar year. The data are abstracted from birth certificates filed in the vital statistics offices of each state and the District of Columbia.

Since 2005, national vital statistics micro-data files that include state, county, or larger city geography have no longer been available without approval. This restriction was established to preserve confidentiality: the restricted files with geographic information are so detailed that open public access is not appropriate.

The NCHS provides restricted-access annual natality datasets with state, county, and geographic information to researchers upon request, in text format. Researchers interested in obtaining these data should contact the NCHS. The restricted-access datasets do not come with a dictionary for reading the data into standard statistical software or for labeling variables and variable categories.

As a service to the research community, we are making available Stata and R code to read restricted-access datasets with state, county, and geographic information. This code is similar to the code the NBER provides for public-access natality data, but it includes variables capturing state, county, and geography. It also includes value labels for all variables.
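To illustrate what such a dictionary does, here is a minimal sketch, written in Python purely for illustration and assuming the text files use a fixed-width layout. It slices each line into named variables and then replaces coded values with labels. The variable names, column positions, codes, and sample records below are all hypothetical; they are not the NCHS record layout and are not taken from the Stata or R files offered on this page.

```python
import io

# Hypothetical layout: (variable name, start column, end column),
# 1-indexed and inclusive. Invented for illustration only; this is
# NOT the NCHS record layout.
LAYOUT = [
    ("birth_year", 1, 4),
    ("state_code", 5, 6),
    ("sex", 7, 7),
]

# Hypothetical value labels, also invented for illustration.
VALUE_LABELS = {
    "sex": {"M": "Male", "F": "Female"},
}


def parse_record(line):
    """Slice one fixed-width line into a dict of raw string values."""
    return {name: line[start - 1:end].strip() for name, start, end in LAYOUT}


def label_record(record):
    """Return a copy of the record with coded values replaced by labels.

    Variables without labels, and codes without a matching label,
    are passed through unchanged.
    """
    return {
        name: VALUE_LABELS.get(name, {}).get(value, value)
        for name, value in record.items()
    }


def read_fixed_width(handle):
    """Parse and label every non-empty line from an open text file."""
    return [
        label_record(parse_record(line.rstrip("\n")))
        for line in handle
        if line.strip()
    ]


# Two made-up records standing in for a restricted-access text file.
sample = io.StringIO("2010NJF\n2010NYM\n")
rows = read_fixed_width(sample)
print(rows[0])
# prints {'birth_year': '2010', 'state_code': 'NJ', 'sex': 'Female'}
```

Keeping the layout and the value labels in separate tables mirrors the two jobs a dictionary performs: locating each variable in the record, and attaching readable categories to its codes.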

We currently provide code to read restricted-access natality data into Stata and R for the years 2010 to 2023. We will keep expanding this resource by adding code for natality data from earlier and more recent years. Please note that these files may contain typos and errors. If you use them and find errors, please contact Sarah Galbenski (sarah.galbenski@princeton.edu).

We acknowledge funding from the Russell Sage Foundation. Website created by Amy Johnson, updated by Sarah Galbenski.

Suggested citation:

Stata files:

Galbenski, Sarah, Hye Jee Kim, and Florencia Torche. 2025. “Code for Reading Restricted Access Natality Data into Stata and Assigning Variable and Value Labels, Year(s) [XXXX-YYYY]”, Princeton University, Office of Population Research. https://florenciatorche.github.io/ReadNatalityData/

R files:

Galbenski, Sarah, Hye Jee Kim, and Florencia Torche. 2025. “Code for Reading Restricted Access Natality Data into R and Assigning Variable and Value Labels, Year(s) [XXXX-YYYY]”, Princeton University, Office of Population Research. https://florenciatorche.github.io/ReadNatalityData/


Stata do files    R scripts
2010              2010
2011              2011
2012              2012
2013              2013
2014              2014
2015              2015
2016              2016
2017              2017
2018              2018
2019              2019
2020              2020
2021              2021
2022              2022
2023              2023