
  • New package getcensus on SSC: Load American Community Survey data from the Census Bureau API into Stata

    With thanks to Kit Baum, I’m pleased to share that a new package getcensus is available on SSC.

    getcensus loads American Community Survey (ACS) data from the U.S. Census Bureau API into Stata. It also allows users to search the API data dictionaries for ACS variables and tables of interest. The package requires Stata version 13.1 or later.
    ssc install getcensus
    Users may load a single ACS table or a selection of ACS data points into memory by specifying a table ID or list of variable IDs, respectively, to getcensus. Because users rarely know these IDs offhand, getcensus catalog allows users to load the API data dictionaries into memory, optionally filtered by search term.

    Features include:
    • Support for 24 geography types and many geographic components
    • Retrieve data for multiple years with a single command
    • A dialog box for users less familiar with Stata syntax
    The online documentation provides a complete description of features and options.

    getcensus uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.

    Example 1

    Load ACS variables S1701_C02_001 (population in poverty) and S1701_C03_001 (poverty rate) for the state of Virginia (FIPS code 51) for the most recent five years.
    getcensus S1701_C02_001 S1701_C03_001, years(2015/2019) statefips(51) clear
    A glimpse at the retrieved data:
    . list, noobs abbreviate(14)
      | year   state       name   s1701_c02_001e   s1701_c02_001m   s1701_c03_001e   s1701_c03_001m |
      | 2015      51   Virginia           909346            25335             11.2               .3 |
      | 2016      51   Virginia           896440            28235               11               .3 |
      | 2017      51   Virginia           874483            27673             10.6               .3 |
      | 2018      51   Virginia           884647            27692             10.7               .3 |
      | 2019      51   Virginia           822775            24460              9.9               .3 |

    By default, data retrieved by getcensus will include metadata from the API data dictionaries as variable labels and notes (shown below).
    . notes s1701_c03_001e
      1.  Variable: Estimate!!Percent below poverty level!!Population for whom poverty status is
    Example 2

    Load table B19013, median household income, for all ZIP Code Tabulation Areas (ZCTAs) in the nation. The option sample(5) specifies five-year estimates, the only type available for ZCTAs.
    getcensus B19013, geography(zcta) sample(5) clear
    A glimpse at the retrieved data:
    . list in 1/5, noobs abbreviate(21)
      | year   state   zipcodetabulationarea           geo_id          name   b19013_001e   b19013_001m |
      | 2019      01                   35004   8600000US35004   ZCTA5 35004         62902          7145 |
      | 2019      01                   35005   8600000US35005   ZCTA5 35005         50473          6548 |
      | 2019      01                   35006   8600000US35006   ZCTA5 35006         55909         11656 |
      | 2019      01                   35007   8600000US35007   ZCTA5 35007         73587          4404 |
      | 2019      01                   35010   8600000US35010   ZCTA5 35010         41954          3441 |
    Example 3

    Load the API data dictionary of every variable whose description matches "educational attainment" in the subject tables ("ST").
    getcensus catalog, search(educational attainment) product(ST)
    The first observation of the retrieved dataset:
    . list in 1, noobs
      | table_id |                                  table_name |     variable_id |
      |  S0102PR | POPULATION 60 YEARS AND OVER IN PUERTO RICO | S0102PR_C01_033 |
      |                                                       variable_descrip   |
      |  Estimate!!Total!!EDUCATIONAL ATTAINMENT!!Population 25 years and over   |

  • #2
    Hi Claire-- when I paste the code you suggest I get the following:

    . getcensus S1701_C02_001 S1701_C03_001, years(2015/2019) statefips(51) clear
    (1 var, 14 obs)
    Link to data for 2015
    (1 var, 14 obs)
    Link to data for 2016
    (1 var, 14 obs)
    Link to data for 2017
    (1 var, 14 obs)
    Link to data for 2018
    (1 var, 14 obs)
    Link to data for 2019
    variable state not found

    Note the "variable state not found" error at the end. I've tried this for several states and get the same error. Any ideas?
    Thanks, Patrick


    • #3
      Claire Zippel you have no idea how helpful this is to me. Out of curiosity, are you able to get estimates for Puerto Rico too? I know they sort of have their own survey.


      • #4
        patrick thornton Apologies for the late response. Would you please file a bug report on the getcensus GitHub page?


        • #5
          Jared Greathouse Yes, you can retrieve Puerto Rico estimates with getcensus by specifying
          . There are some data tables unique to the Puerto Rico Community Survey; you can find their table IDs by running
          getcensus catalog, search("Puerto Rico")


          • #6
            patrick thornton In case you're still having the issue you flagged: it's very likely due to an invalid API key. The next version of getcensus will throw a more informative error message. If you've confirmed that your API key is valid and are still seeing the error, please file an issue on GitHub. Thanks!


            • #7
              Claire Zippel Hello! I am trying to recover data from the US Census using the Stata user-written package getcensus. The command looks easy enough to use (see my code below). But the dataset that getcensus recovers for me has all empty cells where I expected non-empty cells. See screenshot below. What might I be doing wrong?

              loc vars2Get ///
              S0901_C01_032 ///
              S0901_C01_033 ///
              S0901_C01_034 ///
              S0901_C02_032 ///
              S0901_C02_033 ///
              S0901_C02_034 ///
              S0901_C03_032 ///
              S0901_C03_033 ///
              S0901_C03_034 ///
              S0901_C04_032 ///
              S0901_C04_033 ///

              getcensus ///
              `vars2Get' ////
              , years(2012/2021) ///
              sample(5) ///
              geography(tract) ///
              table(DT) ///
              statefips("37") ///
              key($censusKey) ///


              • #8
                Elc Estrera It looks like table S0901 isn't available at the census tract level. If you look the table up online and try to set the Geography filter to census tract, you'll see "No results found."

                getcensus gives you an empty dataset, rather than an error message, because that's what the Census API is returning. You can view the API response by pasting the "Link to data" that getcensus prints to the Results pane into a web browser. If you do that for the 2021 part of your query, you'll see that getcensus has imported the data successfully from the API; it's just that the data the API sent is all null.



                • #9
                  Claire Zippel Thanks!


                  • #10
                    This looks like an awesome tool! However, I'm running into a problem. I am trying to extract data for all tracts in the US but am getting an error message that suggests I need to specify a state. Is there a way around this?

                    getcensus B02001 , geography(tract) sample(5) year(2019)

                    A single state code must be specified in statefips() with geography(tract)
                    r(198);


                    • #11
                      Mary Fischer - No, as the error message indicates, statefips() is a required option with geography(tract). When requesting tract-level data, the Census Bureau API requires that a state be specified, and so getcensus does too. You could write a loop that retrieves tract-level data for each state FIPS code with getcensus, and then append it all into a single dataset.
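
                      A minimal sketch of such a loop, reusing the table, sample, and year from the question in #10. It assumes a valid Census API key is already configured and that statefips() accepts two-digit codes with a leading zero; the tempfile name alltracts and the flag first are only illustrative. The list covers the 50 states and DC, so add 72 if you also want Puerto Rico.

```stata
* Sketch only: fetch tract-level data one state at a time, then stack the results.
* Assumes getcensus is installed and an API key is already set up.
tempfile alltracts
local first = 1

* State FIPS codes for the 50 states and DC (03, 07, 14, 43, and 52 are unused)
foreach fips in 01 02 04 05 06 08 09 10 11 12 13 15 16 17 18 19 20 ///
                21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 ///
                38 39 40 41 42 44 45 46 47 48 49 50 51 53 54 55 56 {

    * Load this state's tracts into memory, replacing whatever is there
    getcensus B02001, geography(tract) sample(5) years(2019) statefips(`fips') clear

    * From the second state onward, add the states accumulated so far
    if !`first' append using `alltracts'

    * Save the running total for the next iteration
    save `alltracts', replace
    local first = 0
}
```

                      When the loop finishes, the data in memory hold every requested state. If one request fails partway through (for example, a dropped connection), the loop stops with an error, so for a long run it may be worth saving to a permanent file instead of a tempfile.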

