Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Navigating down the tree of a nested JSON that is returned from an API for import into Stata

    Hi all,

    I'm struggling a bit to understand the insheetjson documentation, at least as it pertains to navigating the JSON that gets returned to me from the Bureau of Economic Analysis's API. I am trying to retrieve the regional price parity (RPP) data. I realize that I can get this same data from import fred, but for reasons that aren't worth getting into, I need to get it directly from the BEA. To get a sense of the API return, I have been running the below code (you will need to enter your own BEA API key for this to work for you.

    Code:
    local url https://apps.bea.gov/api/data/?UserID={YOUR API KEY HERE}&method=GetData&datasetname=Regional&T ableName=SARPP&LineCode=1&Year=2018&GeoFips=STATE& ResultFormat=JSON
    insheetjson using "`url'", showresponse // flatten
    This returns the following:

    Code:
    Response from server:
    {
     "BEAAPI" :
      {
       "Request" :
        {
         "RequestParam" :
          [
           {
            "ParameterName" : "YEAR",
            "ParameterValue" : "2018"
           },
           {
            "ParameterName" : "LINECODE",
            "ParameterValue" : "1"
           },
           {
            "ParameterName" : "TABLENAME",
            "ParameterValue" : "SARPP"
           },
           {
            "ParameterName" : "DATASETNAME",
            "ParameterValue" : "REGIONAL"
           },
           {
            "ParameterName" : "METHOD",
            "ParameterValue" : "GETDATA"
           },
           {
            "ParameterName" : "USERID",
            "ParameterValue" : "402D3B19-A3ED-4D30-AAFE-363B1F7F0115"
           },
           {
            "ParameterName" : "GEOFIPS",
            "ParameterValue" : "STATE"
           },
           {
            "ParameterName" : "RESULTFORMAT",
            "ParameterValue" : "JSON"
           }
          ]
        },
       "Results" :
        {
         "Statistic" : "RPPs: All items",
         "UnitOfMeasure" : "Index",
         "PublicTable" : "SARPP Regional Price Parities by state",
         "UTCProductionTime" : "2022-07-18T14:08:17.377",
         "NoteRef" : " ",
         "Dimensions" :
          [
           {
            "Name" : "Code",
            "DataType" : "string",
            "IsValue" : "0"
           },
           {
            "Name" : "GeoFips",
            "DataType" : "string",
            "IsValue" : "0"
           },
           {
            "Name" : "GeoName",
            "DataType" : "string",
            "IsValue" : "0"
           },
           {
            "Name" : "TimePeriod",
            "DataType" : "string",
            "IsValue" : "0"
           },
           {
            "Name" : "DataValue",
            "DataType" : "numeric",
            "IsValue" : "1"
           },
           {
            "Name" : "CL_UNIT",
            "DataType" : "string",
            "IsValue" : "0"
           },
           {
            "Name" : "UNIT_MULT",
            "DataType" : "numeric",
            "IsValue" : "0"
           }
          ],
         "Data" :
          [
           {
            "Code" : "SARPP-1",
            "GeoFips" : "00000",
            "GeoName" : "United States",
            "TimePeriod" : "2018",
            "CL_UNIT" : "Index",
            "UNIT_MULT" : "0",
            "DataValue" : "100.000"
           },
           {
            "Code" : "SARPP-1",
            "GeoFips" : "01000",
            "GeoName" : "Alabama",
            "TimePeriod" : "2018",
            "CL_UNIT" : "Index",
            "UNIT_MULT" : "0",
            "DataValue" : "87.968"
           },
    (Note, this is just a portion, the full JSON is much longer.)

    And when I uncomment the flatten option, this is the result:

    Code:
    Response from server:
            BEAAPI:Request:RequestParam:1:ParameterName = GEOFIPS
            BEAAPI:Request:RequestParam:1:ParameterValue = STATE
            BEAAPI:Request:RequestParam:2:ParameterName = RESULTFORMAT
            BEAAPI:Request:RequestParam:2:ParameterValue = JSON
            BEAAPI:Request:RequestParam:3:ParameterName = LINECODE
            BEAAPI:Request:RequestParam:3:ParameterValue = 1
            BEAAPI:Request:RequestParam:4:ParameterName = USERID
            BEAAPI:Request:RequestParam:4:ParameterValue = 402D3B19-A3ED-4D30-AAFE-363B1F7F0115
            BEAAPI:Request:RequestParam:5:ParameterName = TABLENAME
            BEAAPI:Request:RequestParam:5:ParameterValue = SARPP
            BEAAPI:Request:RequestParam:6:ParameterName = DATASETNAME
            BEAAPI:Request:RequestParam:6:ParameterValue = REGIONAL
            BEAAPI:Request:RequestParam:7:ParameterName = YEAR
            BEAAPI:Request:RequestParam:7:ParameterValue = 2018
            BEAAPI:Request:RequestParam:8:ParameterName = METHOD
            BEAAPI:Request:RequestParam:8:ParameterValue = GETDATA
            BEAAPI:Results:Statistic = RPPs: All items
            BEAAPI:Results:UnitOfMeasure = Index
            BEAAPI:Results:PublicTable = SARPP Regional Price Parities by state
            BEAAPI:Results:UTCProductionTime = 2022-07-18T14:07:57.947
            BEAAPI:Results:NoteRef =  
            BEAAPI:Results:Dimensions:1:Name = Code
            BEAAPI:Results:Dimensions:1:DataType = string
            BEAAPI:Results:Dimensions:1:IsValue = 0
            BEAAPI:Results:Dimensions:2:Name = GeoFips
            BEAAPI:Results:Dimensions:2:DataType = string
            BEAAPI:Results:Dimensions:2:IsValue = 0
            BEAAPI:Results:Dimensions:3:Name = GeoName
            BEAAPI:Results:Dimensions:3:DataType = string
            BEAAPI:Results:Dimensions:3:IsValue = 0
            BEAAPI:Results:Dimensions:4:Name = TimePeriod
            BEAAPI:Results:Dimensions:4:DataType = string
            BEAAPI:Results:Dimensions:4:IsValue = 0
            BEAAPI:Results:Dimensions:5:Name = DataValue
            BEAAPI:Results:Dimensions:5:DataType = numeric
            BEAAPI:Results:Dimensions:5:IsValue = 1
            BEAAPI:Results:Dimensions:6:Name = CL_UNIT
            BEAAPI:Results:Dimensions:6:DataType = string
            BEAAPI:Results:Dimensions:6:IsValue = 0
            BEAAPI:Results:Dimensions:7:Name = UNIT_MULT
            BEAAPI:Results:Dimensions:7:DataType = numeric
            BEAAPI:Results:Dimensions:7:IsValue = 0
            BEAAPI:Results:Data:1:Code = SARPP-1
            BEAAPI:Results:Data:1:GeoFips = 00000
            BEAAPI:Results:Data:1:GeoName = United States
            BEAAPI:Results:Data:1:TimePeriod = 2018
            BEAAPI:Results:Data:1:CL_UNIT = Index
            BEAAPI:Results:Data:1:UNIT_MULT = 0
            BEAAPI:Results:Data:1:DataValue = 100.000
            BEAAPI:Results:Data:2:Code = SARPP-1
            BEAAPI:Results:Data:2:GeoFips = 01000
            BEAAPI:Results:Data:2:GeoName = Alabama
            BEAAPI:Results:Data:2:TimePeriod = 2018
            BEAAPI:Results:Data:2:CL_UNIT = Index
            BEAAPI:Results:Data:2:UNIT_MULT = 0
            BEAAPI:Results:Data:2:DataValue = 87.968
            BEAAPI:Results:Data:3:Code = SARPP-1
            BEAAPI:Results:Data:3:GeoFips = 02000
            BEAAPI:Results:Data:3:GeoName = Alaska
            BEAAPI:Results:Data:3:TimePeriod = 2018
            BEAAPI:Results:Data:3:CL_UNIT = Index
            BEAAPI:Results:Data:3:UNIT_MULT = 0
            BEAAPI:Results:Data:3:DataValue = 105.169
            BEAAPI:Results:Data:4:Code = SARPP-1
            BEAAPI:Results:Data:4:GeoFips = 04000
            BEAAPI:Results:Data:4:GeoName = Arizona
            BEAAPI:Results:Data:4:TimePeriod = 2018
            BEAAPI:Results:Data:4:CL_UNIT = Index
            BEAAPI:Results:Data:4:UNIT_MULT = 0
            BEAAPI:Results:Data:4:DataValue = 97.638
            BEAAPI:Results:Data:5:Code = SARPP-1
            BEAAPI:Results:Data:5:GeoFips = 05000
            BEAAPI:Results:Data:5:GeoName = Arkansas
            BEAAPI:Results:Data:5:TimePeriod = 2018
            BEAAPI:Results:Data:5:CL_UNIT = Index
            BEAAPI:Results:Data:5:UNIT_MULT = 0
            BEAAPI:Results:Data:5:DataValue = 88.360
            BEAAPI:Results:Data:6:Code = SARPP-1
            BEAAPI:Results:Data:6:GeoFips = 06000
            BEAAPI:Results:Data:6:GeoName = California
            BEAAPI:Results:Data:6:TimePeriod = 2018
            BEAAPI:Results:Data:6:CL_UNIT = Index
            BEAAPI:Results:Data:6:UNIT_MULT = 0
            BEAAPI:Results:Data:6:DataValue = 110.755
            BEAAPI:Results:Data:7:Code = SARPP-1
            BEAAPI:Results:Data:7:GeoFips = 08000
            BEAAPI:Results:Data:7:GeoName = Colorado
            BEAAPI:Results:Data:7:TimePeriod = 2018
            BEAAPI:Results:Data:7:CL_UNIT = Index
            BEAAPI:Results:Data:7:UNIT_MULT = 0
            BEAAPI:Results:Data:7:DataValue = 99.749
    To parse this, I have tried a number of variations on this code:

    Code:
    insheetjson results data using "`url'", table("BEAAPI") col("Results:Data") replace offset(0)
    but I just don't see in the documentation how to navigate down to the Data level of the JSON to extract the data points I'm looking for at that level (which contain the RPP values for each state). Any tips much appreciated!


  • #2

    CJ, try

    Code:
     
    
    . insheetjson using "`url'", printonly tableselector(BEAAPI Results Data) column("Code" "GeoFips" "GeoName" "TimePeriod" "CL_UNIT" "UNIT_MULT" "DataValue")
    
                    1            2            3            4            5            6            7
    
        +--------------------------------------------------------------------------------------------+
      1 |        Code      GeoFips      GeoName   TimePeriod      CL_UNIT    UNIT_MULT    DataValue  |
        +--------------------------------------------------------------------------------------------+
                       1               2               3               4               5               6               7
        +-----------------------------------------------------------------------------------------------------------------+
      1 |        SARPP-1           00000   United States            2018           Index               0         100.000  |
      2 |        SARPP-1           01000         Alabama            2018           Index               0          87.968  |
        +-----------------------------------------------------------------------------------------------------------------+
    Note, you need to create the variables in the dataset prior to populate it.

    Comment


    • #3
      Thanks so much, Luis! This seems to get me very close! My last challenge is that when I run:

      Code:
      set obs 1 
      
      gen Code = ""
      gen GeoFips = "" 
      gen GeoName = ""
      gen TimePeriod = "" 
      gen CL_UNIT = ""
      gen UNIT_MULT = ""
      gen DataValue = ""
      
      insheetjson Code GeoFips GeoName TimePeriod CL_UNIT UNIT_MULT DataValue using "`url'", tableselector(BEAAPI Results Data) column("Code" "GeoFips" "GeoName" "TimePeriod" "CL_UNIT" "UNIT_MULT" "DataValue") replace
      My results look like the below, which seems to just be a splitting of the Code field across all of the variables:

      Code:
      . list in 1/3
      
           +---------------------------------------------------------------------+
           | Code   GeoFips   GeoName   TimePe~d   CL_UNIT   UNIT_M~T   DataVa~e |
           |---------------------------------------------------------------------|
        1. |    S         0         U          2         I          0          1 |
        2. |    S         0         A          2         I          0          8 |
        3. |    S         0         A          2         I          0          1 |
      Any thoughts?

      Comment


      • #4
        Hi CJ, insheetjson demands some pre-processing, prior populating dataset

        Code:
        . clear
        
        . local column "Code GeoFips GeoName TimePeriod CL_UNIT UNIT_MULT DataValue"
        
        . tokenize `column'
        
        . foreach v in `column'{
          2. gen str240 `v'=""
          3. }
        
        . insheetjson using "sample.json", tableselector(BEAAPI Results Data) column(Code GeoFips GeoName TimePeriod CL_UNIT UNIT_MULT DataValue) replace
         2 observations updated/written.
        
        . compress
          variable Code was str240 now str7
          variable GeoFips was str240 now str5
          variable GeoName was str240 now str13
          variable TimePeriod was str240 now str4
          variable CL_UNIT was str240 now str5
          variable UNIT_MULT was str240 now str1
          variable DataValue was str240 now str7
          (3,276 bytes saved)
        
        . list
        
             +------------------------------------------------------------------------------+
             |    Code   GeoFips         GeoName   TimePe~d   CL_UNIT   UNIT_M~T   DataVa~e |
             |------------------------------------------------------------------------------|
          1. | SARPP-1     00000   United States       2018     Index          0    100.000 |
          2. | SARPP-1     01000         Alabama       2018     Index          0     87.968 |
             +------------------------------------------------------------------------------+

        Comment

        Working...
        X