  • Problem figuring out adjusted gross income (AGI) by zip code using AGI classes

    Income data by zip code for 2008 is available free from the IRS.

    A00100 is the Adjusted Gross Income (AGI), and agi_class is the size class of the adjusted gross income. It ranges from 1 to 7:
    1 = 'Under $10,000'
    2 = '$10,000 under $25,000'
    3 = '$25,000 under $50,000'
    4 = '$50,000 under $75,000'
    5 = '$75,000 under $100,000'
    6 = '$100,000 under $200,000'
    7 = '$200,000 or more'

    A00100 | zipcode | agi_class | Number of Returns
    -954234 | 10021 | 1 | 3589
    43243455 | 10021 | 2 | 2521
    149940475 | 10021 | 3 | 3939
    243853640 | 10021 | 4 | 3936
    262995399 | 10021 | 5 | 3025
    751195421 | 10021 | 6 | 5333
    10677437299 | 10021 | 7 | 7477


    I need to come up with the average adjusted gross income for each zip code (a single value per zip code). How can I do this in Stata? Thanks!
    Last edited by Josh Weinstein; 28 Jan 2018, 15:41.

  • #2
    I'm not sure I understand the data you show. Would I be correct in interpreting it as follows:

    A00100 gives the total adjusted gross income for all returns filed from that zipcode by people in that agi class, with the number of such returns given by "Number of Returns"?

    Assuming that's right, your first step will be to import this data into Stata. It is clearly not in Stata now, because "Number of Returns" is not a legal Stata variable name. I will leave that step to you. It looks like -import delimited- would be the command for that part.

    Once it's in Stata, I'll assume that the number of returns field in that file has been given a legal variable name, such as n_returns.
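
    For readers who want a starting point for that import step, here is a minimal sketch. The file name and the name Stata gives the "Number of Returns" column are assumptions, not something taken from the IRS file, so check both against your own download.

    ```stata
    * Hypothetical file name: replace with the path to the file you downloaded.
    * case(preserve) keeps the upper-case name A00100; by default
    * -import delimited- converts variable names to lower case.
    import delimited using "zipcode_agi_2008.csv", case(preserve) clear

    * Check what the columns were actually named and how they were stored.
    describe

    * Assumption: the "Number of Returns" header came in as NumberofReturns.
    * Use whatever name -describe- actually shows.
    rename NumberofReturns n_returns
    ```

    If the file uses a delimiter other than a comma, the delimiters() option of -import delimited- lets you specify it.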

    Then, to get the average AGI for each zip code, you want to aggregate up the total of A00100 and the total of n_returns, and divide the former by the latter:

    Code:
    collapse (sum) A00100 n_returns, by(zipcode)
    gen avg_agi = A00100/n_returns
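
    As a check on this approach, here is a self-contained sketch that enters the seven rows listed in the original post for zip code 10021 and applies the same two commands. The name n_returns is the assumed variable name from above. Storing A00100 and avg_agi as double is an addition on my part: the class 7 total is too large for a long, and a float would not hold it exactly.

    ```stata
    clear
    * Rows copied from the listing in the original post.
    input double A00100 long zipcode byte agi_class long n_returns
    -954234     10021 1 3589
    43243455    10021 2 2521
    149940475   10021 3 3939
    243853640   10021 4 3936
    262995399   10021 5 3025
    751195421   10021 6 5333
    10677437299 10021 7 7477
    end

    * Total AGI and total number of returns within each zip code.
    collapse (sum) A00100 n_returns, by(zipcode)

    * Average AGI per return = total AGI / total returns.
    gen double avg_agi = A00100/n_returns
    format avg_agi %12.2fc
    list, noobs
    ```

    For these rows the totals are 12,127,711,455 in AGI over 29,820 returns, so avg_agi should come out to roughly 406,697 per return. Note that this is the average per return, not the simple mean of the seven class-level averages.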

    For future posts: it is premature to ask for help with code when you don't even have your data in Stata. The solution to many problems depends on details of the Stata data set that are not conveyed by listings of text files, or even by -list- output from Stata. (And whatever you do, never use a screenshot to show data.) Once you have your data in Stata, it is appropriate to ask for help with code, and requests for help with code should always be accompanied by an example of the data. (Explanations of the data in words are almost never sufficient.) The most helpful way to show example data from a Stata dataset is with the -dataex- command.

    Please read the Forum FAQ for excellent advice on how to get the most out of your Statalist experience. Pay particular attention to #12, which explains the use of -dataex-.
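
    A minimal sketch of what that could look like for this data set, assuming the variable names used above (n_returns in particular is an assumed name):

    ```stata
    * Only needed if -dataex- is not already installed with your Stata.
    * ssc install dataex

    * Show the first seven observations of the variables relevant to the question.
    dataex A00100 zipcode agi_class n_returns in 1/7
    ```

    The output can then be copied from the Results window and pasted directly into a forum post.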

  • #3
    I am a complete newbie to all things Stata and to posting here, so thank you for your patience and for helping me with this issue. I will definitely be more mindful of proper posting etiquette going forward.
