  • New panel variable based on generated data?

    I have panel data on income by occupation and year. Using "xtset occupation year" and then "xtline income" I get a nice graph. Now I egen a new variable, medwage, for the median wage by year. medwage doesn't have an occupation. I want to plot the incomes including medwage. I can do the following:

    xtline income, overlay addplot( line medwage year if occupation==1 )

    where the 1 is an arbitrary occupation since medwage repeats on each panel. That works ok but it seems like there should be a simple way to take medwage and insert it into the panel with a new occupation code so I can just run

    xtline income

    and have everything run including the medwage. If there isn't a simple way I will stick with what I am doing. I imagine this problem has been dealt with before but I probably searched on the wrong keywords.

  • #2
    Something like this:
    Code:
    // SET UP TOY DATA
    clear*
    set obs 5
    gen occ = "OCC " + string(_n)
    encode occ, gen(occupation)
    drop occ
    expand 10
    by occupation, sort: gen year = 2000 + _n
    set seed 1234
    gen wage = rpoisson(60000)
    
    // GRAPH WITHOUT MEDIAN
    xtset occupation year
    xtline wage, overlay name(wage_by_time, replace)
    
    // ADD A NEW PSEUDO-OCCUPATION (PANEL) HOLDING THE MEDIAN WAGE FOR EACH YEAR
    tempfile copy
    save `copy'
    collapse (p50) wage, by(year)
    gen occupation = 9999
    label define occupation 9999 "MEDIAN", add
    label values occupation occupation
    append using `copy'
    
    // RE-DRAW GRAPH WITH MEDIAN
    xtline wage, overlay name(with_median, replace)
    I have made some assumptions about the nature of your data, and you may have to adapt this code to your actual data.

    I strongly urge you not to save the resulting modified data set. It is all too easy to forget that it contains both individual observations and summary statistics. If you re-use this data set, you may perform calculations that include the median as if it were an additional individual observation, with erroneous results.
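
    One way to follow this advice, sketched here under the same assumptions as the toy data above (variables named occupation, year, and wage, with a value label called occupation), is to wrap the median step in -preserve- and -restore-. The graph stays in memory, but the median rows disappear again once -restore- runs. The -xtset- line inside the block is a precaution, added in case the panel settings need to be declared again after -append-.

    ```stata
    // Sketch only: variable and label names are those of the toy data above
    preserve                                  // snapshot the original data
    tempfile copy
    save `copy'
    collapse (p50) wage, by(year)             // one median wage per year
    gen occupation = 9999                     // arbitrary code for the pseudo-occupation
    label define occupation 9999 "MEDIAN", add
    label values occupation occupation
    append using `copy'
    xtset occupation year                     // precautionary re-declaration
    xtline wage, overlay name(with_median, replace)
    restore                                   // back to the original data, without the median rows
    ```

    Because -preserve- and -tempfile- are both temporary, nothing from this block is written to a permanent file unless you add a -save- command yourself.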

    In the future, when asking for help with code, it is more efficient if you show an example of your data. The correct solution often depends on the exact layout of the data and sometimes on metadata as well. To show data examples, please use the -dataex- command. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example and try out their code on it, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.
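
    As a minimal illustration, assuming the variables are named occupation, year, and income as in the original question, a call along these lines produces an example that can be pasted into a post between code delimiters. The choice of 20 observations is arbitrary.

    ```stata
    // Variable names are taken from the question; adjust them to your data
    // Show the first 20 observations in a form others can paste into Stata
    dataex occupation year income in 1/20
    ```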



    • #3
      Save and append. Got it. Thanks.
