
  • new variable for each panel unit

    Dear Stata forum colleagues,
    I have learnt a lot by following older threads before, but this is my first personal post.
    I have a panel data set of Indian state-level data (the panel unit being each state) for 1990-2015. One of my variables is GDP per capita for each state over this period. I would like to create a variable 'Initial GDP per capita' that holds the 1990 value of GDP per capita for each state and stays constant at that value (i.e. the 1990 value) across all years for that state.
    Additionally, if the 1990 value of GDP per capita is missing, I would like Stata to use the 1991 value (or the next available value).
    Can someone help me write a command for this?
    Many thanks,
    Nishanth

  • #2
    Welcome to the Stata Forum / Statalist.

    The best approach to your query is to share some data. It can be a toy example or just a fraction of the real data. For this, you may use -dataex- and share the data under code delimiters.
    Best regards,

    Marcos


    • #3
      See discussion within https://www.stata-journal.com/articl...article=dm0055


      • #4
        Dear Marcos,
        Here is my sample for one particular state. I want the value of percapitaNDP in 1992 to be stored in a new variable, 'initial per capita GDP', repeated for all years.
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int Year float ln_productivityofinvestment1 double percapitaNDP
        1991          .                 .
        1992  -10.97565 30876.13192121823
        1993  -11.03196 29501.60871010496
        1994 -11.010284 28188.27561382569
        1995 -10.890392 29309.57298519551
        1996  -11.32392 30677.93587907054
        1997  -9.693542 32361.78244014454
        1998 -11.398554 31134.05684369555
        1999 -11.829152 34756.41750442585
        2000 -10.897416 35900.52092402692
        2001 -11.089593 38569.73058889104
        2002 -10.518782 40056.76195405945
        2003 -10.838675 40352.30652898338
        2004 -10.756035 43794.12090939667
        2005 -11.007873 46456.34920634921
        2006  -11.11849 48639.66697790228
        2007 -11.201907 53324.85216308746
        2008 -11.838274 59445.30034235917
        2009 -12.057322 60368.73638344227
        2010  -12.33189 63847.72798008092
        2011  -12.10056 67482.41518829754
        2012  -12.31145             69000
        2013  -13.37815             68865
        2014 -12.048196             72254
        2015 -11.541812             79174
        2016 -11.729563             87217
        2017          .             96374
        2018          .                 .
        end


        • #5
          Dear Nick,
          Thank you for that reference. Did you mean that I should use the command on page 312, i.e. by foreign : egen min_price = min(price/(mpg > 25))?
          I tried this with a command mirroring it for my dataset:
          by State1 : egen InitialNDPpercapita = percapitaNDP(Year=1992)

          However, I get the error:
          unknown egen function percapitaNDP()
          r(133);

          Is there any way around this?


          • #6
            Unfortunately, the data you shared in #4 cannot be taken as a panel data structure.

            However, in #1 there is a clear statement about it.

            What is more, you wish to create a new variable with 1990's GDP, yet there is no year = 1990.

            It is quite confusing.

            In spite of the information in #4 concerning the use of year = 1992 instead of 1990, the above-mentioned remarks still demand consideration.
            Best regards,

            Marcos


            • #7
              My apologies, Marcos.
              Using -dataex-, I extracted only the required variables and the data for one panel unit. Also, I mentioned 1990 in #1, but I actually meant 1992, or more generally the first year for which the per capita GDP variable is available for the state. Please find an extract with a panel data structure below.
              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input str20 State int Year float ln_productivityofinvestment1 double percapitaNDP
              "Andhra Pradesh" 1991          .                 .
              "Andhra Pradesh" 1992  -10.97565 30876.13192121823
              "Andhra Pradesh" 1993  -11.03196 29501.60871010496
              "Andhra Pradesh" 1994 -11.010284 28188.27561382569
              "Andhra Pradesh" 1995 -10.890392 29309.57298519551
              "Andhra Pradesh" 1996  -11.32392 30677.93587907054
              "Andhra Pradesh" 1997  -9.693542 32361.78244014454
              "Andhra Pradesh" 1998 -11.398554 31134.05684369555
              "Andhra Pradesh" 1999 -11.829152 34756.41750442585
              "Andhra Pradesh" 2000 -10.897416 35900.52092402692
              "Andhra Pradesh" 2001 -11.089593 38569.73058889104
              "Andhra Pradesh" 2002 -10.518782 40056.76195405945
              "Andhra Pradesh" 2003 -10.838675 40352.30652898338
              "Andhra Pradesh" 2004 -10.756035 43794.12090939667
              "Andhra Pradesh" 2005 -11.007873 46456.34920634921
              "Andhra Pradesh" 2006  -11.11849 48639.66697790228
              "Andhra Pradesh" 2007 -11.201907 53324.85216308746
              "Andhra Pradesh" 2008 -11.838274 59445.30034235917
              "Andhra Pradesh" 2009 -12.057322 60368.73638344227
              "Andhra Pradesh" 2010  -12.33189 63847.72798008092
              "Andhra Pradesh" 2011  -12.10056 67482.41518829754
              "Andhra Pradesh" 2012  -12.31145             69000
              "Andhra Pradesh" 2013  -13.37815             68865
              "Andhra Pradesh" 2014 -12.048196             72254
              "Andhra Pradesh" 2015 -11.541812             79174
              "Andhra Pradesh" 2016 -11.729563             87217
              "Andhra Pradesh" 2017          .             96374
              "Andhra Pradesh" 2018          .                 .
              "Bihar"          1991          .                 .
              "Bihar"          1992 -10.255058 12218.36028543093
              "Bihar"          1993  -11.14782 11245.31439844638
              "Bihar"          1994  -9.957754 10349.75994861536
              "Bihar"          1995 -11.426145  11266.4821831157
              "Bihar"          1996 -11.496502 9296.722140211623
              "Bihar"          1997  -10.88179 11375.53464223842
              "Bihar"          1998 -10.863697 10564.45697751321
              "Bihar"          1999  -13.38827 10939.32480574755
              "Bihar"          2000  -7.953008 11184.69283877366
              "Bihar"          2001  -8.633653 12669.28393800943
              "Bihar"          2002 -9.1471615 11586.76959481668
              "Bihar"          2003  -7.976213 12870.32231603094
              "Bihar"          2005          . 13090.69130732375
              "Bihar"          2006  -8.057206 12551.44877937486
              "Bihar"          2007  -7.651774 14488.42117271275
              "Bihar"          2008  -7.283406 15002.85192790326
              "Bihar"          2009  -7.487462 17032.45493953913
              "Bihar"          2010  -8.287864 17591.54688569473
              "Bihar"          2011  -8.691915 19998.28884325804
              "Bihar"          2012  -9.184481             21750
              "Bihar"          2013 -10.542852             22201
              "Bihar"          2014  -8.625603             22776
              "Bihar"          2015   -8.93869             23223
              "Bihar"          2016  -9.523295             23987
              "Bihar"          2017          .             25950
              "Bihar"          2018          .                 .
              end

              • #8
                Thanks for the data example.

                Unfortunately, the code in #5 doesn't resemble any code in the paper in #3. If you are going to use egen differently, the syntax still needs to be compatible with that explained in the help.

                But from your example in particular it now seems that you want to use the first non-missing value for each state. It would seem a good idea to record systematically when this occurs. In fact we can turn that round:

                Code:
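                * Year of the first non-missing value in each state
                egen whenfirst = min(cond(!missing(percapitaNDP), Year, .)), by(State)
                
                * Assuming one observation per State and Year, total() just picks up that one value
                egen first = total(cond(Year == whenfirst, percapitaNDP, .)), by(State)
                
                tabdisp State, c(whenfirst first)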
                
                ---------------------------------------
                         State |  whenfirst       first
                ---------------+-----------------------
                Andhra Pradesh |       1992    30876.13
                         Bihar |       1992    12218.36
                ---------------------------------------
                
                PS: In case that looks too awkward, here is another way to do it:


                Code:
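                * Flag missing values so that non-missing observations sort first within each state
                gen ismissing = missing(percapitaNDP)
                
                * After sorting, observation 1 in each state is the earliest non-missing year (if there is one)
                bysort State (ismissing Year) : gen whenfirst = Year[1] if percapitaNDP[1] < .
                
                by State: gen first = percapitaNDP[1]
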
                Yet another way would be using the first() function in egenmore (SSC).
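
                The approaches in this post can be cross-checked against each other. What follows is a sketch only. It assumes the variables State, Year and percapitaNDP from the example in #7, and the names when_a, first_a, first_b, first_c and miss_flag are made up for illustration. One detail worth knowing: egen's total() returns 0, not missing, for a state whose values are all missing, unless the missing option is specified. The last three lines assume that egenmore has been installed from SSC and that its first() function returns the first non-missing value in the current sort order, as its help file describes.

                ```stata
                * Sketch only: assumes State, Year, percapitaNDP as in the -dataex- example in #7.
                * when_a, first_a, first_b, first_c and miss_flag are hypothetical names.

                * Approach 1: egen, with the missing option so that a state with no data
                * at all gets missing rather than 0
                egen when_a = min(cond(!missing(percapitaNDP), Year, .)), by(State)
                egen double first_a = total(cond(Year == when_a, percapitaNDP, .)), by(State) missing

                * Approach 2: sort non-missing values first within each state
                gen byte miss_flag = missing(percapitaNDP)
                bysort State (miss_flag Year): gen double first_b = percapitaNDP[1]

                * The two results should agree in every observation
                assert first_a == first_b

                * Approach 3 (assumption: egenmore is installed, -ssc install egenmore-)
                sort State Year
                egen double first_c = first(percapitaNDP), by(State)
                assert first_c == first_a
                ```

                If an assert fails, Stata stops with an error. A likely cause would be duplicate State and Year combinations, in which case total() adds up more than one value.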

                Last edited by Nick Cox; 04 Jun 2019, 10:42.


                • #9
                  Thanks a ton Nick.
                  That worked perfectly for what I needed.
