  • Generating observations for a variable in an extension of data

    I have extended a data set that previously contained quarterly data from 2001Q1 (represented as, e.g., yq=101) to 2012, with the intervention/treatment in 2009Q3, so that it now also includes data from 1997 to 2000. As shown below, the original data has calculated values for the variable y_growth; I want to fill in the missing observations by calculating y_growth for the added years. Furthermore, l_growth currently represents the change in l from 2001Q1 to 2009Q3. How do I change it to represent the new pre-intervention change from 1997Q1 (yq=971) to 2009Q3 (yq=903), so that l_growth is the % change from 1997Q1 to 2009Q3, with observations persisting until 2012Q4 as in the data below?

    A simplified snapshot of my data:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double naic float(year yq) double l float lnl double Y float(y_growth l_growth)
    311111 1997  973  7861.999938964844  8.969796    8688.2392578125         .          .
    311111 1997  972  7517.999938964844 8.9250555    8688.2392578125         .          .
    311111 1997  971  7473.666717529297  8.919141    8688.2392578125         .          .
    311111 1997  974  7810.999908447266  8.963288    8688.2392578125         .          .
    311111 2001  101 19.290000915527344  2.959587             9734.9         .          .
    311111 2001  103  19.30699920654297 2.9604676             9734.9         .          .
    311111 2001  104 19.410999298095703   2.96584             9734.9         .          .
    311111 2001  102 19.306665420532227 2.9604504             9734.9         .          .
    311111 2008  804  18.94499969482422   2.94154           18582.01  .6464767          .
    311111 2008  801 18.977333068847656  2.943245           18582.01         .          .
    311111 2008  803  18.89433479309082  2.938862           18582.01         .          .
    311111 2008  802 18.920000076293945 2.9402196           18582.01         .          .
    311111 2009  902 19.806333541870117  2.986002          19690.966   .704443          .
    311111 2009  903 19.895999908447266  2.990519          19690.966   .704443  .03093195
    311111 2009  904 20.117000579833984  3.001565          19690.966  .6134558  .04111481
    311111 2009  901 19.645666122436523  2.977857          19690.966   .704443          .
    311111 2010 1004 20.645334243774414 3.0274894          20386.975  .6163735  .14274454
    311111 2010 1003 20.538665771484375  3.022309          20386.975  .6481915  .13019824
    311111 2010 1002 20.454666137695313  3.018211          20386.975  .6481915  .05237126
    311111 2010 1001 20.121999740600586  3.001814          20386.975  .6481915  .04134607
    311111 2011 1104  20.64266586303711   3.02736          20585.563 .52908134   .1369698
    311111 2011 1101 20.581666946411133  3.024401          20585.563  .6260676  .13451052
    311111 2011 1102 20.526334762573242  3.021709          20585.563  .6260676  .11907911
    311111 2011 1103 20.444334030151367  3.017706          20585.563  .6260676  .12992978
    311111 2012 1203  21.57699966430664  3.071628     21993.56640625 .59524155   .1981008
    311111 2012 1201 21.030000686645508   3.04595     21993.56640625 .59524155   .1559856
    311111 2012 1202 21.009666442871094  3.044983     21993.56640625 .59524155  .15448117
    326211 1997  972  17935.66650390625  9.794546    14728.525390625         .          .
    326211 1997  973              17986  9.797349    14728.525390625         .          .
    326211 1997  974 18046.000122070313  9.800679    14728.525390625         .          .
    326211 1997  971 17677.333435058594  9.780039    14728.525390625         .          .
    326211 2001  102  74.57366180419922 4.3117876 13429.300000000001         .          .
    326211 2001  101  76.86166381835938  4.342007 13429.300000000001         .          .
    326211 2001  104   73.6463394165039 4.2992744 13429.300000000001         .          .
    326211 2001  103  73.96266174316406 4.3035603 13429.300000000001         .          .
    326211 2008  802 52.915000915527344  3.968687 16203.001701816542         .          .
    326211 2008  801  52.85100173950195 3.9674766 16203.001701816542         .          .
    326211 2008  804 51.595333099365234  3.943431 16203.001701816542 .18775845          .
    326211 2008  803  52.56766891479492  3.962101 16203.001701816542         .          .
    326211 2009  902  47.66733169555664 3.8642464 14561.600911501433 .08094978          .
    326211 2009  904 45.706668853759766  3.822244 14561.600911501433 .08296967  -.4895434
    326211 2009  903  45.85499954223633  3.825484 14561.600911501433 .08094978  -.5165229
    326211 2009  901  49.75166702270508  3.907044 14561.600911501433 .08094978          .
    326211 2010 1001  44.92266845703125 3.8049426 16175.651189804325  .1880884 -.49861765
    326211 2010 1002 44.925331115722656  3.805002 16175.651189804325  .1880884  -.4942727
    326211 2010 1004 46.357330322265625 3.8363795 16175.651189804325 .14416409  -.3659372
    326211 2010 1003  45.31700134277344  3.813682 16175.651189804325  .1880884  -.3920047
    326211 2011 1103 46.349666595458984  3.836214  20063.16013991512 .35954285  -.3450012
    326211 2011 1104  46.81266784667969  3.846154  20063.16013991512  .3270779  -.3247762
    326211 2011 1102  47.05500030517578  3.851317  20063.16013991512 .35954285  -.3422291
    326211 2011 1101  46.39799880981445 3.8372564  20063.16013991512 .35954285  -.3640623
    326211 2012 1202   47.2773323059082  3.856031     20320.67578125 .33983135 -.28710365
    326211 2012 1201  45.77199935913086 3.8236725     20320.67578125 .33983135 -.33352685
    326211 2012 1203 47.198001861572266 3.8543515     20320.67578125 .33983135 -.25854206
    end

  • #2
    Please provide a clearer explanation of your variables y_growth and l_growth. You state that they represent percentage changes, but the data seem to be clearly and radically inconsistent with that. For example, looking at 2009q1 vs 2008q4, the change in Y is 19690.966 - 18582.01 = 1108.956, which is a relative change of 5.97%, which I cannot relate to any value of y_growth.
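
    As a quick check of that arithmetic, the relative change can be reproduced directly in Stata. The two figures are the Y values for 2008 and 2009 copied from the posted example (naic 311111); nothing else is assumed.

    Code:
    * relative change in Y from 2008 to 2009, values copied from the example data
    display (19690.966 - 18582.01) / 18582.01

    The result is about 0.0597, i.e. 5.97%.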

    Please respond with a clear explanation of how y_growth and l_growth are to be calculated from Y and l, respectively.

    By the way, your yq variable to indicate quarters is going to get in the way of working with this data as longitudinal. You need to calculate a proper Stata internal format quarterly date. In this instance, I would do that by:

    Code:
     
    by naic year (yq), sort: gen qdate = yq(year, _n)
    format qdate %tq
    drop yq
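
    Note that the _n approach above relies on every year present for a naic starting at Q1 with no missing quarters. If that might not hold, a possible alternative, meant to be run in place of the commands above, is sketched here. It assumes that the last digit of the original yq always encodes the quarter (971 = 1997Q1, 1004 = 2010Q4), as it does in the posted example, and that each naic has at most one observation per quarter.

    Code:
    * quarter taken from the last digit of the original yq variable (assumption)
    gen qdate = yq(year, mod(yq, 10))
    format qdate %tq
    drop yq
    * declare the panel structure so that time-series operators such as L. and D. work
    xtset naic qdate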



    • #3
      Sorry, my mistake. For now I just need l_growth, which is the average growth rate of the log of l (lnl) over the pre-intervention period (from 1997Q1 to 2009Q3).

      Thanks



      • #4
        I'm sorry, but I still don't understand what you want. I have tried several interpretations of "average growth rate of the log of l over the preintervention period" but none of them come even close to resembling the non-missing values you show in l_growth. Please provide a detailed explanation of how you calculated the non-missing values. Use mathematical formulas.



        • #5
          Clyde, so sorry for the misunderstanding. I want to get rid of the existing values of l_growth altogether, because they represent the average growth rate of lnl since 2001Q1, which was the start of the pre-intervention period in a dataset that I did not create (hence I do not know the exact mathematical formula behind them). I want to create a new variable l_growth that represents the average growth rate of lnl over the extended data's new pre-intervention period, from 1997Q1 to 2009Q3, so it will take different values from the non-missing ones you refer to.

          Sorry for the confusion. The first part of my question was about filling in missing values for a variable I still wanted, but I no longer need an answer to that. I just need to generate a new l_growth for the extended pre-intervention period, and will drop the l_growth variable shown above.

          I am aware that I am essentially also asking for the mathematical formula to generate this, so if I should seek help elsewhere, please let me know.

          Hope this makes sense now,
          Thanks
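
          One possible reading of this request, offered only as a sketch: take the average quarterly change in lnl between 1997Q1 and 2009Q3 and carry it on every observation of the same naic. This assumes the qdate variable suggested in #2 has been created; l_growth_new, lnl_start and lnl_end are hypothetical names, and the formula itself is an assumption, not the one used in the original dataset.

          Code:
          * lnl at the start and end of the pre-intervention window, copied to all rows of each naic
          by naic, sort: egen lnl_start = max(cond(qdate == tq(1997q1), lnl, .))
          by naic: egen lnl_end = max(cond(qdate == tq(2009q3), lnl, .))
          * average change in lnl per quarter over the 50 quarters between the two dates
          gen l_growth_new = (lnl_end - lnl_start) / (tq(2009q3) - tq(1997q1))
          drop lnl_start lnl_end

          In the example data the 1997 values of l are a few hundred times larger than those from 2001 onward, so the units would need to be reconciled before a result like this is meaningful.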



          • #6
            Clyde,
            I have managed to figure out where I was going wrong.
            Thanks for your help
