
  • Calculate a variable based on rolling windows in panel

    Hi,

    I have an unbalanced panel and I need to compute variable B in year t as the variance of variable A in t-1, t-2, and t-3. In other words, I need to calculate a variable based on rolling windows of the past three years.

    I believe my starting point should be the following:

    tsset ID year
    by ID: egen B = var(A)

    However, I need to ask Stata to take into account the values of variable A only for t-1, t-2, and t-3. Any help would be much appreciated.

    Thanks,

    Giuseppe

  • #2
    You should explain the provenance of user-written commands. There is no egen var() function in built-in Stata; you appear to be using the function available via the egenmore package, downloadable from SSC. In any case, I am not sure I understand what you want to do, but how about exploiting the fact that you have tsset data:

    ge L1B = L1.B
    ge L2B = L2.B
    ge L3B = L3.B


    This will create variables containing the lagged variances. If there are no missing data, then the variance for the pooled data (pooling from the 3 lags) can be computed from the variances of each of L1B, L2B, and L3B, can't it? The computation is more complicated with an unbalanced panel because calculating the total variance from the component variances needs to take account of the correct number of observations available at each lag. But deft use of the lag operator with generate should allow you to create variables containing those numbers.
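
    One way to read the lag-operator suggestion is sketched below. It is an assumption about what might be wanted, not what the post above computes: the lag operators are applied to A itself, and the sample variance of the three previous values is built by hand. The variable names (id, year, metric, mean3, B) and the data values are illustrative only. B is set to missing whenever any of the three lags is missing, which is just one possible way of handling gaps in an unbalanced panel.

    ```stata
    clear
    input id year metric
    1 1981 23
    1 1982 34
    1 1983 235
    1 1984 4663
    1 1985 4562
    2 1986 34
    2 1987 455
    2 1988 235
    2 1989 2453
    2 1990 744
    end

    xtset id year

    * mean of the three previous years (missing if any lag is missing)
    gen double mean3 = (L1.metric + L2.metric + L3.metric) / 3

    * sample variance (divisor n - 1 = 2) of the three previous years
    gen double B = ((L1.metric - mean3)^2 + (L2.metric - mean3)^2 + (L3.metric - mean3)^2) / 2

    list, sepby(id)
    ```

    Because the lags are taken within panels after xtset, the first three years of each id have no complete window and B stays missing there.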


    • #3
      Have you considered -rolling- ? Something like:

      Code:
      clear all
      set more off
      
      *----- example data -----
      
      input ///
      id metric
      1 23
      1 34
      1 235
      1 4663
      1 4562
      1 366
      1 485
      2 34
      2 455
      2 235
      2 2453
      2 744
      2 635
      2 646
      end
      
      bysort id: gen year = cond(id==1, 1980 + _n, 1985 + _n)
      
      order id year
      list, sepby(id)
      
      *----- what you want? -----
      
      xtset id year
      
      rolling varian = r(Var), window(3) clear: summarize metric
      
      list, sepby(id)
      See -help rolling- for details.
      You should:

      1. Read the FAQ carefully.

      2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

      3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

      4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.


      • #4
        If you need to keep the original data, then you can -merge- the results of -rolling- back to the original dataset (I did it the way you specified in your original post):

        Code:
        clear all
        set more off
        
        *----- example data -----
        
        input ///
        id metric
        1 23
        1 34
        1 235
        1 4663
        1 4562
        1 366
        1 485
        2 34
        2 455
        2 235
        2 2453
        2 744
        2 635
        2 646
        end
        
        bysort id: gen year = cond(id==1, 1980 + _n, 1985 + _n)
        
        order id year
        list, sepby(id)
        
        tempfile orig
        save "`orig'"
        
        *----- what you want? -----
        
        xtset id year
        
        rolling varian = r(Var), window(3) clear: summarize metric
        
        * the window ending in year t-1 (years t-3 to t-1) supplies the value for year t
        gen year = end + 1
        
        merge 1:1 id year using "`orig'"
        
        sort id year
        order id year
        list, sepby(id)
        For sure there are more "direct" ways of doing this. This is just one.

        • #5
          Thanks Stephen and Roberto!

          I actually tried the rolling command and it works fine when I calculate the standard deviation. However, it produces missing values when I try to calculate the variance.

          For the variance this is not a big deal, since I can simply compute variance = S.D.^2.

          tsset ID year

          rolling r(sd), window(3) clear: summarize A

          rename _stat_1 B_old

          rename end year

          generate B= B_old^2

          drop B_old

          The problem arises when I want to calculate the kurtosis or the skewness. As with the variance, Stata gives me a new variable containing only missing values. Any ideas? This is the code I use:

          tsset ID year

          rolling r(kurt), window(3) clear: summarize A

          rename _stat_1 B

          rename end year


          • #6
            As the help documents, summarize saves the kurtosis in r(kurtosis). A reference to r(kurt) is not illegal, but that result is not defined, so it evaluates to missing.

            That aside, I can't see very much value in kurtosis calculated from subsamples of 3. When the value is not indeterminate, it appears to be always 1.5. It would be an amusing derivation to show why that is so.
            (EDIT: Results cited in http://www.stata-journal.com/sjpdf.h...iclenum=st0204 give an upper limit of 1.5, but that is only part of the question.)
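
            The derivation can be sketched as follows. It assumes the moment-based definition of kurtosis, m_4 / m_2^2 with divisor n in both moments; the symbols d_i and S are introduced here only for the sketch.

            ```latex
            Let $d_i = x_i - \bar{x}$ for $i = 1, 2, 3$, so that $d_1 + d_2 + d_3 = 0$,
            and let $S = d_1^2 + d_2^2 + d_3^2$.

            \begin{align*}
            0 = (d_1 + d_2 + d_3)^2
              &= S + 2\,(d_1 d_2 + d_2 d_3 + d_3 d_1)
              \;\Longrightarrow\; d_1 d_2 + d_2 d_3 + d_3 d_1 = -\tfrac{S}{2}, \\
            (d_1 d_2 + d_2 d_3 + d_3 d_1)^2
              &= d_1^2 d_2^2 + d_2^2 d_3^2 + d_3^2 d_1^2 + 2\, d_1 d_2 d_3\,(d_1 + d_2 + d_3) \\
              &= d_1^2 d_2^2 + d_2^2 d_3^2 + d_3^2 d_1^2 = \tfrac{S^2}{4}, \\
            d_1^4 + d_2^4 + d_3^4
              &= S^2 - 2\,(d_1^2 d_2^2 + d_2^2 d_3^2 + d_3^2 d_1^2) = \tfrac{S^2}{2}, \\
            \text{kurtosis} = \frac{m_4}{m_2^2}
              &= \frac{\tfrac{1}{3} \cdot \tfrac{S^2}{2}}{\left(\tfrac{S}{3}\right)^2}
               = \frac{S^2 / 6}{S^2 / 9} = \frac{3}{2}.
            \end{align*}
            ```

            The ratio is 0/0 when S = 0, that is, when all three values are identical, which is the indeterminate case.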

            See also Section 18 of the FAQ Advice.

            (You don't give any code showing how you tried to calculate the variance, so diagnosis is difficult, given my limited powers of http://en.wikipedia.org/wiki/Extrasensory_perception But I guess at a similar error.)
            Last edited by Nick Cox; 11 Aug 2014, 06:30.


            • #7
              Originally posted by Nick Cox
              (You don't give any code showing how you tried to calculate the variance, so diagnosis is difficult, given my limited powers of http://en.wikipedia.org/wiki/Extrasensory_perception But I guess at a similar error.)
              Probably Giuseppe is using -r(var)- and not -r(Var)-.
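
              A minimal sketch of why the capitalisation matters, meant to be run from a do-file: names of stored results are case-sensitive, and a reference to an undefined r() result evaluates to missing rather than raising an error. The variable x and its three values are made up for illustration.

              ```stata
              clear
              input x
              1
              2
              4
              end

              quietly summarize x

              return list        // shows the exact names of the stored results
              display r(Var)     // the variance, as stored by summarize
              display r(var)     // not defined, so Stata shows missing (.)
              ```

              Running -return list- after any r-class command is a quick way to check the exact names before using them in -rolling-.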

              • #8
                Thanks so much Nick and Roberto!

                Roberto, I was indeed using -r(var)-. With -r(Var)- it works just fine. Thanks!

                I will now try to use r(kurtosis). Thanks for your hint and the readings, Nick!

                Thanks again!

                Giuseppe
                Last edited by Giuseppe Criaco; 11 Aug 2014, 11:40.


                • #9
                  #6 implies that kurtosis from samples of 3 tells you almost nothing about the data. The only exception is that 3 identical values will give you indeterminate kurtosis.


                  • #10
                    Thanks Nick,

                    I have tried to run the command for 5 years with r(kurtosis). However, it does not work (new variable with all missing values). This is the code I have used:

                    rolling r(kurtosis), window(5) clear: summarize A

                    Could that be because the help for summarize says this about kurtosis:

                    r(kurtosis) kurtosis (detail only)

                    Thanks again!

                    Giuseppe


                    • #11
                      You got it. You must specify the detail option.
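
                      A sketch of the corrected call, assuming the same kind of example data used earlier in the thread (the names id, year, metric, kurt and skew and the data values are illustrative). The detail option goes on summarize, after the colon, because that is the command that stores r(kurtosis) and r(skewness).

                      ```stata
                      clear all
                      input id year metric
                      1 1981 23
                      1 1982 34
                      1 1983 235
                      1 1984 4663
                      1 1985 4562
                      1 1986 366
                      1 1987 485
                      2 1986 34
                      2 1987 455
                      2 1988 235
                      2 1989 2453
                      2 1990 744
                      2 1991 635
                      2 1992 646
                      end

                      xtset id year

                      * detail makes summarize store r(kurtosis) and r(skewness)
                      rolling kurt = r(kurtosis) skew = r(skewness), window(5) clear: summarize metric, detail

                      * the window ending in year t-1 supplies the value for year t
                      gen year = end + 1

                      list, sepby(id)
                      ```

                      Naming the results (kurt = ..., skew = ...) avoids the later rename of _stat_1 and allows both statistics to be collected in a single pass.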
