Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Rolling 5 year correlation coefficient

    Dear Statalisters,

    My panel data is organized by industry , firm, year, quarter. I present the first 40 observations below. I want to compute the correlation coefficient on a rolling 5-year(20 quarter) basis for each firm, then take the mean within each industry year. I only used pwcorr before, but never on a rolling basis.

    Could you help?

    gvkey: firm identifier
    sic: industry classification
    fyear: fiscal year
    qtr: quarter 1,2,3,4
    correlation to be computed between x1 and x2


    clear
    input int gvkey byte sic int fyear byte qtr double(x1 x2)
    1000 30 2008 1 16.52 112.59
    1000 30 2008 2 16.16 110.87
    1000 30 2008 3 15.9 102.54
    1000 30 2008 4 16.200001 103.84
    1000 30 2009 1 16.139999 104.49
    1000 30 2009 2 16.120001 104.82
    1000 30 2009 3 16.190001 103.66
    1000 30 2009 4 15.64 102.82
    1000 30 2010 1 15.52 103.38
    1000 30 2010 2 15.52 103.24
    1000 30 2010 3 15.58 102.76
    1000 30 2010 4 15.75 102.95
    1000 30 2011 1 15.75 101.77
    1000 30 2011 2 15.69 101.9
    1000 30 2011 3 15.69 101.17
    1000 30 2011 4 15.38 99.790001
    1000 30 2012 1 15.55 98.779999
    1000 30 2012 2 15.58 98.529999
    1000 30 2012 3 15.35 97.480003
    1000 30 2012 4 15.52 95.959999
    1001 30 2008 1 185.5 111.1
    1001 30 2008 2 183.56 108.38
    1001 30 2008 3 179.7 102.08
    1001 30 2008 4 185.51 103.42
    1001 30 2009 1 185.74001 104.15
    1001 30 2009 2 185.17999 103.41
    1001 30 2009 3 184.96001 101.6
    1001 30 2009 4 179.51 102.43
    1001 30 2010 1 180.97 102.63
    1001 30 2010 2 180.23 102.51
    1001 30 2010 3 179.19 102.27
    1001 30 2010 4 181.07001 101.77
    1001 30 2011 1 182.23 100.34
    1001 30 2011 2 180.32001 101.16
    1001 30 2011 3 180.7 100.2
    1001 30 2011 4 177.01 97.129997
    1001 30 2012 1 176.02 97.669998
    1001 30 2012 2 173.76 97.760002
    1001 30 2012 3 171.86 96.809998
    1001 30 2012 4 173.02 95.779999
    . . . . . .
    end
    [/CODE]


    Regards,
    Rochelle



  • #2
    First, you need to tell us what the 20-quarter rolling window looks like:for each observation do you want a window that starts with that quarter, or ends there, or is centered there, or something else? In any case, since you provide only 20 observations for each gvkey in your example, your rolling correlation will only be defined for a single observation in each gvkey, so there is nothing much to take means of.

    Assuming you want the window to start with the quarter in the observation, and assuming that your real data contains many more observations per gvkey than you show here, then I think you want something like this:

    Code:
     gen long quarterly_date = yq(fyear, qtr) // CREATE A STATA QUARTERLY DATE
     
     tsset gvkey quarterly_date // -tsset- THE DATA SO YOU CAN USE -rolling-
     
     tempfile results // TO HOLD THE RESULTS FROM -rolling-
     
     rolling correlation=r(rho), window(20) saving(`results'): corr x1 x2
     clonevar start = quarterly_date
     merge 1:1 gvkey start using `results'
     by gvkey fyear, sort: egen mean_correlation = mean(correlation)
    but you may have to play with this a little to get exactly what you want.

    Comment


    • #3
      Thank you Clyde !

      My real data has much more observations. I added some data points below. using the new data, the first correlation will be calculated for (Q1 2008-Q4 2012) and record in the row of 2013 quarter 1 , then rolling 1 quarter forward, the second correlation using Q22008-Q1 2013 will be calculated for 2013 quarter 2. The mean will be taken for firms in the same SIC group and same quarter, my data shown has 2 firms with same SIC.

      Please let me know if you have other questions.


      clear
      input int gvkey byte sic int fyear byte qtr double(x1 x2)
      1000 30 2008 1 16.52 112.59
      1000 30 2008 2 16.16 110.87
      1000 30 2008 3 15.9 102.54
      1000 30 2008 4 16.200001 103.84
      1000 30 2009 1 16.139999 104.49
      1000 30 2009 2 16.120001 104.82
      1000 30 2009 3 16.190001 103.66
      1000 30 2009 4 15.64 102.82
      1000 30 2010 1 15.52 103.38
      1000 30 2010 2 15.52 103.24
      1000 30 2010 3 15.58 102.76
      1000 30 2010 4 15.75 102.95
      1000 30 2011 1 15.75 101.77
      1000 30 2011 2 15.69 101.9
      1000 30 2011 3 15.69 101.17
      1000 30 2011 4 15.38 99.790001
      1000 30 2012 1 15.55 98.779999
      1000 30 2012 2 15.58 98.529999
      1000 30 2012 3 15.35 97.480003
      1000 30 2012 4 15.52 95.959999
      1000 30 2013 1 15.98 103.66
      1000 30 2013 2 16.030001 102.82
      1000 30 2013 3 15.72 103.38
      1000 30 2013 4 15.55 103.24
      1001 30 2008 1 185.5 111.1
      1001 30 2008 2 183.56 108.38
      1001 30 2008 3 179.7 102.08
      1001 30 2008 4 185.51 103.42
      1001 30 2009 1 185.74001 104.15
      1001 30 2009 2 185.17999 103.41
      1001 30 2009 3 184.96001 101.6
      1001 30 2009 4 179.51 102.43
      1001 30 2010 1 180.97 102.63
      1001 30 2010 2 180.23 102.51
      1001 30 2010 3 179.19 102.27
      1001 30 2010 4 181.07001 101.77
      1001 30 2011 1 182.23 100.34
      1001 30 2011 2 180.32001 101.16
      1001 30 2011 3 180.7 100.2
      1001 30 2011 4 177.01 97.129997
      1001 30 2012 1 176.02 97.669998
      1001 30 2012 2 173.76 97.760002
      1001 30 2012 3 171.86 96.809998
      1001 30 2012 4 173.02 95.779999
      1001 30 2013 1 178.01 101.77
      1001 30 2013 2 181.7 101.9
      1001 30 2013 3 180.55 101.17
      1001 30 2013 4 186 99.790001
      end


      Best,
      Rochelle

      Comment


      • #4
        OK, that's more usable. The code I gave in #2 works with this data, with just some modification to line up the correlations in the way you describe in #3:

        Code:
         gen long quarterly_date = yq(fyear, qtr) // CREATE A STATA QUARTERLY DATE
         
         tsset gvkey quarterly_date // -tsset- THE DATA SO YOU CAN USE -rolling-
         
         tempfile results // TO HOLD THE RESULTS FROM -rolling-
         
         rolling correlation=r(rho), window(20) saving(`results'): corr x1 x2
        
        //  ALIGN THE CORRELATIONS WITH THE ORIGINAL DATA IN THE DESIRED WAY
         gen end = quarterly_date -1
         merge 1:1 gvkey end using `results'
         sort gvkey fyear quarter
         by gvkey fyear: egen mean_correlation = mean(correlation)

        Comment


        • #5
          I benefited a great deal from your post. Thank you Clyde very much !

          Best,
          Rochelle

          Comment


          • #6
            Dear Statalisters,

            In the earlier posts, Clyde has helped me solve my initial request.

            Further literature review suggests that I need to impose one restriction :

            I am still computing the correlation using past 20 quarters and rolling , However, my data for x1 or x2 variables could have missing values, my new restriction is, x1, and x2 must have at least 10 non-missing out of the 20 prior quarters .

            my thought is
            1. create a counter of x1 and x2 when they are non-missing,
            2. apply Clyde code as in #4
            3. reset correlation to missing if either counter is less than 10.

            am I on the right track?

            to set the counter, should

            Code:
            gen countX1=(x1!=.)
            bysort gvkey quarterly_date: egen sum_countx1=_N
            my restriction is sum_countx1 is greater or equal to 10.

            Please share your thoughts .

            Regards,
            Rochelle

            Comment


            • #7
              No need to go to that trouble. You just need a small modification to the -rolling- command:

              Code:
              rolling correlation=r(rho), window(20) saving(`results') reject(r(N)<10): corr x1 x2

              Comment


              • #8
                Thank you Clyde very much !

                Regards,
                Rochelle

                Comment


                • #9
                  Dear all,

                  Clyde has helped me out with my initial request - compute the correlation between two variables using past 20 quarters data.

                  When I made initial post my panel data (from 1969 to 2011), now I have obtained 2 additional year data 2012, 2013. Given that I have large number of firms , it took 3 days computer time to run the rolling part ending 2011. Hence, I do not want to rerun the whole thing.

                  My question is : if I just want to add the correlation for data year 2012, 2013 - still rolling past 20 quarters. Would it be correct to do:
                  1. from main data, keep only observations from 2007 to 2013, then run the rolling and save the results
                  2. append the results to my previous results using data from 1969 to 2011.
                  Best,
                  Rochelle


                  Comment


                  • #10
                    That sounds like it would work. Also seems to involve some duplication of work. Twenty quarters is five years, so it seems you would only need 2009-2013, or perhaps 2008-2013 if the five year window does not include the current year.

                    Oops! Sorry--I misread your question You will need to go back to 2007 so you can have the full window for 2012 as well as 2013.

                    So the short answer is that your proposed process looks just right!
                    Last edited by Clyde Schechter; 20 Nov 2015, 11:15.

                    Comment


                    • #11
                      Thank you Clyde. Happy Thanksgiving to you!!!

                      Comment


                      • #12
                        Hi Statalist,

                        I applied the same code as provided by Clyde Schechter but I dont understand why rho are almost equal to -1.

                        Code:
                            tempfile results // TO HOLD THE RESULTS FROM -rolling-
                            rolling correlation=r(rho), window(12) saving(rho): corr y1 y2
                            sum correlation
                        Here is my data

                        Code:
                        * Example generated by -dataex-. To install: ssc install dataex
                        clear
                        input double id float(quarter y1 y2)
                        1020180 164   .262143  .737857
                        1020180 165  .2712289 .7287711
                        1020180 166 .24968784 .7503121
                        1020180 167 .27641085 .7235892
                        1020180 168 .25933954 .7406604
                        1020180 169 .23800395  .761996
                        1020180 170 .24901932 .7509807
                        1020180 171 .28490818 .7150918
                        1020180 172 .26971242 .7302876
                        1020180 173  .2881684 .7118316
                        1020180 174 .29338363 .7066164
                        1020180 175 .26984176 .7301583
                        1020180 176 .26856536 .7314346
                        1020180 177 .27188843 .7281116
                        1020180 178 .25649077 .7435092
                        1020180 179 .25739437 .7426056
                        1020180 180  .2427514 .7572486
                        1020180 181 .24568537 .7543146
                        1020180 182  .2736506 .7263494
                        1020180 183 .27449584 .7255042
                        1020180 184  .2421694 .7578306
                        1020180 185 .26658094 .7334191
                        1020180 186  .2541827 .7458173
                        1020180 187  .2608258 .7391742
                        1020180 188 .26583955 .7341604
                        1020180 189  .2701865 .7298135
                        1020180 190 .14650434 .8534957
                        1020180 191 .27222726 .7277728
                        1020180 192 .24551456 .7544854
                        1020180 193  .2421071 .7578929
                        1020180 194 .23432806  .765672
                        1020180 195 .25482777 .7451723
                        1020180 196 .25590694 .7440931
                        1020180 197  .2690792 .7309208
                        1020180 198  .2262483 .7737517
                        1020180 199  .2649642 .7350358
                        1020180 200 .22126693 .7787331
                        1020180 201 .21736264 .7826374
                        1020180 202 .14526846 .8547316
                        1020180 203  .2922693 .7077307
                        1020180 204  .2383585 .7616415
                        1020180 205  .2209213 .7790787
                        1020180 206 .21813205 .7818679
                        1020180 207 .27288008 .7271199
                        1020180 208 .23760293 .7623971
                        1020180 209 .25779882 .7422012
                        1020180 210 .25996602  .740034
                        1020180 211 .29639986 .7036001
                        1020180 212 .27385888 .7261411
                        1020180 213 .27272448 .7272756
                        1020180 214 .27136242 .7286376
                        1020180 215  .2669176 .7330824
                        1020180 216 .24107453 .7589254
                        1020180 217  .2415024 .7584976
                        1020180 218 .24998806 .7500119
                        1020180 219  .2593074 .7406926
                        1020180 220  .2630771 .7369229
                        1020180 221 .25820652 .7417935
                        1020180 222 .25639817 .7436018
                        1020180 223  .2879906 .7120094
                        1020201 164 .27875888 .7212411
                        1020201 165   .284028  .715972
                        1020201 166  .3155672 .6844327
                        1020201 167  .3022131 .6977869
                        1020201 168 .29126522 .7087348
                        1020201 169  .2587162 .7412838
                        1020201 170 .28428718 .7157128
                        1020201 171  .3014226 .6985774
                        1020201 172  .3077035 .6922966
                        1020201 173 .34434515 .6556548
                        1020201 174 .23772295 .7622771
                        1020201 175  .3115353 .6884647
                        1020201 176  .3053937 .6946063
                        1020201 177  .3009453 .6990547
                        1020201 178  .3315779  .668422
                        1020201 179  .2992953 .7007047
                        1020201 180  .3699142 .6300858
                        1020201 181  .3086658 .6913342
                        1020201 182  .3907802 .6092198
                        1020201 183  .4141061 .5858939
                        1020340 164  .3598601 .6401399
                        1020340 165  .3422862 .6577138
                        1020340 166  .3531061 .6468939
                        1020340 167  .3221951 .6778048
                        1020340 168  .3475783 .6524217
                        1020340 169  .3516529 .6483471
                        1020340 170  .3528572 .6471428
                        1020340 171 .35796764 .6420324
                        1020340 172  .3566739 .6433261
                        1020340 173  .3644458 .6355542
                        1020340 174 .39076215 .6092378
                        1020340 175   .350664  .649336
                        1020340 176 .37547415 .6245258
                        1020340 177  .3555829 .6444172
                        1020340 178  .3660735 .6339265
                        1020340 179  .3495784 .6504216
                        1020340 180  .3556277 .6443723
                        1020340 181  .3599536 .6400464
                        1020340 182 .34035525 .6596447
                        1020340 183  .3177884 .6822116
                        end
                        format %tq quarter
                        Anyone could help me?

                        Many thanks,

                        ​​​​​​​David

                        Comment


                        • #13
                          Have you tried this?

                          Code:
                          corr y1 y2
                          scatter y2 y1
                          It seems the correlation is indeed -1.

                          By the way, even if it's almost obvious, you need this before you can even call the -rolling- command:

                          Code:
                          tsset id quarter

                          Comment


                          • #14
                            Thanks Jean-Claude Arbaut. It is indeed -1.

                            Comment


                            • #15
                              Here's another way to do it, with rangestat (SSC). That saves results alongside the current dataset, which is often convenient.

                              For more information, see https://www.statalist.org/forums/for...s-within-range or search the forum for mentions of rangestat.

                              I've used the data example helpfully given in #12:

                              Code:
                              rangestat (corr) y1 y2, int(quarter 0 11) by(id) 
                              
                              list, sepby(id)
                              
                                   +-------------------------------------------------------------+
                                   |      id   quarter         y1         y2   corr_n~s   corr_x |
                                   |-------------------------------------------------------------|
                                1. | 1020180    2001q1    .262143    .737857         12       -1 |
                                2. | 1020180    2001q2   .2712289   .7287711         12       -1 |
                                3. | 1020180    2001q3   .2496878   .7503121         12       -1 |
                                4. | 1020180    2001q4   .2764108   .7235892         12       -1 |
                                5. | 1020180    2002q1   .2593395   .7406604         12       -1 |
                                6. | 1020180    2002q2    .238004    .761996         12       -1 |
                                7. | 1020180    2002q3   .2490193   .7509807         12       -1 |
                                8. | 1020180    2002q4   .2849082   .7150918         12       -1 |
                                9. | 1020180    2003q1   .2697124   .7302876         12       -1 |
                               10. | 1020180    2003q2   .2881684   .7118316         12       -1 |
                               11. | 1020180    2003q3   .2933836   .7066164         12       -1 |
                               12. | 1020180    2003q4   .2698418   .7301583         12       -1 |
                               13. | 1020180    2004q1   .2685654   .7314346         12       -1 |
                               14. | 1020180    2004q2   .2718884   .7281116         12       -1 |
                               15. | 1020180    2004q3   .2564908   .7435092         12       -1 |
                               16. | 1020180    2004q4   .2573944   .7426056         12       -1 |
                               17. | 1020180    2005q1   .2427514   .7572486         12       -1 |
                               18. | 1020180    2005q2   .2456854   .7543146         12       -1 |
                               19. | 1020180    2005q3   .2736506   .7263494         12       -1 |
                               20. | 1020180    2005q4   .2744958   .7255042         12       -1 |
                               21. | 1020180    2006q1   .2421694   .7578306         12       -1 |
                               22. | 1020180    2006q2   .2665809   .7334191         12       -1 |
                               23. | 1020180    2006q3   .2541827   .7458173         12       -1 |
                               24. | 1020180    2006q4   .2608258   .7391742         12       -1 |
                               25. | 1020180    2007q1   .2658395   .7341604         12       -1 |
                               26. | 1020180    2007q2   .2701865   .7298135         12       -1 |
                               27. | 1020180    2007q3   .1465043   .8534957         12       -1 |
                               28. | 1020180    2007q4   .2722273   .7277728         12       -1 |
                               29. | 1020180    2008q1   .2455146   .7544854         12       -1 |
                               30. | 1020180    2008q2   .2421071   .7578929         12       -1 |
                               31. | 1020180    2008q3   .2343281    .765672         12       -1 |
                               32. | 1020180    2008q4   .2548278   .7451723         12       -1 |
                               33. | 1020180    2009q1   .2559069   .7440931         12       -1 |
                               34. | 1020180    2009q2   .2690792   .7309208         12       -1 |
                               35. | 1020180    2009q3   .2262483   .7737517         12       -1 |
                               36. | 1020180    2009q4   .2649642   .7350358         12       -1 |
                               37. | 1020180    2010q1   .2212669   .7787331         12       -1 |
                               38. | 1020180    2010q2   .2173626   .7826374         12       -1 |
                               39. | 1020180    2010q3   .1452685   .8547316         12       -1 |
                               40. | 1020180    2010q4   .2922693   .7077307         12       -1 |
                               41. | 1020180    2011q1   .2383585   .7616415         12       -1 |
                               42. | 1020180    2011q2   .2209213   .7790787         12       -1 |
                               43. | 1020180    2011q3    .218132   .7818679         12       -1 |
                               44. | 1020180    2011q4   .2728801   .7271199         12       -1 |
                               45. | 1020180    2012q1   .2376029   .7623971         12       -1 |
                               46. | 1020180    2012q2   .2577988   .7422012         12       -1 |
                               47. | 1020180    2012q3    .259966    .740034         12       -1 |
                               48. | 1020180    2012q4   .2963999   .7036001         12       -1 |
                               49. | 1020180    2013q1   .2738589   .7261411         12       -1 |
                               50. | 1020180    2013q2   .2727245   .7272756         11       -1 |
                               51. | 1020180    2013q3   .2713624   .7286376         10       -1 |
                               52. | 1020180    2013q4   .2669176   .7330824          9       -1 |
                               53. | 1020180    2014q1   .2410745   .7589254          8       -1 |
                               54. | 1020180    2014q2   .2415024   .7584976          7       -1 |
                               55. | 1020180    2014q3   .2499881   .7500119          6       -1 |
                               56. | 1020180    2014q4   .2593074   .7406926          5       -1 |
                               57. | 1020180    2015q1   .2630771   .7369229          4       -1 |
                               58. | 1020180    2015q2   .2582065   .7417935          3       -1 |
                               59. | 1020180    2015q3   .2563982   .7436018          2       -1 |
                               60. | 1020180    2015q4   .2879906   .7120094          1        . |
                                   |-------------------------------------------------------------|
                               61. | 1020201    2001q1   .2787589   .7212411         12       -1 |
                               62. | 1020201    2001q2    .284028    .715972         12       -1 |
                               63. | 1020201    2001q3   .3155672   .6844327         12       -1 |
                               64. | 1020201    2001q4   .3022131   .6977869         12       -1 |
                               65. | 1020201    2002q1   .2912652   .7087348         12       -1 |
                               66. | 1020201    2002q2   .2587162   .7412838         12       -1 |
                               67. | 1020201    2002q3   .2842872   .7157128         12       -1 |
                               68. | 1020201    2002q4   .3014226   .6985774         12       -1 |
                               69. | 1020201    2003q1   .3077035   .6922966         12       -1 |
                               70. | 1020201    2003q2   .3443452   .6556548         11       -1 |
                               71. | 1020201    2003q3   .2377229   .7622771         10       -1 |
                               72. | 1020201    2003q4   .3115353   .6884647          9       -1 |
                               73. | 1020201    2004q1   .3053937   .6946063          8       -1 |
                               74. | 1020201    2004q2   .3009453   .6990547          7       -1 |
                               75. | 1020201    2004q3   .3315779    .668422          6       -1 |
                               76. | 1020201    2004q4   .2992953   .7007047          5       -1 |
                               77. | 1020201    2005q1   .3699142   .6300858          4       -1 |
                               78. | 1020201    2005q2   .3086658   .6913342          3       -1 |
                               79. | 1020201    2005q3   .3907802   .6092198          2       -1 |
                               80. | 1020201    2005q4   .4141061   .5858939          1        . |
                                   |-------------------------------------------------------------|
                               81. | 1020340    2001q1   .3598601   .6401399         12       -1 |
                               82. | 1020340    2001q2   .3422862   .6577138         12       -1 |
                               83. | 1020340    2001q3   .3531061   .6468939         12       -1 |
                               84. | 1020340    2001q4   .3221951   .6778048         12       -1 |
                               85. | 1020340    2002q1   .3475783   .6524217         12       -1 |
                               86. | 1020340    2002q2   .3516529   .6483471         12       -1 |
                               87. | 1020340    2002q3   .3528572   .6471428         12       -1 |
                               88. | 1020340    2002q4   .3579676   .6420324         12       -1 |
                               89. | 1020340    2003q1   .3566739   .6433261         12       -1 |
                               90. | 1020340    2003q2   .3644458   .6355542         11       -1 |
                               91. | 1020340    2003q3   .3907622   .6092378         10       -1 |
                               92. | 1020340    2003q4    .350664    .649336          9       -1 |
                               93. | 1020340    2004q1   .3754742   .6245258          8       -1 |
                               94. | 1020340    2004q2   .3555829   .6444172          7       -1 |
                               95. | 1020340    2004q3   .3660735   .6339265          6       -1 |
                               96. | 1020340    2004q4   .3495784   .6504216          5       -1 |
                               97. | 1020340    2005q1   .3556277   .6443723          4       -1 |
                               98. | 1020340    2005q2   .3599536   .6400464          3       -1 |
                               99. | 1020340    2005q3   .3403552   .6596447          2       -1 |
                              100. | 1020340    2005q4   .3177884   .6822116          1        . |
                                   +-------------------------------------------------------------+
                              rangestat is a great program (declaration of interest), but the substantive point is different.

                              As it's evident that y1 + y2 = 1 (with occasional very minor deviations easily understand as minute precision difficulties) it follows that y1 = 1 - y2 or y2 = 1 - y1 and there is a perfect inbuilt negative correlation.

                              Comment

                              Working...
                              X