  • Rolling problems: get decomposed R2, also called Shapley value

    Dear all

    I have time-series data. I want to run rolling regressions and obtain the decomposed R2 (also called Shapley values).

    I can easily obtain the slope coefficients as follows.

    Code:
    rolling _b, window(40) saving(betas, replace) keep(fqdate): regress rgdp x1 x2
    However, to get the decomposed R2, the commands that can be used would be rego, shapley, shapley2, among others.

    The problem is that I do not know how rego, for example, saves the decomposed R2. I did the following to investigate further.

    Code:
    . rego gdp x1 x2, noperc
     
    ------------------------------------------------------------------------------
    Gr        Regressor |       Coef.      Std.Err.   P>|t|  Std.Coef.  Shapley R2
    --------------------+---------------------------------------------------------
     1               x1 |   -.1608338 *    .0853014   0.061    -0.1375      0.0191
     2               x2 |     .056379 *    .0308584   0.069     0.1333      0.0179
     -        Intercept |    .0256372      .0061803   0.000     
    --------------------+---------------------------------------------------------
           Observations |         184
             Overall R2 |     0.03703
               Root MSE |    .0799205
          F-stat. Model |    3.480318 **              0.033
         Log Likelihood |    205.3446
    ------------------------------------------------------------------------------
    
    . ereturn list
    
    scalars:
                      e(N) =  184
                   e(df_m) =  2
                   e(df_r) =  181
                      e(F) =  3.480318446798436
                     e(r2) =  .0370324181096345
                   e(rmse) =  .0799205236317307
                    e(mss) =  .0444596071032519
                    e(rss) =  1.156099507660175
                   e(r2_a) =  .0263918923428902
                     e(ll) =  205.3446297734195
                   e(ll_0) =  201.8729608831572
                   e(rank) =  3
                 e(noperc) =  1
    
    macros:
                e(cmdline) : "rego gdp x1 x2, noperc"
             e(regressors) : " x1 x2"
                    e(cmd) : "rego"
                  e(title) : "Linear regression"
              e(marginsok) : "XB default"
                    e(vce) : "ols"
                 e(depvar) : "gdp"
             e(properties) : "b V"
                e(predict) : "regres_p"
                  e(model) : "ols"
              e(estat_cmd) : "regress_estat"
    
    matrices:
                      e(b) :  1 x 3
                      e(V) :  3 x 3
           e(shapley_perc) :  1 x 2
                e(shapley) :  1 x 2
          e(group_details) :  2 x 1
         e(vars_per_group) :  2 x 1
                   e(stdb) :  1 x 3
    
    functions:
                 e(sample)

    Then I thought to try something like:

    Code:
    . rolling _b decomposed=e(shapley_perc), window (40) saving (betas, replace) keep(fqdate):rego gdp x1 x2, noperc
    (running rego on estimation sample)
    type mismatch
    error in expression: e(shapley_perc)
    r(109);
    However, I get an error message. I hope someone can help me get two variables each representing the decomposed R2 for each regressor.


  • #2
    The immediate problem is apparent confusion between a name and the value(s) it contains, but I suspect the underlying problem is different.

    e(shapley_perc) is a (1 x 2) row vector and won't fit anywhere that expects a scalar. I guess you need to write your own program to call up rego and emit the two values in that row vector as two separate scalars.
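
    A minimal sketch of such a wrapper follows. The program name rego_shapley, the returned names r(sh1) and r(sh2), and the output file name shapley are made up for illustration. The sketch assumes that rego accepts an if qualifier and that e(shapley) holds the Shapley R2 values in regressor order, as the ereturn list in #1 suggests. Because each result is now a scalar, rolling can collect it.

    Code:
    capture program drop rego_shapley
    program define rego_shapley, rclass
        * hypothetical wrapper: dependent variable first, then exactly two regressors
        syntax varlist(min=3 max=3 numeric) [if] [in]
        marksample touse
        * assumption: rego honours the if qualifier, as regress does
        quietly rego `varlist' if `touse', noperc
        tempname S
        matrix `S' = e(shapley)
        * hand each element of the row vector back as its own scalar
        return scalar sh1 = `S'[1,1]
        return scalar sh2 = `S'[1,2]
    end

    rolling sh_x1=r(sh1) sh_x2=r(sh2), window(40) saving(shapley, replace) keep(fqdate): rego_shapley gdp x1 x2

    If percentage shares are wanted instead, e(shapley_perc) could presumably be read in the same way.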



    • #3
      Thank you. Writing my own program is challenging for me. Can I get help with that?

      Alternatively, is there an easier way to get the decomposed R2 without using rego?



      • #4
        Your second question seems to circle back to your first post, in which you mentioned several commands in this territory. It's likely that some people here are also familiar with them.



        • #5
          I did not mean to confuse the participants; I wanted to clarify that I am fine with any command that can do the required task.

          It would be very helpful to get some help, either with programming it as you suggested or with another command that gives the decomposed R2 as directly as I get the slope coefficients. Thanks.

          I look forward to getting some help.



          • #6
            Dear all
            I am bringing this up again as I am still unable to program it and the problem is still unresolved. I hope someone can provide a solution. Thanks.



            • #7
              Dear Mike, please check whether the following fits your needs. You need to "ssc install rangestat" and "ssc install rangerun". (Others who want to run this also need to install rego; see http://www.marco-sunder.de/stata/rego.html.)
              Code:
              webuse grunfeld, clear
              gen t = _n
              tsset t
              
              /*
              keep if inrange(t,2,11)
              rego invest mvalue kstock, noperc
              mat A = e(shapley_perc)
              mat list A
              *mat B=e(shapley)
              *mat list B
              */
              
              cap program drop myr2
              program define myr2
                 rego invest mvalue kstock, noperc
                 mat A = e(shapley_perc)
                 gen mvalue_r2 = A[1,1]
                 gen kstock_r2 = A[1,2]
              end
              
              rangerun myr2, interval(t -10 0) use(invest mvalue kstock)
              Ho-Chuan (River) Huang
              Stata 19.0, MP(4)



              • #8
                Dear River

                Thank you. I tried to apply the code you suggested but I encountered some issues.

                First, let me show you how I adapted it to my dataset.

                Code:
                gen t = _n
                tsset t
                
                cap program drop myr2
                program define myr2
                   rego growth fin liq, noperc
                   mat A = e(shapley)
                   gen fin_r2 = A[1,1]
                   gen liq_r2 = A[1,2]
                end
                
                rangerun myr2, interval(t -39 0) use(growth fin liq)
                
                tsset fqdate
                As you can see, I removed the middle part of your code because it was commented out (enclosed in /* ... */), so I assumed it was not relevant. Am I right?
                I also used e(shapley) rather than e(shapley_perc).

                I also want to do the rolling estimation with a fixed window of 40 quarters, so I changed the interval to -interval(t -39 0)-.

                Unfortunately, the code seems to produce decomposed R2 values starting from t = 13 rather than t = 40, and I do not know why. This makes me less confident in the way I used it. Moreover, the decomposed R2 values for the first part of the sample look less reasonable.

                I appreciate your assistance with that.

                Here is what the original data and the program's output look like:

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input float(fqdate t growth fin liq fin_r2 liq_r2)
                 40   1   -.15912275           .           .          .          .
                 41   2    .09266312           .           .          .          .
                 42   3    .06238146           .           .          .          .
                 43   4     .1003086           .           .          .          .
                 44   5    .04127723           .           .          .          .
                 45   6   .007405024           .           .          .          .
                 46   7   -.05980303           .           .          .          .
                 47   8    .15197453           .           .          .          .
                 48   9   .033147737           .           .          .          .
                 49  10    .02404358  .003004028   .13902617          .          .
                 50  11    .04470645  .017605728   .13934128          .          .
                 51  12   -.04389818  .037124224    .1701323          .          .
                 52  13   -.06813001   .06897428    .1787488   .3995927   .5796248
                 53  14   .017472947   .06142432    .2004503  .33883545  .14147584
                 54  15   -.06573346  .022267703    .2576679  .11761589   .3432416
                 55  16   .013654263 -.003605632   .22521473  .21383546   .2252033
                 56  17   -.10910688  -.06608154   .17140234  .08750177  .08972315
                 57  18   -.17863147  -.08187396   .26223686  .29010087   .2376845
                 58  19    .02453373  -.09342206    .3075926  .08464367  .02872864
                 59  20     .1825736   -.1601285   .12021548   .0828617  .21213877
                 60  21    .13831255  -.26139924   .14585444  .20547506  .22455414
                 61  22   -.03320964  -.23504017   .10323387  .10111359  .13996103
                 62  23    .04898705 -.002510715  -.04422319  .10402494  .14277646
                 63  24    .11328086    .1486354   -.0732529  .04327061  .25656182
                 64  25    .01889895   .23440973   .08816886 .035389356   .2577427
                 65  26    .03804962   .24020024   .04046756  .03052545   .2645839
                 66  27  .0012684565   .13558552  .029505186 .035807848  .24719878
                 67  28    .00506812   .05017933   .14525767 .035623368   .2470179
                 68  29  -.016614793  .019529186   .09068041 .034996923   .2358136
                 69  30   .021917466   .05613389    .0795579 .034848787   .2369592
                 70  31   .005860911    .0447095    .0826571  .03513549  .23384534
                 71  32   -.06374294  .029931117   .05924428  .03226453  .19186123
                 72  33    .14155625   .03515898   .03861482 .029876167  .22135384
                 73  34     .0861522   .01499957   .13049664  .02829714  .21057357
                 74  35    -.0784217  -.02943926   .19715326  .02585542   .2268457
                 75  36    .03825453 -.068316035    .3172009 .024697393   .1735157
                 76  37    .05727483  -.07546862    .4038011  .02062175   .0990404
                 77  38    .12533192  -.07410551    .2871936  .02327099  .05651599
                 78  39  -.005435857  -.07574596   .31919655  .02258508  .05960881
                 79  40     .0893553  -.08665448    .2480211  .02701046  .04802503
                 80  41   .003562767   -.0861227    .3039681  .02646089  .04964466
                 81  42    .12037104   -.1579977    .1694907   .0433902  .05195192
                 82  43     .1610871   -.1500074     .160122  .06737334  .05503699
                 83  44   -.05054996 -.003030176   .05643868  .06251464   .0408837
                 84  45    .02007691    .0889483  -.08150642  .06202963   .0381264
                 85  46   -.05139175    .1143476    .0291774  .07119061  .03095441
                 86  47    .05405565   .11036552 -.023045223  .06667814 .035820365
                 87  48   -.09967183  .030198524 -.028201224  .05901799 .016086401
                 88  49   .006232464  -.07330564  -.06685927  .05332542  .01126191
                 89  50    .07415348  -.09423943  -.06366187  .05985475 .015546743
                 90  51    .17100805  -.06105917  -.04679153  .07288318  .03329677
                 91  52    .08661003  -.06661268  -.09994292   .0731011  .03866554
                 92  53     .1205432  .001700371  -.01963637   .0624298  .04290305
                 93  54   .013095206    .1345399  -.03431971    .064961   .0419851
                 94  55   .022359865    .1832669  .013784097  .05903648  .03051125
                 95  56   -.05547845    .2253288 -.008925495   .0783449 .026540995
                 96  57  -.033144854   .20270093   .14826977   .1102395 .024487656
                 97  58    .11636598   .13237359    .0911391  .13162231 .008169165
                 98  59  -.005237876  -.00538572   .03201442  .13096519 .005736715
                 99  60    .13082045  -.06784122   .05515191  .10979066 .005877719
                100  61     .0543263  -.07474247   .01333667  .07678565 .005320496
                101  62   .014636938  -.10852606   .02162166  .11927792 .006586331
                102  63      .077632  -.08288988  .012553687   .1252074 .006988728
                103  64    .12550417  -.04238361   -.1024717   .1662697 .007921838
                104  65    .09166063 -.025527844  .007393745   .1771347 .010756136
                105  66    .03389942  -.04749262   .06305111  .19950123 .011976195
                106  67  -.016710065  -.03478983   .08085943  .18255995 .011771573
                107  68    .14817744  .005262164 -.010440764   .1685938 .015260354
                108  69   .006486706   .02223057   .09253103  .16887137 .015106636
                109  70    .12696432   .05426863  -.03878823  .14912684  .02024419
                110  71      -.30903   .06436848   .13415723  .14335129  .02857533
                111  72     .1709989   .04879718   .05376903    .119071  .02866489
                112  73  -.006295912  .015830511    .1819412  .13198133  .03056174
                113  74   .019477904  .006201566   .29749134  .13525817 .033688605
                114  75   .035165526 -.020003757   .11805166  .14511782  .02559841
                115  76   .068203904  -.03939807   .26523116  .14654432 .021369884
                116  77    .11065657  -.02307056   .10391206  .14664653  .02307443
                117  78    .09418894 -.029154416   .06447671  .14173546  .03800871
                118  79  -.008102294  -.02848965  -.18072435  .14000969 .014156066
                119  80  -.034057897 -.017708568  -.12534903  .12571329 .009748232
                120  81    .08513564  .012083122  -.14466582  .13349782 .007296402
                121  82    -.1077248   .02540541   -.0983429    .113062  .00307663
                122  83   .006028487 -.008655875   .03779088  .08034382 .010065364
                123  84    .15087596  -.04966408  -.08130171  .09376193 .013722532
                124  85    .06198982  -.08155394  -.04261577  .09289166 .014840459
                125  86    .02405199  -.06537376  -.11729734  .07500093 .012380051
                126  87  -.027157733 -.010727644  -.17133546  .08085533  .00515516
                127  88    .11741457  .014604875  -.18941317 .071499325 .015005595
                128  89   .007060129  .015891084  .030051116  .07991534 .017629517
                129  90  -.002882732  .003607964   .08137294   .0769908    .018887
                130  91    .05845396  .002353109   .13270248  .06618204 .013621195
                131  92    .03787044   .03889596   .08532304  .06182585 .011229084
                132  93   .030932084   .06300384   .13350117  .06191384 .009404538
                133  94    .04677445   .04686376 -.031792864  .05886136 .010121098
                134  95 -.0026416525   .03760597   .11681883 .064874835 .011130707
                135  96     .0273554   .03902677   .11752383 .036685906 .013711141
                136  97  -.016175805  .010472127   .10083447 .022227006 .013067997
                137  98    .04624455  .012761226   .13353267  .06212895 .013862181
                138  99   -.04106052  .025459345   .22085525  .06683928  .02145803
                139 100    .08386634   .02296789    .2513017  .04615335 .014755222
                end
                format %tq fqdate



                • #9
                  Caveat: I have no familiarity with -rego-, and had never heard of decomposed R2 or Shapley statistics before. I'm commenting here only with regard to understanding the output produced by the code in #8.

                  When -rangerun- is specified with -interval(t -39 0)-, it goes through the data set observation by observation. For each observation, it assembles a data subset consisting of all observations whose value of t lies between the current value of t minus 39 and the current value itself, and then runs program myr2 on that subset. It does that even if there is only one such observation, or even none. In particular, there is no reason to expect the output to start at t = 40. In fact, the mystery to me is why it does not start showing output in the very first observation: my best guess is that -rego- itself requires some minimum number of observations before it will produce results, and that doesn't happen until t = 13.
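
                  To see how many usable observations each window holds, one could count them directly. The sketch below assumes the variables from #8 and uses rangestat, which #7 already asks you to install; n_win is a made-up variable name. It counts nonmissing values of fin only, which is enough here because in the data posted in #8 fin and liq are missing for the same observations, t = 1 to 9. The window ending at t = 13 is therefore the first with four complete observations, one more than the three parameters being estimated, which may be why results start there.

                  Code:
                  * count nonmissing values of fin in each window [t-39, t]
                  rangestat (count) n_win=fin, interval(t -39 0)
                  * n_win should grow by one per observation until it reaches 40
                  list t fin liq n_win in 1/50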

                  Anyway, there is no reason to question whether -rangerun- is producing the desired rolling window. It is. If you only want to create results when that window contains a full 40 observations, then you can modify the code as follows:

                  Code:
                  gen t = _n
                  tsset t
                  
                  cap program drop myr2
                  program define myr2
                      if _N == 40 {
                         rego growth fin liq, noperc
                         mat A = e(shapley)
                         gen fin_r2 = A[1,1]
                         gen liq_r2 = A[1,2]
                      }
                  end
                  
                  rangerun myr2, interval(t -39 0) use(growth fin liq)
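
                  One caveat: _N counts every observation in the window, including those where fin or liq is missing. A variant that counts complete cases instead might look like the sketch below. It relies on the same behaviour as the code above, namely that rangerun leaves the results missing whenever myr2 creates no variables.

                  Code:
                  cap program drop myr2
                  program define myr2
                      * count observations with no missing values in any model variable
                      quietly count if !missing(growth, fin, liq)
                      if r(N) == 40 {
                         rego growth fin liq, noperc
                         mat A = e(shapley)
                         gen fin_r2 = A[1,1]
                         gen liq_r2 = A[1,2]
                      }
                  end

                  rangerun myr2, interval(t -39 0) use(growth fin liq)

                  With the data posted in #8, where fin and liq are first observed at t = 10, the first window with 40 complete observations would end at t = 49.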

