Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • First venture into panel data sets

    I have the following extract from my data set. Previously I have always been regressing using standard regression. The below is a panel data set I am trying to predict returns. My previous regression technique would include.
    1. Add the first variable looking for the highest R2
    2. Search for the variable that has a Vif < 1.5
    3. Add the next variable which has the highest R2 and so on.
    How do I repeat this process for a panel data set?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float date long n_ticker double close float return double(CurrentRatio EBITDAMargin OperatingMargin ReturnonEquity ReturnonInvestment TotalDebttoEquity)
    566  3  263  -34.22053 1.153532  2.111555  1.776075  9.870937  5.660377 1.636456
    578  3  173   8.092485 1.169349  1.552444  1.980144  8.282616  6.466227 1.362266
    590  3  187   6.417112 1.163803  1.705138  2.020277   9.05641  6.837078 1.613801
    602  3  199  -11.55779  1.16347  1.489251  1.495425  6.103352  4.629433 1.678207
    614  3  176       12.5 1.118649   .398203   .975844   .330202  2.996226 2.375864
    626  3  198   6.565657 1.103763   .876928   .899469  2.485019  2.677665 2.804256
    638  3  211  24.170616 1.265924  1.236759  1.305281  7.168884  3.707988  2.29815
    650  3  262   8.396947 1.536229  1.916625   1.44031 15.564529  4.736834 2.160753
    662  3  284  -9.154929 1.503659  2.544997   1.12709 11.489965  3.720564  2.05678
    674  3  258  1037.9845 1.724648  1.290673  1.074457  7.955425  3.434221 2.214412
    686  3 2936          . 1.600863  1.755573  1.574647 10.186315  5.011671 2.046044
    662  4 1694   23.90791 1.102925  3.251739  1.005691  8.697695  2.328596 3.257762
    674  4 2099   60.55264 1.055705   3.36343  1.918263  4.706788  4.556272 3.302475
    686  4 3370          . 1.233213  4.717306  3.012499 16.621113  7.115761 2.704151
    568  5 1544   3.108808 4.520915   2.84545   .584946   .221084   .302538  .029713
    580  5 1592 -17.273869 3.980719  2.620722  2.785319    .34145  1.466508  .034199
    592  5 1317 -11.693242 4.261288 -1.439753  2.614927 -2.185296  1.459064  .045428
    604  5 1163  -2.493551 5.392828  4.299411  3.600757  1.376361  2.213193  .035874
    616  5 1134  -6.613757 5.803672  4.861656  6.854659   1.82866  4.343124  .026494
    628  5 1059  24.551464 5.296215  5.558866  5.452456  2.667078  3.501435  .025559
    640  5 1319  3.5633056 6.623578  9.020567  4.413812  4.158892  2.901829  .027469
    652  5 1366    60.2489 5.534186  7.837246  6.837654  3.375596  4.481231  .044003
    664  5 2189    20.3746 6.112259 13.121837  8.427531   4.53286  5.433738  .043286
    676  5 2635  34.155598 5.912853 16.003947 12.451296  5.925328  7.918457  .056702
    688  5 3535          . 4.951255 16.645107  12.45715  6.691592   8.02914  .044628
    566  6 2090 -15.502393  .787745 16.114812 14.404148 10.685256 14.071102   .26994
    578  6 1766  -3.227633  .653008 15.043641 16.689926 10.095966 15.560358  .260281
    590  6 1709   12.17086  .516029 17.801174 19.038618 12.575282 18.486229   .20144
    602  6 1917 -2.2430882  .631695 19.297597  19.26679 12.482496  19.17699   .16233
    614  6 1874  -6.990395  .789411 13.699246 19.268045  7.500029  19.71175  .127317
    626  6 1743   .4016064  .984651 15.954327 15.377595  9.665289 14.982499  .188781
    638  6 1750  22.685715  .788124  7.322011  4.873766  3.434883  4.182513  .260177
    650  6 2147  4.1918955  .637539  9.436212   7.46164  5.967268  7.213087  .327634
    662  6 2237    -.49173   .61631 10.016571  6.500517  7.703049  6.138543  .361273
    674  6 2226  -9.838275  .586048  7.480283   5.62087  5.201201  4.951217  .411017
    686  6 2007          .  .620024  7.113547   5.43418  5.567587  4.511496  .579442
    566 44  476 -14.705882  1.22925   3.58802  2.641761  3.636332  3.807879  .502146
    578 44  406  -.9852217  1.03647  3.555774  2.674708  3.969598  4.595811  .540068
    590 44  402  15.920398 1.172269  3.632626  3.859146  5.395953  7.665205  .575196
    602 44  466 -18.240343 1.444558  5.209759  4.797275  8.586708  9.178143  .408585
    614 44  381  -.2624672 1.261443  4.320633  4.190942  6.199939  7.593718  .454564
    626 44  380   13.94737 1.340303   3.85776  3.580162  6.233597  6.630123  .450992
    638 44  433   30.48499 1.161782  4.496641  3.583221   6.55931  6.431513  .379137
    650 44  565   5.132743 1.403239  4.727125  3.764424  6.798162  6.821531  .379116
    662 44  594   55.72391  1.46035  6.054383  2.816647  5.562771  4.912945  .349241
    674 44  925   77.72973 1.410045  6.285058  3.560656  6.243522  6.237457  .352507
    686 44 1644          . 1.443737  6.553181  3.699845  6.520192  6.449924   .27137
    566 45 1199 -11.259383 2.654359  5.666802  4.587389  4.270904  6.455836  .032553
    578 45 1064 -1.0338346 2.841104   5.27465  4.443822  4.205233    7.0551  .019531
    590 45 1053  14.624882 3.105869  5.431878  4.661742  5.366735  8.294602  .012757
    602 45 1207  -20.54681 3.164073  6.679317  5.989255  6.356786  9.912534  .011494
    614 45  959   4.379562 2.622808  6.297683  5.973038  5.144374   9.08553    .0108
    626 45 1001   27.77223 2.312688  5.710885  5.229623  4.674248  7.954516  .027214
    638 45 1279 -11.336982 2.196232  5.308342  4.772086  4.559316  7.048852  .027397
    650 45 1134   24.69136 2.393031  5.530216   4.49136  4.764084  6.829719  .030711
    662 45 1414   26.52051 2.047434  7.457844   3.89209  4.635653  5.665133  .057314
    674 45 1789  -7.154835 2.185334  7.932625  4.269797  4.741212  6.083679  .052487
    686 45 1661          . 2.658912  8.788437  4.795128   5.07379  6.365007   .03736
    566 46  341  -36.07038 2.421563  5.187278  5.221476  2.406475  3.665609  .197911
    578 46  218  1.3761468 2.841258    4.8809  4.712845  2.547605  3.794369  .152743
    590 46  221   3.167421 2.950077  4.577789  4.367336  2.702975  3.990655  .155009
    602 46  228 -18.421053  2.55526   5.47549  5.456662  3.410523  5.083435   .18604
    614 46  186 -1.0752689 2.912299  4.121053  4.018581  2.372148  3.713319  .157099
    626 46  184 -3.2608695 2.944085  3.646823  3.346989  2.186235  3.060444  .151949
    638 46  178   12.35955 3.089353  2.586399  2.383612  1.520383   2.18044  .148052
    650 46  200         -1  3.02243  3.459841  3.087964   1.91917   2.73101  .141249
    662 46  198 -3.5353534 2.789942  7.973364  3.624257  2.270784  2.960442  .155713
    674 46  191   1139.267 2.789589  8.110959  3.479584  2.069594  2.650402  .151372
    686 46 2367          . 2.809437  7.895687  3.750022  2.235074  2.804057  .145678
    566 47  438  -22.37443 1.252314  4.883672  4.811622  7.032087  6.666667  .255339
    578 47  340  -21.47059 1.252719  7.897116  7.883109 10.128531  11.93837  .130436
    590 47  267  16.853933 1.441546  7.940974   8.31572  8.656605 12.543275  .109833
    602 47  312  11.858974 1.557982  7.749717   7.78665  8.788355 11.909075  .042238
    614 47  349 -20.916906 1.722274  7.482656  7.803518  8.057606 12.029142  .049464
    626 47  276   5.797101 1.792227  4.106235  5.626554  4.918238  8.092859  .097717
    638 47  292  34.931507 1.719116  6.818205  4.267615  8.189889   6.38073  .264241
    650 47  394   7.106599 1.800124  8.771713  5.387497  8.939554  7.139047  .221983
    662 47  422  18.957346 2.116359 13.435088  4.018145  8.983652  4.790546   .30034
    674 47  502  439.64145 2.229569 16.359404  5.418282  11.09623  6.219253  .285759
    686 47 2709          . 2.838318 16.742103  5.058298 10.132857   5.61376  .234371
    566 52 1398 -11.659513  1.19543  3.015465  2.855703  3.228282  5.616266  .058429
    578 52 1235  -20.40486 1.142126  1.720216  1.531114  1.139721  3.251952  .069079
    590 52  983  10.986775 1.010925  1.634863  2.214764  -.974621  5.138229  .222226
    602 52 1091 -11.457378 1.243876   4.75019   4.14886  6.875073  9.546491   .19704
    614 52  966   2.795031 1.389219  2.939644  3.519476  3.564131  8.002209  .155744
    626 52  993 -1.9133937 1.362268  2.611215  1.633906   .229034  3.755668   .23263
    638 52  974   40.34908 1.327303  2.702712  1.549477    2.9969  3.367028  .226368
    650 52 1367  255.52304 1.545995  6.476875  3.692149  8.628429  7.636942   .16513
    662 52 4860   18.72428 1.680728 13.267041  4.462214 13.559366  8.209824  .076569
    674 52 5770  -6.412478 1.710202  9.434548  5.056776  8.108999  9.452977  .035831
    686 52 5400          . 2.256503 10.679837  6.868299   9.91531 11.990254  .168367
    575 53 1094   26.05119  .793872  2.110586  2.668882  3.090757  6.402753  .545063
    587 53 1379 -20.159536  .883601  2.102772  3.180952  2.835378  7.673539  .641773
    599 53 1101 -11.080835  .807093  2.094655  2.567284  4.025907   6.41941   .65895
    611 53  979  3.2686415  .931001  2.558923  2.907755  5.719384  7.358998  .638801
    623 53 1011  -4.846686  .928089  2.277566  3.074312  3.491358  7.846694  .542874
    635 53  962  12.162162 1.024344  2.610714  2.601571  4.555991  6.742506  .557213
    647 53 1079  37.905468 1.005138  1.831082  1.712005  4.579382  4.403466  .486218
    659 53 1488   83.80376 1.080233  3.131021  2.098067  4.827745  5.542222  .533207
    671 53 2735 -17.440584 1.073598  6.330419  2.628702   4.28304  7.091549  .391409
    end
    format %tmNN/CCYY date
    label values n_ticker n_ticker
    label def n_ticker 3 "1301", modify
    label def n_ticker 4 "1333", modify
    label def n_ticker 5 "1377", modify
    label def n_ticker 6 "1379", modify
    label def n_ticker 44 "2001", modify
    label def n_ticker 45 "2002", modify
    label def n_ticker 46 "2108", modify
    label def n_ticker 47 "2109", modify
    label def n_ticker 52 "2206", modify
    label def n_ticker 53 "2212", modify
    
    preserve
    local dv return
        drop `dv' date n_ticker close
        local iv
        foreach var of varlist *{
            local iv `iv' `var'
        }
        restore
    
    
        local idf
        local r2=0
    
        foreach var of local iv {
            xtreg `dv' `var'
            if( e(r2) > `r2'){
                local r2=e(r2)
                local idfn `var'
                local idf "xtreg `dv' `var'"
            }
        }
    
        preserve
        drop `dv' compnumber reportid currency countrycode region country fye sector sectnum web date ticker shortname n_ticker exchange close return `idfn'
        local iv
        foreach var of varlist *{
            local iv `iv' `var'
        }
        restore
        local iidf
        local r2=0
        foreach var of local iv {
            `idf' `var'
            if( e(r2) > `r2'){
                local r2=e(r2)
                local iidf "`idf' `var'"
            }
        }
    
    //as so on
    Last edited by Giles German; 04 Feb 2018, 09:24.

  • #2
    Giles:
    I would take a look at -xtreg-.
    That said, please note that -estat vif- is not included in -xtreg postestimation- suite.
    However, often on this list as well as in the literature (https://www.amazon.com/Course-Econom.../dp/0674175441) quasi-extreme multicollinearity is not seen as deserving an ad hoc test (taking a look at the width of confidence interval suffices for most precision purposes).
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      Thanks Carlo

      so what should my currency be here high R2, low vif, high t stat or low p values?

      Keen to develop a philosophy then programme that in.

      Giles

      Comment


      • #4
        Thanks Carlo

        so what should my currency be here high R2, low vif, high t stat or low p values?

        Keen to develop a philosophy then programme that in.

        Giles

        Comment


        • #5
          Giles:
          under -xtreg- there's no an unique R-sq, but three:
          e(r2 w) R-squared for within model

          e(r2 o) R-squared for overall model

          e(r2 b) R-squared for between model

          If you have good reason to prefer -re- vs -fe- specification, you should consider -e(r2b).

          For the list of the stored resulst, see again -xtreg-.
          Kind regards,
          Carlo
          (Stata 19.0)

          Comment


          • #6
            Okay great thanks I will look to programme for e(r2b) I guess I cycle through the variables until I achieve an r2 as high as possible. I’m looking at 54 variables here. You say not to worry for near multicollinnearity is there anything else I should
            be watching out for. How many degrees of freedom
            should I be aiming for?

            Comment


            • #7
              Giles:
              there's no a hard and fast rule that I'm aware of for deciding how many degrees of freedom are ok for your purposes.
              That said, if you want to consider robust or clustered standard errors in your model, you may find -help j_robustsingular- interesting.
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment


              • #8
                Okay Carlo. Also for normal regression I was using a hettest to manage for heteroskedasticity. How do I make sure this doesn’t happen for panel regression.

                Giles

                Comment


                • #9
                  Giles:
                  in panel data, heteroskedaticity of the distribution of _e residual can benefit from visual inspection. I'm not aware of any analytic test.
                  However, if you suspect heteroskedastiicty and/or autocorrelation, you can invoke robust (or cluster) options for your standard errors (under -xtreg- they do the same job).
                  Kind regards,
                  Carlo
                  (Stata 19.0)

                  Comment


                  • #10
                    I have written the following codewich is looking for the highest r2_b as discussed.... I am getting an error invalid syntax r198.


                    Code:
                      
                     * Example generated by -dataex-. To install: ssc install dataex clear input float date long n_ticker double close float return double(CurrentRatio EBITDAMargin OperatingMargin ReturnonEquity ReturnonInvestment TotalDebttoEquity) 566  3  263  -34.22053 1.153532  2.111555  1.776075  9.870937  5.660377 1.636456 578  3  173   8.092485 1.169349  1.552444  1.980144  8.282616  6.466227 1.362266 590  3  187   6.417112 1.163803  1.705138  2.020277   9.05641  6.837078 1.613801 602  3  199  -11.55779  1.16347  1.489251  1.495425  6.103352  4.629433 1.678207 614  3  176       12.5 1.118649   .398203   .975844   .330202  2.996226 2.375864 626  3  198   6.565657 1.103763   .876928   .899469  2.485019  2.677665 2.804256 638  3  211  24.170616 1.265924  1.236759  1.305281  7.168884  3.707988  2.29815 650  3  262   8.396947 1.536229  1.916625   1.44031 15.564529  4.736834 2.160753 662  3  284  -9.154929 1.503659  2.544997   1.12709 11.489965  3.720564  2.05678 674  3  258  1037.9845 1.724648  1.290673  1.074457  7.955425  3.434221 2.214412 686  3 2936          . 1.600863  1.755573  1.574647 10.186315  5.011671 2.046044 662  4 1694   23.90791 1.102925  3.251739  1.005691  8.697695  2.328596 3.257762 674  4 2099   60.55264 1.055705   3.36343  1.918263  4.706788  4.556272 3.302475 686  4 3370          . 1.233213  4.717306  3.012499 16.621113  7.115761 2.704151 568  5 1544   3.108808 4.520915   2.84545   .584946   .221084   .302538  .029713 580  5 1592 -17.273869 3.980719  2.620722  2.785319    .34145  1.466508  .034199 592  5 1317 -11.693242 4.261288 -1.439753  2.614927 -2.185296  1.459064  .045428 604  5 1163  -2.493551 5.392828  4.299411  3.600757  1.376361  2.213193  .035874 616  5 1134  -6.613757 5.803672  4.861656  6.854659   1.82866  4.343124  .026494 628  5 1059  24.551464 5.296215  5.558866  5.452456  2.667078  3.501435  .025559 640  5 1319  3.5633056 6.623578  9.020567  4.413812  4.158892  2.901829  .027469 652  5 1366    60.2489 5.534186  7.837246  6.837654  3.375596  4.481231  .044003 664  5 2189    20.3746 6.112259 13.121837  8.427531   4.53286  5.433738  .043286 676  5 2635  34.155598 5.912853 16.003947 12.451296  5.925328  7.918457  .056702 688  5 3535          . 4.951255 16.645107  12.45715  6.691592   8.02914  .044628 566  6 2090 -15.502393  .787745 16.114812 14.404148 10.685256 14.071102   .26994 578  6 1766  -3.227633  .653008 15.043641 16.689926 10.095966 15.560358  .260281 590  6 1709   12.17086  .516029 17.801174 19.038618 12.575282 18.486229   .20144 602  6 1917 -2.2430882  .631695 19.297597  19.26679 12.482496  19.17699   .16233 614  6 1874  -6.990395  .789411 13.699246 19.268045  7.500029  19.71175  .127317 626  6 1743   .4016064  .984651 15.954327 15.377595  9.665289 14.982499  .188781 638  6 1750  22.685715  .788124  7.322011  4.873766  3.434883  4.182513  .260177 650  6 2147  4.1918955  .637539  9.436212   7.46164  5.967268  7.213087  .327634 662  6 2237    -.49173   .61631 10.016571  6.500517  7.703049  6.138543  .361273 674  6 2226  -9.838275  .586048  7.480283   5.62087  5.201201  4.951217  .411017 686  6 2007          .  .620024  7.113547   5.43418  5.567587  4.511496  .579442 566 44  476 -14.705882  1.22925   3.58802  2.641761  3.636332  3.807879  .502146 578 44  406  -.9852217  1.03647  3.555774  2.674708  3.969598  4.595811  .540068 590 44  402  15.920398 1.172269  3.632626  3.859146  5.395953  7.665205  .575196 602 44  466 -18.240343 1.444558  5.209759  4.797275  8.586708  9.178143  .408585 614 44  381  -.2624672 1.261443  4.320633  4.190942  6.199939  7.593718  .454564 626 44  380   13.94737 1.340303   3.85776  3.580162  6.233597  6.630123  .450992 638 44  433   30.48499 1.161782  4.496641  3.583221   6.55931  6.431513  .379137 650 44  565   5.132743 1.403239  4.727125  3.764424  6.798162  6.821531  .379116 662 44  594   55.72391  1.46035  6.054383  2.816647  5.562771  4.912945  .349241 674 44  925   77.72973 1.410045  6.285058  3.560656  6.243522  6.237457  .352507 686 44 1644          . 1.443737  6.553181  3.699845  6.520192  6.449924   .27137 566 45 1199 -11.259383 2.654359  5.666802  4.587389  4.270904  6.455836  .032553 578 45 1064 -1.0338346 2.841104   5.27465  4.443822  4.205233    7.0551  .019531 590 45 1053  14.624882 3.105869  5.431878  4.661742  5.366735  8.294602  .012757 602 45 1207  -20.54681 3.164073  6.679317  5.989255  6.356786  9.912534  .011494 614 45  959   4.379562 2.622808  6.297683  5.973038  5.144374   9.08553    .0108 626 45 1001   27.77223 2.312688  5.710885  5.229623  4.674248  7.954516  .027214 638 45 1279 -11.336982 2.196232  5.308342  4.772086  4.559316  7.048852  .027397 650 45 1134   24.69136 2.393031  5.530216   4.49136  4.764084  6.829719  .030711 662 45 1414   26.52051 2.047434  7.457844   3.89209  4.635653  5.665133  .057314 674 45 1789  -7.154835 2.185334  7.932625  4.269797  4.741212  6.083679  .052487 686 45 1661          . 2.658912  8.788437  4.795128   5.07379  6.365007   .03736 566 46  341  -36.07038 2.421563  5.187278  5.221476  2.406475  3.665609  .197911 578 46  218  1.3761468 2.841258    4.8809  4.712845  2.547605  3.794369  .152743 590 46  221   3.167421 2.950077  4.577789  4.367336  2.702975  3.990655  .155009 602 46  228 -18.421053  2.55526   5.47549  5.456662  3.410523  5.083435   .18604 614 46  186 -1.0752689 2.912299  4.121053  4.018581  2.372148  3.713319  .157099 626 46  184 -3.2608695 2.944085  3.646823  3.346989  2.186235  3.060444  .151949 638 46  178   12.35955 3.089353  2.586399  2.383612  1.520383   2.18044  .148052 650 46  200         -1  3.02243  3.459841  3.087964   1.91917   2.73101  .141249 662 46  198 -3.5353534 2.789942  7.973364  3.624257  2.270784  2.960442  .155713 674 46  191   1139.267 2.789589  8.110959  3.479584  2.069594  2.650402  .151372 686 46 2367          . 2.809437  7.895687  3.750022  2.235074  2.804057  .145678 566 47  438  -22.37443 1.252314  4.883672  4.811622  7.032087  6.666667  .255339 578 47  340  -21.47059 1.252719  7.897116  7.883109 10.128531  11.93837  .130436 590 47  267  16.853933 1.441546  7.940974   8.31572  8.656605 12.543275  .109833 602 47  312  11.858974 1.557982  7.749717   7.78665  8.788355 11.909075  .042238 614 47  349 -20.916906 1.722274  7.482656  7.803518  8.057606 12.029142  .049464 626 47  276   5.797101 1.792227  4.106235  5.626554  4.918238  8.092859  .097717 638 47  292  34.931507 1.719116  6.818205  4.267615  8.189889   6.38073  .264241 650 47  394   7.106599 1.800124  8.771713  5.387497  8.939554  7.139047  .221983 662 47  422  18.957346 2.116359 13.435088  4.018145  8.983652  4.790546   .30034 674 47  502  439.64145 2.229569 16.359404  5.418282  11.09623  6.219253  .285759 686 47 2709          . 2.838318 16.742103  5.058298 10.132857   5.61376  .234371 566 52 1398 -11.659513  1.19543  3.015465  2.855703  3.228282  5.616266  .058429 578 52 1235  -20.40486 1.142126  1.720216  1.531114  1.139721  3.251952  .069079 590 52  983  10.986775 1.010925  1.634863  2.214764  -.974621  5.138229  .222226 602 52 1091 -11.457378 1.243876   4.75019   4.14886  6.875073  9.546491   .19704 614 52  966   2.795031 1.389219  2.939644  3.519476  3.564131  8.002209  .155744 626 52  993 -1.9133937 1.362268  2.611215  1.633906   .229034  3.755668   .23263 638 52  974   40.34908 1.327303  2.702712  1.549477    2.9969  3.367028  .226368 650 52 1367  255.52304 1.545995  6.476875  3.692149  8.628429  7.636942   .16513 662 52 4860   18.72428 1.680728 13.267041  4.462214 13.559366  8.209824  .076569 674 52 5770  -6.412478 1.710202  9.434548  5.056776  8.108999  9.452977  .035831 686 52 5400          . 2.256503 10.679837  6.868299   9.91531 11.990254  .168367 575 53 1094   26.05119  .793872  2.110586  2.668882  3.090757  6.402753  .545063 587 53 1379 -20.159536  .883601  2.102772  3.180952  2.835378  7.673539  .641773 599 53 1101 -11.080835  .807093  2.094655  2.567284  4.025907   6.41941   .65895 611 53  979  3.2686415  .931001  2.558923  2.907755  5.719384  7.358998  .638801 623 53 1011  -4.846686  .928089  2.277566  3.074312  3.491358  7.846694  .542874 635 53  962  12.162162 1.024344  2.610714  2.601571  4.555991  6.742506  .557213 647 53 1079  37.905468 1.005138  1.831082  1.712005  4.579382  4.403466  .486218 659 53 1488   83.80376 1.080233  3.131021  2.098067  4.827745  5.542222  .533207 671 53 2735 -17.440584 1.073598  6.330419  2.628702   4.28304  7.091549  .391409 end format %tmNN/CCYY date label values n_ticker n_ticker label def n_ticker 3 "1301", modify label def n_ticker 4 "1333", modify label def n_ticker 5 "1377", modify label def n_ticker 6 "1379", modify label def n_ticker 44 "2001", modify label def n_ticker 45 "2002", modify label def n_ticker 46 "2108", modify label def n_ticker 47 "2109", modify label def n_ticker 52 "2206", modify label def n_ticker 53 "2212", modify  
                      
                    
                    
                    local dv return
                    preserve
                    drop `dv' return close n_ticker date
                    local iv
                    foreach var of varlist *{
                            local iv `iv' `var'
                    }
                    restore
                    
                    local idf
                    local r2_b=0
                    
                    foreach var of local iv {
                            xtreg `dv' `var'
                            if( e(r2_b) > `r2'){
                                local r2=e(r2_b)
                                local idfn `var'
                                local idf "xtreg `dv' `var'"
                            }
                        }
                    Can you help please?
                    Last edited by Giles German; 04 Feb 2018, 12:00.

                    Comment


                    • #11
                      It is likely that your error message is arising from the -if( e(r2_b) > `r2') {- statement because local macro r2 has not been defined. Perhaps you men `r2_b' instead of `r2' here? Or maybe you are comparing to something else. Or maybe when you wrote -local r2_b=0- you meant to write -local r2 = 0-?

                      Also, the whole rigmarole between preserve and restore is unnecessary. All it does is generate a list of all of the variables in the data set other than return, close, n_ticker, and date. -preserve- and -restore- are time consuming operations because they thrash to the disk. So they should be used only when necessary. You can easily create this list with just two lines of code, without disturbing the data in memory.

                      Here's my best guess at what you want:

                      Code:
                      local dv return
                      
                      ds return close n_ticker date, not
                      local iv `r(varlist)'
                      
                      local idf
                      local r2 0
                      
                      foreach var of local iv {
                              xtreg `dv' `var'
                              if( e(r2_b) > `r2'){
                                  local r2=e(r2_b)
                                  local idfn `var'
                                  local idf "xtreg `dv' `var'"
                              }
                      }
                      Changes in italics.

                      One aside. The -xtreg- command you are using is for random effects regression. From your earlier posts I have the impression that you are working in finance. And while I disagree with it myself in some situations, I am aware that in finance there is a strong preference for fixed effects regressions. Since, ultimately, your work will need to convince others in your field, not me, I'm just pointing this out in case you overlooked it.

                      Comment


                      • #12
                        Thanks Clyde I always appreciate your input. You are right I am looking at panel regression in finance. I have written the following piece of code based on your code to continue adding variables until I get the highest r2. I will look to do this for five degrees of freedom.


                        Two questions firstly Firstly how do I add in fixed effects regression into the code as per your suggestion and secondly I have a number of degrees of freedom whichI a working up how do I decide which one to make the prediction off if the results are as follows: -

                        one degree of freedom R2 = 0.37
                        two degrees of freedom R2 = 0.35
                        three degrees of freedom R2 = 0.14
                        four degrees of freedom R2 = 0.07
                        five degrees of freedom R2 = 0.04


                        Code:
                        use "/Users/gilesgerman/Documents/stokpix/databasemodels/model1/mergent/2018/Q1/MERGENT2018Q1S3.dta", replace
                        
                        
                        xtset n_ticker date, monthly
                        keep if sector == "1"
                        drop if close == .
                        
                        
                        local sn 1    
                        local dv return
                            
                        missings dropvars `vars', force
                        ds return `dv' reporttype mic reportdate compnumber reportid currency countrycode region country fye sector sectnum web date ticker shortname n_ticker exchange close, not
                        
                        local iv `r(varlist)'
                        
                        local idf
                        local r2 0
                        
                        foreach var of local iv {
                            xtreg `dv' `var'
                            if( e(r2_b) > `r2'){
                                local r2=e(r2_b)
                                local idfn `var'
                                local idf "xtreg `dv' `var'"
                            }
                        }
                        
                        ds return `dv' reporttype mic reportdate compnumber reportid currency countrycode region country fye sector sectnum web date ticker shortname n_ticker exchange close `idfn', not
                        
                        local iv `r(varlist)'
                        
                        local idf
                        local r2 0
                        
                        foreach var of local iv {
                            `idf' `var'
                            if( e(r2_b) > `r2'){
                                local r2=e(r2_b)
                                local iidfn `var'
                                local iidf "`idf' `var'"
                            }
                        }
                        Last edited by Giles German; 04 Feb 2018, 13:53.

                        Comment


                        • #13
                          Firstly how do I add in fixed effects regression into the code as per your suggestion
                          If you read -help xtreg-, you will see that there are -re-, -fe-, and -be- options, of which -fe- corresponds to fixed effects. You will also see that the default is -re-, which is random effects.

                          I have a number of degrees of freedom whichI a working up how do I decide which one to make the prediction off if the results are as follows: -

                          one degree of freedom R2 = 0.37
                          two degrees of freedom R2 = 0.35
                          three degrees of freedom R2 = 0.14
                          four degrees of freedom R2 = 0.07
                          five degrees of freedom R2 = 0.04
                          Well, this is a bit like asking Catholic priest for advice on the best form of Satanic workship. (No offense to Catholics or Satanists intended; I simply want to pose a stark contrast.) It is really against my statistical "religion" to do "automagical" variable selection in models. While selection based on R2 offends my sensibilities less than selection on p-values, it is still the case that when you do this, nothing means what it appears to mean, and the danger of overfitting the model looms large. Probably less offensive to me still would be the use of AIC or BIC (which you can calculate by running -estat ic- after -xtreg-). So if a gun were held to my head and I were forced to do something like this I would run all five of these models, following each with -estat ic- and I would settle on the one with the smallest value of AIC or BIC. (They usually agree on this; when they don't I don't see a principled reason to prefer one's advice over the other and I might flip a coin.)

                          But that's my opinion, and others would give you different advice. The lack of consensus, in my view, speaks to the severe drawbacks associated with any statistic as the basis for selecting variables to include in a model. I come down strongly on the side of developing a theoretical model from external, prior information and then including all of those variables. If some of those variables turn out to have negligible impact on the model (which should not be confused with lack of statistical significance), then one could consider dropping those and sticking with what remains. Also, when the data are large enough to permit it, some type of validation of the model after development is really obligatory. How to do that is a very long chapter in its own right, and I won't go there.

                          I'll just content myself by assuming that you have orders to develop a model through some kind of automatic variable selection process but have the latitude to pick one. I've said my peace now. End of rant.

                          Comment


                          • #14
                            Okay Clyde so lets assume that we go for the first option (no guns being held to anyone's head though) looking for the lowest -estat ic- is there anyway I can incorporate the analysis in the code like I have with the R2 so stata drop some out the answer. My effort as follows assuming I have collected the regression equations in the macro as described above: -


                            Code:
                            local result `vdf' `ivdf' `iiidf' `iidf' `idf'
                                
                                foreach x of local result{
                                    `x', fe
                                    estat ic
                                    if (r(5)< `res'){
                                        local res=r(5)
                                        local final "`x'"
                                    }
                            }
                                    `final', fe
                            Last edited by Giles German; 05 Feb 2018, 10:43.

                            Comment


                            • #15
                              I think we are talking about two different things here.

                              What you are asking about in #14 is how to select a next variable based on least value of the AIC or BIC criterion. That would actually be no different at all from just selecting the next variable based on the highest R2. No point going to the trouble. If you're just selecting the single next variable, the information criteria offer no improvement over R2.

                              What I was talking about in #13, and which I thought you were asking about in #12 (but maybe I misunderstood) was that you already had been through the variable selection process and had 5 different models: one with a single predictor, one with two, one with three, one with four, and one with five, and now you wanted to choose the "best" among those 5 models. That's where the AIC or BIC criterion could be helpful. With only 5 models, it would be sufficient to simply display the AIC and BIC results and then you could do the selection by eye. If you have a series of models that is appreciably longer than 5, so that selection by eye is not practical, then it would go something like this:

                              [/code]
                              // ASSUME EACH MODEL'S VARIABLES ARE CONTAINED
                              // IN LOCAL MACROS model1 model2 model3, etc.
                              // LET'S SAY THERE ARE 25 SUCH MODELS

                              local selected
                              scalar lowest_aic = c(maxdouble)

                              forvalues i = 1/25 {
                              quietly xtreg `model`i'', fe
                              quietly estat ic
                              matrix S = r(S)
                              scalar aic = S[1, 5]
                              if aic < `lowest_aic' {
                              local selected `i'
                              scalar lowest_aic = aic
                              }
                              }
                              display `"Selected model is: `model`selected''"'
                              display "AIC = " lowest_aic
                              [/code]
                              Note: This uses the AIC. If you prefer the BIC make the obvious changes, and use S[1, 6] instead of S[1,5].

                              At this point, I don't understand if this is what you want or not.

                              Comment

                              Working...
                              X