  • #31
    Mohsin:
    I agree with Steve's post about the satisfying agreement you've reached with your supervisor.
    But I mildly disagree with your last statement:
    "It's OLS regression and CINO is an independent variable there so that analysis is not going to suffer from overfit."
    Overfitting can happen with any regression method.
    Kind regards,
    Carlo
    (Stata 19.0)

    • #32
      Thank you for raising that issue. In my OLS model, I have 94 observations for 12 predictors, making the ratio of observations to predictors around 8. For the overall model, is that ratio not good? Or is the number of events of CINO still the overriding factor here?
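
      A minimal sketch of how that ratio can be read off the stored results after -regress-, assuming Stata's bundled auto dataset as a stand-in for the data in this thread (the variables below are not the ones discussed here). With the numbers quoted in this thread, 94/12 is about 7.8 for the sales growth model and 94/13 is about 7.2 for the Tobin's Q model.

      ```stata
      * Stand-in data: Stata's bundled auto dataset, not the thread's firm data.
      sysuse auto, clear

      * Any OLS fit; e(N) is the sample size, e(df_m) the number of predictors.
      regress price mpg weight length turn

      * Observations per predictor, a rough guard against overfitting.
      display "observations per predictor = " %4.1f e(N)/e(df_m)

      * Adjusted R-squared penalizes additional predictors, unlike plain R-squared.
      display "adjusted R-squared = " %6.4f e(r2_a)
      ```

      The ratio is only a rule of thumb; it does not by itself say whether a particular model overfits.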

      • #33
        Mohsin:
        to allow others to reply helpfully to your query, you should post what you typed and what Stata gave you back after -regress-. Thanks.
        Kind regards,
        Carlo
        (Stata 19.0)

        • #34
          My bad with that - please find below what I typed and the associated results.


          Dependent variable is average sales growth:
          Code:
           regress avsg asg_1 cino avten pcoo avtmt pdc avri avhhi poc0 avlemp avtd avcacq, vce(robust)
          
          Linear regression                                      Number of obs =      94
                                                                 F( 12,    81) =    3.15
                                                                 Prob > F      =  0.0010
                                                                 R-squared     =  0.2397
                                                                 Root MSE      =   .1148
          
          ------------------------------------------------------------------------------
                       |               Robust
                  avsg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                 asg_1 |   .0878071   .0751663     1.17   0.246    -.0617503    .2373644
                  cino |   .0447955   .0317535     1.41   0.162    -.0183839     .107975
                 avten |   .0038697   .0027134     1.43   0.158    -.0015291    .0092686
                  pcoo |  -.0292565    .042002    -0.70   0.488    -.1128272    .0543143
                 avtmt |  -.0021001   .0031878    -0.66   0.512    -.0084429    .0042426
                   pdc |  -.0230678   .0305177    -0.76   0.452    -.0837884    .0376529
                  avri |   .0189139   .0895013     0.21   0.833    -.1591656    .1969933
                 avhhi |   .0000202   .0000249     0.81   0.420    -.0000294    .0000698
                  poc0 |  -.0170332   .0307116    -0.55   0.581    -.0781398    .0440733
                avlemp |  -.0222431   .0109718    -2.03   0.046    -.0440736   -.0004127
                  avtd |  -.0257174   .0260674    -0.99   0.327    -.0775833    .0261485
                avcacq |  -.0163313   .0300955    -0.54   0.589    -.0762119    .0435493
                 _cons |   .0906293   .0431803     2.10   0.039     .0047141    .1765445
          ------------------------------------------------------------------------------
          Dependent variable is Tobin's Q (this model has 13 predictors in total):
          Code:
          regress avtob avsg atob_1 cino avten pcoo avtmt pdc avri avhhi poc0 avlemp avtd avcacq, vce(robust)
          
          Linear regression                                      Number of obs =      94
                                                                 F( 13,    80) =   15.98
                                                                 Prob > F      =  0.0000
                                                                 R-squared     =  0.6128
                                                                 Root MSE      =  .71467
          
          ------------------------------------------------------------------------------
                       |               Robust
                 avtob |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                  avsg |  -.2090368    .580505    -0.36   0.720    -1.364279     .946205
                atob_1 |   .6045243   .0741142     8.16   0.000     .4570323    .7520163
                  cino |   .1195749   .2884703     0.41   0.680    -.4544993    .6936491
                 avten |   .0133284   .0127871     1.04   0.300    -.0121187    .0387755
                  pcoo |  -.4202749   .3534286    -1.19   0.238     -1.12362    .2830706
                 avtmt |   .0486927   .0288923     1.69   0.096    -.0088047    .1061902
                   pdc |  -.2247176   .1662624    -1.35   0.180    -.5555904    .1061552
                  avri |    .680766   .6940742     0.98   0.330    -.7004857    2.062018
                 avhhi |   .0003392   .0002451     1.38   0.170    -.0001486     .000827
                  poc0 |   .0521226   .1706209     0.31   0.761    -.2874239    .3916691
                avlemp |  -.0044892   .0525321    -0.09   0.932    -.1090315    .1000531
                  avtd |    -.41923   .1935504    -2.17   0.033    -.8044075   -.0340525
                avcacq |  -.2125567    .153599    -1.38   0.170    -.5182285    .0931151
                 _cons |  -.2677623    .313738    -0.85   0.396    -.8921207    .3565962
          ------------------------------------------------------------------------------
          I used the robust option because, for the first model (average sales growth), I get the following results:

          Code:
          estat hettest
          
          Breusch-Pagan / Cook-Weisberg test for heteroskedasticity 
                   Ho: Constant variance
                   Variables: fitted values of avsg
          
                   chi2(1)      =    17.13
                   Prob > chi2  =   0.0000
          For the second model, however, I am not sure whether I should be using the robust option:

          Code:
          estat hettest
          
          Breusch-Pagan / Cook-Weisberg test for heteroskedasticity 
                   Ho: Constant variance
                   Variables: fitted values of avtob
          
                   chi2(1)      =     4.30
                   Prob > chi2  =   0.0381
          Best,
          Mohsin
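
          The workflow described in this post can be sketched as follows, assuming Stata's bundled auto dataset as a stand-in (the variables are illustrative, not those of this thread). The test is run after a fit with the default VCE, and the model is then refit with robust standard errors.

          ```stata
          * Stand-in data: Stata's bundled auto dataset, not the thread's firm data.
          sysuse auto, clear

          * 1. Fit with the default VCE so the heteroskedasticity test can be run.
          regress price mpg weight length

          * 2. Breusch-Pagan / Cook-Weisberg test; H0 is constant variance.
          estat hettest
          display "p-value = " %6.4f r(p)

          * 3. Refit with heteroskedasticity-robust standard errors.
          regress price mpg weight length, vce(robust)
          ```

          The coefficients are identical in both fits; only the standard errors, t statistics, and confidence intervals change.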

          • #35
            Mohsin:
            at face value, the second model should be preferred (and the heteroskedasticity should be dealt with via -vce(robust)-).
            However, as only one predictor seems to boost its R2, I would investigate whether quasi-extreme multicollinearity is an issue via -estat vif- and -estat vce, corr-.
            Kind regards,
            Carlo
            (Stata 19.0)

            • #36
              Hi Carlo,

              sorry for the late reply. I was out of town due to some family emergency.

              Actually, in the second model the dependent variable is changed, which is why the R2 jumps so much. An additional variable is also added to the model, but if it is removed the R2 is still in the .60+ range. If I understand it correctly, the variables explain a higher portion of the variation in the new dependent variable (Tobin's Q) than they do for sales growth. I also ran the following tests:

              Code:
              estat vce, corr
              
              Correlation matrix of coefficients of regress model
              
                      e(V) |     avsg    atob_1      cino     avten      pcoo     avtmt       pdc      avri     avhhi 
              -------------+------------------------------------------------------------------------------------------
                      avsg |   1.0000                                                                                 
                    atob_1 |  -0.1808    1.0000                                                                       
                      cino |  -0.3142    0.0082    1.0000                                                             
                     avten |  -0.2696   -0.0988    0.2901    1.0000                                                   
                      pcoo |   0.5083   -0.0271   -0.3899   -0.4588    1.0000                                         
                     avtmt |   0.2675    0.0741   -0.4335   -0.1946    0.4896    1.0000                               
                       pdc |   0.3647   -0.1685   -0.2300   -0.3149    0.2487    0.2249    1.0000                     
                      avri |   0.0912   -0.2376   -0.0700   -0.2367   -0.0621   -0.1655    0.3104    1.0000           
                     avhhi |  -0.1886   -0.0279    0.2398    0.2517   -0.2237   -0.2139   -0.0839   -0.3124    1.0000 
                      poc0 |   0.0737    0.2082   -0.1253   -0.0804    0.3241    0.4273   -0.0516   -0.2206   -0.1353 
                    avlemp |   0.1416    0.0086    0.2054    0.2475   -0.0646   -0.2772   -0.2839   -0.1130   -0.3009 
                      avtd |  -0.0338    0.1698    0.1792    0.0238   -0.0979   -0.3080    0.1145    0.4982    0.1099 
                    avcacq |   0.0265   -0.0644    0.1416    0.0894   -0.0685   -0.2735    0.1068    0.3277    0.2464 
                     _cons |  -0.3029   -0.1226    0.1331   -0.2321   -0.3871   -0.6991   -0.2900    0.3199   -0.1972 
              
                      e(V) |     poc0    avlemp      avtd    avcacq     _cons 
              -------------+--------------------------------------------------
                      poc0 |   1.0000                                         
                    avlemp |   0.0053    1.0000                               
                      avtd |  -0.3333   -0.2235    1.0000                     
                    avcacq |  -0.2052   -0.0922    0.4547    1.0000           
                     _cons |  -0.3770    0.0559    0.0407   -0.0318    1.0000
              This shows a correlation greater than .5 between the estimated coefficients of pcoo and avsg. However, the variance inflation factor for pcoo is 1.1, so multicollinearity should not be a problem:

              Code:
              estat vif
              
                  Variable |       VIF       1/VIF  
              -------------+----------------------
                    avlemp |      1.90    0.525198
                     avtmt |      1.69    0.591877
                       pdc |      1.55    0.646867
                     avten |      1.51    0.661710
                      avtd |      1.50    0.668493
                      avri |      1.44    0.694126
                     avhhi |      1.38    0.726808
                      avsg |      1.36    0.735523
                    atob_1 |      1.28    0.779924
                      cino |      1.24    0.807167
                    avcacq |      1.19    0.841955
                      pcoo |      1.10    0.909646
                      poc0 |      1.09    0.917384
              -------------+----------------------
                  Mean VIF |      1.40
              I hope I am correct with my interpretation of the results of correlation and VIF.

              Best,
              Mohsin
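
              For reference, the two columns of the -estat vif- table are linked by the standard definition of the variance inflation factor, where R_j^2 (notation introduced here) is the R-squared from regressing predictor j on all the other predictors. A worked check with the pcoo and avlemp rows reported above:

              ```latex
              \mathrm{VIF}_j = \frac{1}{1 - R_j^2},
              \qquad
              \frac{1}{\mathrm{VIF}_j} = 1 - R_j^2

              % pcoo: tolerance 0.909646
              R_{pcoo}^2 = 1 - 0.909646 \approx 0.090,
              \qquad
              \mathrm{VIF}_{pcoo} = \frac{1}{0.909646} \approx 1.10

              % avlemp: tolerance 0.525198
              R_{avlemp}^2 = 1 - 0.525198 \approx 0.475,
              \qquad
              \mathrm{VIF}_{avlemp} = \frac{1}{0.525198} \approx 1.90
              ```

              So even the predictor with the largest VIF has less than half of its variance explained by the other predictors.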

              • #37
                Mohsin:
                first of all I do hope that your family emergency was not that severe and it is now solved.
                I agree with your take: multicollinearity is not a problem with your model.
                Hence, I would go ahead with the second model of your post #34 (did you take the heteroskedasticity into account via -vce(cluster)-?).
                Kind regards,
                Carlo
                (Stata 19.0)

                • #38
                  Carlo, thank you for your concern. Everything's good now.

                  As for heteroskedasticity, I thought vce(robust) accounts for that. Is that not the case?

                  I tried using vce(cluster) but it gives me an error:

                  Code:
                  regress avtob avsg atob_1 cino avten pcoo avtmt pdc avri avhhi poc0 avlemp avtd avcacq, vce(cluster)
                  invalid vce(cluster) option
                  r(198);

                  • #39
                    Mohsin:
                    if you don't have repeated observations on the same unit, vce(robust) is OK for taking heteroskedasticity into account. Sorry for my previous misguidance.
                    Kind regards,
                    Carlo
                    (Stata 19.0)
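
                    A minimal sketch of the distinction, assuming the nlswork example panel from the Stata website (needs internet access) and the bundled auto dataset as stand-ins; neither is the data from this thread. It also shows why the bare vce(cluster) in #38 was rejected: the option needs the name of the variable that identifies the clusters.

                    ```stata
                    * Stand-in panel data: repeated yearly observations on the same women.
                    webuse nlswork, clear

                    * Repeated observations per unit: cluster on the unit identifier (idcode).
                    * vce(cluster) with no variable name is a syntax error, r(198).
                    regress ln_wage age tenure, vce(cluster idcode)

                    * Stand-in cross-section: one observation per unit, so vce(robust) suffices.
                    sysuse auto, clear
                    regress price mpg weight, vce(robust)
                    ```

                    Both options are robust to heteroskedasticity; clustering additionally allows the errors to be correlated within each unit.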

                    • #40
                      Exactly - I had the same feeling because, from my limited understanding, I thought vce(cluster) was for longitudinal data. But I didn't dare say so because you're the expert here.

                      P.S.: It shows that I am getting a grip on Stata and should have faith in my understanding.

                      Thank you so much Carlo. Statalist is full of wonderful people!

                      Best,
                      Mohsin

                      • #41
                        Mohsin:
                        Nick Cox once wrote (in an English much better than mine) something along the lines of: "We are all beginners. Some of us are only more experienced". I do hope I can feel this way forever and learn from the other Listers.
                        Kind regards,
                        Carlo
                        (Stata 19.0)

                        • #42
                          True that!
