  • #16
    Dear Professor Wooldridge,

    Thanks so much for your valuable input. I really appreciate it. If you don't mind, I have a follow-up question. The x variables are variables I developed to measure different aspects of human capital and individual characteristics of job applicants, in order to predict work outcomes pre-hire. They are not the conventional variables such as experience, education, or other survey constructs we usually use in these situations. They are all probability values and extremely right-skewed, with about 75% of the values of each x variable equal to 0. I converted them to factor variables using the p75 cutoff and reran the regression, but the results are still the same. The results also remain the same with vce(robust).
    My problem is that it is the control variables, not the actual predictors, that drive my R-squared. When the predictors alone are in the regression, the R-squared is 0.0165. My concern is that the x variables, although significant, do not explain much of the variance in performance, which probably makes the paper less interesting for publication. Is there anything else I can do about it? I know you and Carlo have kindly suggested some solutions, but the x variables still don't explain much of the variance.

    On a different note, I should say I LOVE your extremely helpful econometrics book. It has helped me a lot in my econometrics class and beyond.


    • #17
      Hi Monica:

      It's not really too surprising when certain key variables do not have strong explanatory power. It happens all the time in policy analysis. For example, if an intervention, such as participation in a job training program, is randomized, regressing some post-training outcome -- say, earnings -- on the job training indicator will produce a very small R-squared. But if you include past labor market and family background variables, you can typically do much better. You should usually include those other factors because doing so typically reduces the standard error. The point is that the small explanatory power of the key variables does not invalidate the estimates of the effect of these variables.
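
      A minimal simulation sketch of this point. Everything in it is a made-up assumption for illustration (the variable names train, pastearn, and earn, the sample size, and the true effect of 0.2); none of it comes from real data.

      ```stata
      * Hypothetical simulation: randomized treatment plus one strong control.
      * All names and parameter values are illustrative assumptions.
      clear
      set seed 12345
      set obs 5000

      * Randomized program indicator, independent of everything else
      generate byte train = runiform() < 0.5
      * Stand-in for past labor market and background controls
      generate double pastearn = rnormal()
      * Outcome: true treatment effect is 0.2 by construction
      generate double earn = 0.2*train + pastearn + rnormal()

      * Treatment indicator alone: expect a very small R-squared
      regress earn train, vce(robust)

      * Add the control: R-squared rises and the standard error on train
      * should shrink, while its coefficient stays near 0.2
      regress earn train pastearn, vce(robust)
      ```

      Because train is randomized, both regressions estimate the same effect consistently. The control only soaks up residual variance, which is why it raises the R-squared and tightens the standard error without invalidating the short regression.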

      Below is output that should be very similar to results in a well-known paper by Meyer, Viscusi, and Durbin (1995, American Economic Review). It's a difference-in-differences setup. You can see that the three dummy variables added to obtain the DD estimate -- the coefficient on the variable "afhigh" -- explain very little variation in the log of duration on workers' compensation. Adding a bunch of controls changes the estimate only modestly, but, more importantly, the R-squared goes from about 2% to about 19%. I believe this is similar to your application. There's nothing wrong. Sometimes the variables of main interest don't explain much variation in y, and that's just the way it is.

      This data set comes with my introductory book and is called INJURY.RAW. Thanks for the kind comment about my book!

      Code:
      . reg ldurat afchnge highearn afhigh if ky, robust
      
      Linear regression                               Number of obs     =      5,626
                                                      F(3, 5622)        =      38.97
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.0207
                                                      Root MSE          =     1.2692
      
      ------------------------------------------------------------------------------
                   |               Robust
            ldurat |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           afchnge |   .0076573   .0440344     0.17   0.862     -.078667    .0939817
          highearn |   .2564785   .0473887     5.41   0.000     .1635785    .3493786
            afhigh |   .1906012    .068982     2.76   0.006     .0553699    .3258325
             _cons |   1.125615   .0296226    38.00   0.000     1.067544    1.183687
      ------------------------------------------------------------------------------
      
      . reg ldurat afchnge highearn afhigh male married hosp age lprewage i.indust i.injtype if ky, robust
      
      Linear regression                               Number of obs     =      5,347
                                                      F(17, 5329)       =      70.05
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.1899
                                                      Root MSE          =     1.1496
      
      ------------------------------------------------------------------------------
                   |               Robust
            ldurat |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           afchnge |   .0495395   .0414964     1.19   0.233    -.0318104    .1308894
          highearn |  -.1517808   .0870894    -1.74   0.081    -.3225116    .0189501
            afhigh |   .1687213   .0639065     2.64   0.008     .0434384    .2940043
              male |  -.0842888   .0422399    -2.00   0.046    -.1670962   -.0014813
           married |   .0566623   .0364226     1.56   0.120    -.0147408    .1280655
              hosp |   1.130493   .0377218    29.97   0.000     1.056543    1.204443
               age |    .006507   .0013367     4.87   0.000     .0038864    .0091275
          lprewage |   .2844806   .0794226     3.58   0.000     .1287799    .4401814
                   |
            indust |
                2  |   .1838642   .0524272     3.51   0.000     .0810855    .2866429
                3  |   .1634853   .0376996     4.34   0.000     .0895787     .237392
                   |
           injtype |
                2  |   .9354675    .150684     6.21   0.000     .6400653     1.23087
                3  |   .6354659   .0904355     7.03   0.000     .4581753    .8127566
                4  |   .5545499   .0970524     5.71   0.000     .3642875    .7448123
                5  |   .6412013   .0923297     6.94   0.000     .4601972    .8222054
                6  |   .6150411   .0911014     6.75   0.000     .4364452    .7936371
                7  |   .9913358   .2330889     4.25   0.000     .5343861    1.448285
                8  |   .4340821    .130214     3.33   0.001     .1788094    .6893547
                   |
             _cons |  -1.528202   .4156698    -3.68   0.000    -2.343085   -.7133189
      ------------------------------------------------------------------------------
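
      For anyone replicating the first regression above, here is a hedged sketch of the same DD specification in factor-variable notation. It assumes the INJURY data are already in memory and that afhigh is the product of afchnge and highearn. The output does not show how afhigh was constructed, so the assert line checks that assumption and stops with an error if it fails.

      ```stata
      * Assumption: INJURY data are loaded and afhigh = afchnge*highearn.
      * assert halts here if the assumption does not hold in the ky sample.
      assert afhigh == afchnge*highearn if ky

      * Same DD regression using factor-variable notation. The coefficient on
      * 1.afchnge#1.highearn should match the coefficient on afhigh.
      regress ldurat i.afchnge##i.highearn if ky, vce(robust)
      ```

      Writing the interaction this way lets Stata build the product term itself, so the interaction variable does not need to be created by hand.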


      • #18
        Thank you so much for your reassuring and extremely helpful explanation. It makes total sense. I really appreciate your help.

        Best Regards,
        Monica
