  • Bootstrap vs. Robust standard errors

    Hi guys,

    I have run a linear regression, and my model clearly suffers from heteroskedasticity (see the table and test).
    Code:
    reg HRQoL i.edu_cat i.smoking_behaviour blood_pressure i.bmi_categories cancer smoking_cancer COPD smoking_COPD diabetes blood_diabetes muskulo age gender age_gender marital_status age2
    estat hettest, rhs iid
    The usual remedy when heteroskedasticity occurs is to add -, robust- at the end of the command, so that the standard errors account for the heteroskedasticity. I was wondering which option is best when the model suffers from heteroskedasticity:
    1. Add -robust- to the model and continue with this corrected model and its robust standard errors.
    2. Bootstrap the regression (10,000 replications) and use the model with the bootstrapped standard errors.
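
    The two options above can be sketched side by side. This is only an illustration: it uses the auto dataset shipped with Stata as a stand-in for the HRQoL data, and the reps() and seed() values are arbitrary choices.

    ```stata
    * Stand-in data (not the poster's data): auto dataset shipped with Stata
    sysuse auto, clear

    * Option 1: heteroskedasticity-robust (Huber/White) standard errors
    regress price mpg weight i.foreign, vce(robust)
    scalar b_robust  = _b[mpg]
    scalar se_robust = _se[mpg]

    * Option 2: nonparametric bootstrap standard errors
    * (500 replications and the seed are arbitrary, for illustration only)
    regress price mpg weight i.foreign, vce(bootstrap, reps(500) seed(12345))
    scalar b_boot  = _b[mpg]
    scalar se_boot = _se[mpg]

    * The coefficients are identical; only the standard errors differ
    display "b(mpg): " b_robust "  robust SE: " se_robust "  bootstrap SE: " se_boot
    ```

    Both options leave the point estimates untouched, so the choice only affects inference.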

    Searching the internet, I did not find a satisfying answer on which to choose. Maybe you could help me decide how to handle the heteroskedasticity?

    Thanks a lot!

    Florian
    [Attached image: standard errors.PNG]

  • #2
    Florian.
    I would go -robust-.
    Besides:
    - for interactions and squared terms you should exploit the capabilities of -fvvarlist-, which in turn allows you to use wonderful postestimation commands such as -margins-;
    - have you searched for omitted variable bias via -estat ovtest-?
    - have you investigated quasi-multicollinearity via -estat vif-?
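
    A minimal sketch of these three suggestions, assuming the auto dataset shipped with Stata as stand-in data (the variables are not those of the original model):

    ```stata
    * Stand-in data (not the poster's data)
    sysuse auto, clear

    * Factor-variable notation: ## creates the squared term and the
    * interaction inside the command instead of hand-made variables
    regress price c.weight##c.weight i.foreign##c.mpg

    * Ramsey RESET test for omitted nonlinearities / misspecification
    estat ovtest
    scalar p_reset = r(p)

    * Variance inflation factors
    estat vif

    * Refit with robust standard errors, then let -margins- compute the
    * average marginal effect of weight, squared term included
    regress price c.weight##c.weight i.foreign##c.mpg, vce(robust)
    margins, dydx(weight)
    ```

    Because the squared term is declared as c.weight##c.weight, -margins- knows that both terms move together when weight changes; this is lost with a separately generated variable such as age2.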
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Dear Florian,

      You can indeed use -robust- to get valid standard errors; using bootstrap standard errors does the same thing but takes more time. However, the very high level of heteroskedasticity suggests you can do better. One approach would be to use WLS; in his book, Jeff Wooldridge suggests a simple way to do it. Another thing to consider is whether linear regression is the right approach to use here; how is your dependent variable defined?
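
      One simple WLS recipe found in Wooldridge's introductory textbook is feasible GLS with an exponential variance function; whether this is the exact procedure meant here is an assumption. A rough sketch, using the auto dataset shipped with Stata as stand-in data and hypothetical variable names (uhat, log_u2, ghat, hhat):

      ```stata
      * Stand-in data (not the poster's data)
      sysuse auto, clear

      * Step 1: OLS, keep the residuals
      regress price mpg weight i.foreign
      predict double uhat, residuals

      * Step 2: regress log squared residuals on the same regressors
      generate double log_u2 = ln(uhat^2)
      regress log_u2 mpg weight i.foreign
      predict double ghat, xb

      * Step 3: estimated variance function h = exp(g)
      generate double hhat = exp(ghat)

      * Step 4: WLS with weights 1/h; vce(robust) guards against a
      * misspecified variance function
      regress price mpg weight i.foreign [aweight = 1/hhat], vce(robust)
      ```

      If the variance model is roughly right, WLS is more efficient than OLS; keeping the robust standard errors means inference remains valid even if it is not.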

      Best wishes,

      Joao

      • #4
        Dear Florian,

        I also agree with Carlo and Joao. You can indeed use -robust- under the assumption that your data are independent but not identically distributed (heteroskedasticity). If you relax the assumption of independence of your data, then bootstrap standard errors might be more suitable.
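
        A sketch of what relaxing independence could look like, assuming the dependence sits within a known grouping variable. Note that resampling individual rows still treats observations as independent; it is resampling whole clusters that accommodates within-group dependence. The nlswork example data and the idcode identifier are stand-ins, and the reps() and seed() values are arbitrary.

        ```stata
        * Stand-in panel data from Stata's website (needs internet access)
        webuse nlswork, clear

        * Cluster-robust SEs: arbitrary correlation within each idcode
        regress ln_wage age tenure, vce(cluster idcode)
        scalar se_cluster = _se[tenure]

        * Cluster (block) bootstrap: whole persons are resampled, not rows
        bootstrap, reps(200) seed(12345) cluster(idcode): regress ln_wage age tenure
        scalar se_clboot = _se[tenure]

        display "cluster-robust SE: " se_cluster "  cluster bootstrap SE: " se_clboot
        ```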

        Best!

        • #5
          Hello all,

          Thank you very much for your responses!
          @ Carlo: I did test for omitted variable bias, and the model suffers from omitted variables (P>F = 0.000). As for multicollinearity, the interaction terms are correlated with the explanatory variables, which is understandable; the remaining VIFs are not problematic (below 5).

          @ Joao: The research focuses on whether the effect of education on health-related quality of life can be explained by other variables. The dependent variable is a scale of HRQoL states based on the SF-6D algorithm of Brazier & Roberts (2004). I'm interested in how the addition of other explanatory variables influences the effect of education on HRQoL. I think linear regression is the most suitable approach for this?

          While I am asking questions, let me be so forward as to ask another one:
          I'm doing this research for my master's thesis in health economics. My supervisor insists that I use the lasso as a systematic approach to adding variables to the model. From the reading I have done, almost everybody seems unhappy with the lasso approach, or at least does not encourage it.
          But in the worst-case scenario, let's say I do a LASSO using -lars- (which can be found with -findit lars-), and this is the result (when allowing for dummies by adding xi: in front of the lars command). How should I interpret this? Exclude blood_diabetes, for the model with the lowest Cp?
          [Attached image: Lars 1.PNG]

          When I do NOT allow for dummies (normally -lars- allows neither factor variables nor time-series operators), the results are the following, and I should exclude no variable at all?
          [Attached image: lars 2.PNG]
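
          For comparison, a lasso fit can also be sketched with the official -lasso- command. This assumes Stata 16 or newer and is a different command from the community-contributed -lars- used above, so the output will not match the screenshots. The auto dataset shipped with Stata is a stand-in, and the rseed() value is arbitrary.

          ```stata
          * Stand-in data (not the poster's data)
          sysuse auto, clear

          * Lasso for a continuous outcome; the penalty is chosen by
          * cross-validation, and rseed() makes the folds reproducible
          lasso linear price mpg weight length turn displacement gear_ratio i.foreign, selection(cv) rseed(12345)

          * Variables kept at the selected penalty
          lassocoef

          * Penalty path: which variables enter or leave at each knot
          lassoknots
          ```

          Factor variables are accepted directly here, so no xi: prefix is needed.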


          Thank you very much for the help!! Much appreciated.


          References:
          Brazier, J.E. & Roberts, J. 2004, "The estimation of a preference-based measure of health from the SF-12", Medical care, vol. 42, no. 9, pp. 851-859.

          • #6
            From what I have seen in Wikipedia, your dependent variable is bounded between 0 and 100, possibly with many values at the boundaries. If that is right, a linear model will be far from ideal.
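
            One commonly used alternative for a bounded outcome is a fractional response model. A minimal sketch, assuming Stata 14 or newer (for -fracreg-) and an outcome rescaled to the unit interval; the auto dataset and the hypothetical variable y01 are stand-ins, and a score on a 0 to 100 scale would simply be divided by 100.

            ```stata
            * Stand-in data (not the poster's data)
            sysuse auto, clear

            * Hypothetical bounded outcome: rescale mpg to [0,1]
            summarize mpg, meanonly
            generate double y01 = (mpg - r(min)) / (r(max) - r(min))

            * Fractional logit: models the conditional mean so that fitted
            * values stay inside [0,1]; robust SEs are the default
            fracreg logit y01 weight i.foreign

            * Average marginal effects on the scale of the outcome
            margins, dydx(*)
            ```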

            Joao

            • #7
              Florian:
              if your dependent variable is the EuroQol EQ VAS score (http://www.euroqol.org/), you can find some publications on related regression models via Google.
              I cannot comment on -lasso-, but multicollinearity issues will make it difficult to pin down the contribution of each predictor in explaining the variation of your depvar.
              I would think about a more parsimonious model.
              Kind regards,
              Carlo
              (Stata 19.0)

              • #8
                Originally posted by Joao Santos Silva
                Dear Florian,

                You can indeed use -robust- to get valid standard errors; using bootstrap standard errors does the same thing but takes more time. However, the very high level of heteroskedasticity suggests you can do better. One approach would be to use WLS; in his book, Jeff Wooldridge suggests a simple way to do it. Another thing to consider is whether linear regression is the right approach to use here; how is your dependent variable defined?

                Best wishes,

                Joao
                Dear Joao,
                Can I ask you some questions?
                What is the difference between robust and bootstrap standard errors?
                When I use the xtabond2 command, it does not allow bootstrapping. Can I use robust instead?
                Thanks

                • #9
                  Please check these concepts in a good textbook.

                  Best wishes,

                  Joao
