
  • Regression

    Hi guys,

    I am not too experienced with statistics, but am conducting some quantitative analysis for my undergrad Psychology dissertation and would like a bit of help please.

    I ran a linear regression to see if social support level (a binary variable - either low or high) could predict Total Difficulties Score (a continuous variable). But, when running my assumption tests, the assumption of homoscedasticity was violated. Therefore, I did some research and found that one way of overcoming this problem is by log transforming the dependent variable. So, I log transformed Total Difficulties Score, creating a new variable called log_totaldifficultiesscore. I then reran the regression and the assumption was no longer violated, so the problem was overcome. BUT, I am now unsure as to HOW to interpret the coefficients, as it is no longer RAW scores being discussed in the regression but LOG TRANSFORMED SCORES. So, what would a coefficient of -.370 actually mean? Or, how can I 'un-log transform' the coefficients??

    I hope my explanation makes sense and someone can help, I have been struggling with this for a few days now!

    Thank you

  • #2
    Alex:
    1) a simple OLS is hardly enough to give a fair and true view of the data generating process you're investigating. In all likelihood, your predictor is statistically significant (that simply is a matter of fact) but it also includes the effect of other independent variables that you, as per your description, omitted to plug in the right-hand side of your regression equation;
    2) if you detected heteroskedasticity only (but I guess that the main issue with your model rests on its misspecification; see -linktest-), you can simply invoke -robust- standard errors without logging the regressand (unless your implicit goal/angle is a log-linear regression);
    3) as per FAQ, please post what you typed and what Stata gave you back and share an excerpt/example of your dataset via -dataex-. Thanks.
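    Points 1) and 2) above can be sketched as follows. This is a minimal illustration only: Stata's bundled auto dataset and its variables (price, foreign, mpg) are stand-ins assumed here, not the poster's data.

    ```stata
    * Illustration only: the shipped auto data stand in for the real dataset
    sysuse auto, clear

    * Keep the regressand in raw units and request robust standard errors
    regress price i.foreign mpg, vce(robust)

    * Specification check: a significant _hatsq hints at misspecification
    linktest
    ```

    In the -linktest- output, _hat is expected to be significant; it is the coefficient on _hatsq that flags a possible problem with the functional form or omitted predictors.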
    Last edited by Carlo Lazzaro; 16 Mar 2023, 04:13.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hello,
      Thank you for your quick reply - that is very helpful!
      Unfortunately I am very limited as to what information I can share from my STATA, as all analysis occurs within a secure hub as it is part of a national UK dataset. It is therefore not possible to copy and paste or screenshot from STATA, which makes things challenging!

      I have tried using the robust command:
      regress sebdtot i.suppsc2 i.ChldSx i.ltillness i.Bullyexp i.tenure, vce(robust)

      When I perform this regression, then run estat imtest, heteroskedasticity is p=0.0009.

      So, adding the robust command has not overcome the issue.
      Have I done this correctly?
      Any more guidance would be incredibly helpful!
      Thanks so much
      Alex


      • #4
        Alex:
        1) whenever you deal with a confidential dataset, you could tackle the issue notwithstanding by changing the names of the variables (Spiderman; Dare Devil; Alfa; Beta and so on and so forth) and following my previous (full of typos) suggestion 3);
        2) this is a common cobblestone I stumbled upon many times years ago (and Statalist put me on the right track then): the -robust- option affects the standard errors, not the residuals; that's why -estat imtest- (or -estat hettest-) will complain about heteroskedasticity ever after. The -robust- option did fix the issue; it is the test that should not be repeated;
        3) unsolicited advice: check via -linktest- the correctness of the functional form of the regressand (that, if misspecified, bites way harder than heteroskedasticity).
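        The renaming idea in 1) might look like the sketch below. The auto dataset and the placeholder names alfa and beta are assumptions for illustration; whether any excerpt may leave a secure environment depends on its own rules.

        ```stata
        * Illustration only: the shipped auto data stand in for a confidential file
        sysuse auto, clear

        * Mask the real variable names with neutral placeholders
        rename (price foreign) (alfa beta)

        * Produce a small, copy-and-paste-ready data example of the masked variables
        dataex alfa beta in 1/10
        ```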
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Again, thank you so much for your help!
          Just to clarify and make sure I have got my head around it.
          The robust command has resulted in a change to the standard errors and NOT my residuals. Because the residuals have not changed, the data is still heteroskedastic. However, the change in standard errors as a result of the robust command means the linear regression can be appropriately run, despite the heteroscedasticity.
          Please let me know if that is correct. I am very much a beginner with stats and also very new to STATA so struggling to get my head around things.
          Thanks again
          Alex


          • #6
            The vce(robust) option (not a command) does not change the coefficient estimates, but only the reported standard errors. So fitted and residuals are unchanged, as you say. That is the intention.

            I would judge heteroscedasticity here graphically, not by a test, say by looking at rvfplot after a regression.
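
            A minimal sketch of both points, assuming Stata's bundled auto dataset as a stand-in (the variables price, foreign and mpg are not the poster's): the two fits share the same coefficients and differ only in the standard errors, and the plot gives the graphical check.

            ```stata
            * Illustration only: the shipped auto data stand in for the real dataset
            sysuse auto, clear

            * Conventional standard errors
            regress price i.foreign mpg
            * Residual-versus-fitted plot: look for a fan or funnel shape
            rvfplot, yline(0)
            estimates store ols

            * Same model with robust standard errors
            regress price i.foreign mpg, vce(robust)
            estimates store rob

            * Side by side: identical coefficients, different standard errors
            estimates table ols rob, b(%9.3f) se(%9.3f)
            ```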
            Last edited by Nick Cox; 16 Mar 2023, 07:28.


            • #7
              Hello Nick, thank you for your reply!
              That makes a lot of sense.
              So, with reference to my initial post, I will not bother creating a log-variable for Total Difficulties Score. Instead, I shall run the regression with Total Difficulties Score and add the vce(robust) option to account for the heteroscedasticity - do you think this sounds like a good idea?

              Also, if you don't mind me asking just out of curiosity, what is the difference between an option and a command in STATA?
              Thanks so much


              • #8
                Sorry, no; I can't advise you without seeing the data whether transforming or not transforming or indeed using a model with log link is best for your project. Considerations include whether the score is bounded and so predictions out of range are a risk.
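
                On the question in #1 of how to read a coefficient when the outcome is logged, a worked sketch may help. It assumes the transformation was the natural log and that -.370 is the coefficient on the binary predictor $x$ (0 = one support level, 1 = the other); neither detail is stated in the thread.

                ```latex
                \ln(y) = \beta_0 + \beta_1 x + \varepsilon
                \quad\Longrightarrow\quad
                \frac{\operatorname{GM}(y \mid x = 1)}{\operatorname{GM}(y \mid x = 0)} = e^{\beta_1}

                e^{-0.370} \approx 0.691,
                \qquad
                100\,\bigl(e^{-0.370} - 1\bigr) \approx -30.9\%
                ```

                Under those assumptions, the group coded 1 has a geometric mean score about 31% lower than the group coded 0, other predictors held fixed. Exponentiating the coefficient gives a ratio, not a difference in raw score points.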

                The curiosity is expected and welcome, but you can find out yourself by reading e.g.

                Code:
                help language
                and please look at https://www.statalist.org/forums/help#spelling
