  • standardized coefficient with interactions

    Hey there Statalisters!
    I'm interested in quantifying the total effect of an independent variable on the dependent variable, inclusive of an interaction effect. The model I have looks like this (they're all continuous):

    xi: reg dependent income edu edu##inst i.country i.year

    More precisely, I'd like to say that a one standard deviation increase in the edu variable is associated with an increase in the dependent variable of x units, assuming some value for the variable inst (so adding the effect coming from edu and from edu##inst).

    I've read enough about interaction terms and standardizing variables on this and other sites that I no longer trust using ", beta". Is it OK to run the following regression to obtain the coefficients I need?

    xi: reg dependent stdincome stdedu stdedu##inst i.country i.year
    where:
    egen stdincome=std(income)
    egen stdedu =std(edu)

    I'm not standardizing the dependent or the interaction variable inst because I want to express the changes in the dependent variable in the original units & I want to assume certain values for inst so I don't see the need to standardize it.

    Thank you!
    Adela

  • #2
    You are, I think, basically on the right track. Let me suggest a few refinements.

    Since you are (partly) using the ## notation of Stata factor variables, you are using a modern version of Stata. So ditch the xi: prefixes. They do nothing for you except add confusing variables to your data set that serve no purpose.

    Attention to the centering of variables participating in interaction terms is always a good idea, and by centering them at the mean (which is implicit in standardization) you assure that the "main effects" are meaningful, i.e. conditional on the other variable being at its mean value.

    In
    xi: reg dependent stdincome stdedu stdedu##inst i.country i.year
    where:
    egen stdincome=std(income)
    egen stdedu =std(edu)
    I see a few problems. Apart from the unneeded xi: prefix, by using stdedu##inst, you are telling Stata to treat stdedu as a discrete variable taking on non-negative integer values. Even if edu is in fact a discrete variable, its standardized version is unlikely to have exclusively non-negative integer values. More likely, since you even consider standardizing it, edu is a continuous variable (or at least you want to treat it as one). So I think this should be:

    Code:
    regress dependent stdincome c.stdedu##inst i.country i.year
    Explanation: with the ## notation, you don't need to separately specify the main effect of either variable in the interaction--it comes automatically. The c. prefix on stdedu tells Stata that this is a continuous variable: in interactions, variables are assumed to be non-negative integer valued by default, so you need the c. to override that default.
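
    To make the expansion concrete, here is a minimal sketch. It assumes, for illustration only, that inst is a categorical variable coded with non-negative integers; if inst is continuous, it needs its own c. prefix. The two commands below should fit the same model, the second one simply spelling out what ## adds automatically.

    Code:
    * Assumption: inst is categorical with non-negative integer codes
    * Shorthand: ## adds both main effects and the interaction
    regress dependent stdincome c.stdedu##i.inst i.country i.year
    * Longhand: main effects and the # interaction written out
    regress dependent stdincome c.stdedu i.inst c.stdedu#i.inst i.country i.year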

    You are correct in not standardizing the interaction term--doing so would completely distort its meaning and give you results that are very difficult to work with.

    Also, "the total effect of an independent variable... inclusive of an interaction effect" is more or less a contradiction in terms. By interacting education (standardized or otherwise) with variable inst, you are explicitly stating that there is no single effect of education, but rather that there are different effects, each conditional on the value of inst. If what you mean is that ultimately you will do an omnibus test of the null hypothesis that the effect of education is zero at all levels of inst, then that can be done with a joint test of the main effect of stdedu and the interaction term(s) it participates in. But more typically, people are interested in the specific effects at each level of inst. While you can calculate those by hand using -lincom-, it is easier and less error-prone to get this information from the -margins- command.
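
    As a rough sketch of those options, assuming inst is continuous (as the original post states) and using 1, 2 and 3 purely as placeholder values of inst: the effect of a one standard deviation increase in edu at a given value of inst is the stdedu coefficient plus inst times the interaction coefficient, and -margins- reports that for each value requested.

    Code:
    * Assumptions: inst is continuous; 1, 2 and 3 are placeholder values
    regress dependent stdincome c.stdedu##c.inst i.country i.year

    * Effect of a one-SD increase in edu at each chosen value of inst
    margins, dydx(stdedu) at(inst = (1 2 3))

    * The same quantity computed by hand for inst = 2
    lincom _b[stdedu] + 2*_b[c.stdedu#c.inst]

    * Joint test that the effect of stdedu is zero at every value of inst
    test stdedu c.stdedu#c.inst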



    • #3
      Thanks for the response, Clyde! I will fix the xi and the double mentioning of the edu variable. I actually did have edu and inst with the c. prefix but completely ignored them when I typed up the code. Does this final code look fine to you?
      regress dependent stdincome c.stdedu##c.inst i.country i.year
      Please confirm.



      Also, I lack confidence in my results mainly because I thought that the coefficients obtained from a regression where all the variables were standardized, say:

      regress stddependent stdincome stdedu

      would be equal to the standardized coefficients (the betas) obtained from

      regress dependent income edu, beta

      Please explain why they do not match. Thanks!



      • #4
        Code:
        regress dependent stdincome c.stdedu##c.inst i.country i.year
        Yes, this looks right, assuming that inst is also a continuous variable. If it's discrete, then it should be c.stdedu##i.inst.

        The lack of correspondence between the results of -regress stddependent stdincome stdedu- and -regress dependent income edu, beta- is likely due to differences in the sample over which standardization is carried out. In the -regress, beta- version, only observations for which none of the variables in the model have missing values are used in standardizing any of the variables. By contrast, in the -regress stddependent stdincome stdedu- version you have standardized dependent using all observations for which dependent is not missing, regardless of whether income and edu are missing. The same goes for the other variables there. This is one of the many reasons that standardized regressions are almost invariably confusing. That they continue to be widely used in some fields remains a mystery to me. I avoid them like the plague. If you don't have compelling reasons for standardizing your interacting variables, I would recommend not doing so, and just centering them at their means, probably leaving everything else the way it is.
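
        A minimal sketch of both points, with the caveat that the z_ and c_ variable names are made up for illustration. The first part restricts standardization to the estimation sample, which should bring the coefficients in line with the -regress, beta- output; the second part centers edu and inst at their estimation-sample means instead of standardizing, assuming both are continuous.

        Code:
        * Part 1: standardize over the estimation sample only
        regress dependent income edu, beta
        foreach v of varlist dependent income edu {
            * z_ names are hypothetical; e(sample) marks observations used above
            egen z_`v' = std(`v') if e(sample)
        }
        regress z_dependent z_income z_edu

        * Part 2: center (not standardize) the interacting variables
        regress dependent income c.edu##c.inst i.country i.year
        foreach v of varlist edu inst {
            summarize `v' if e(sample), meanonly
            * c_ names are hypothetical centered copies
            generate c_`v' = `v' - r(mean) if e(sample)
        }
        regress dependent income c.c_edu##c.c_inst i.country i.year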





        • #5
          Glad I asked! I will not standardize them. Thanks again!



          • #6
            Originally posted by Clyde Schechter (#4)
            Hello Clyde, I want to confirm with you: is the following specification, which interacts two standardized variables, problematic? And is your recommendation to standardize only one of the variables in the interaction term? Thanks!

            Code:
            regress dependent c.std_income##c.std_edu i.country i.year
            Please confirm.
