  • Mean difference per 1 SD increase

    Hello,

    I am trying to recreate results I found in an article I have been reading. The article says they ran a linear regression and adjusted for some variables. I know how to run the regress command, but the results they present are "Mean difference per 1-SD increase (95% CI)". How do I find those results in my regression output? I am a beginner with STATA as well as with statistics, so please let me know if my question does not make sense or needs clarification.

    Thanks in advance.

  • #2
    They presumably standardized their independent variable before running the regression, so it had a mean of 0 and a variance of 1. Then you can just read the CIs given by Stata as part of the normal output. People often frown upon standardizing variables, but it is still a common way to simplify interpretation of outcomes. To standardize your independent variable:
    Code:
    egen std_x1=std(x1)
    then run your regression as
    Code:
    reg y std_x1 x2 x3
    like usual.



    • #3
      One caution about Ben's approach: if there are missing values for y, x2, or x3 in any observation, then the standardization will have been done on a different sample from the regression. So, to be sure you are consistent I would do it as:

      Code:
      regress y x1 x2 x3
      summarize x1 if e(sample)
      lincom _b[x1]*`r(sd)'
      If there are no missing values in y, x2, or x3, Ben's results and mine will be the same.

      However, one of the problems with using standardized variables is that people often inadvertently standardize in one sample and then analyze in another. Missing values are one source of the problem. This type of error is also particularly likely to occur if multiple subpopulations are analyzed: it is common to standardize in the combined group and then use those standardized variables in the subpopulation. And it is unlikely that this mistake would get noticed by the original article's authors, reviewers, or editors, unless it produced results that looked bizarre. So if you do it as I suggest and don't replicate the published results exactly, this could be why. (It's one of the reasons not to use standardized variables!)
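
      A minimal sketch of the sample-consistency point above, using Stata's bundled auto dataset as a stand-in. The variables price, weight, mpg, and rep78 are illustrative assumptions, not the article's variables. rep78 has missing values, so the estimation sample is smaller than the full dataset:

      ```stata
      sysuse auto, clear

      * Fit on the raw variable, then rescale the slope by the estimation-sample SD
      regress price weight mpg rep78
      summarize weight if e(sample)
      lincom _b[weight]*`r(sd)'

      * Equivalent: standardize only within the estimation sample, then refit
      egen std_weight = std(weight) if e(sample)
      regress price std_weight mpg rep78

      * For contrast: standardizing on all observations uses a different mean and SD
      egen std_weight_all = std(weight)
      regress price std_weight_all mpg rep78
      ```

      The lincom estimate and CI should match the std_weight row of the second regression, while the std_weight_all coefficient will generally differ somewhat because its SD was computed on a different set of observations.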



      • #4
        Thanks Ben and Clyde. I have no missing values so I went ahead and followed Ben's instructions. I attached a picture here to show how the article presented the results. Is the CI in the regress command in STATA the same CI the article presents?



        • #5
          Yup, normally it would be. However, it's strange to see a CI that does not contain the point estimate, so I don't know what exactly we're looking at. Normally a CI gives the lower and upper bounds around the estimate. But in this case, huh, it clearly doesn't: .014 is not in the range .023-.025.

          My best guess is that .014 is the point estimate for the standardized version of "anger", whereas the CI is based on the raw (non-standardized) score. Getting standardized coefficients is actually easier than standardizing the variable yourself. All you have to do is:
          Code:
          regress y x1 x2 x3, beta
          Note that this gives the effect of a one-SD change in the IVs on the dependent variable in SD units ("fully standardized"), so it's possible they instead did the "semi-standardized" approach I suggested, where at least the metric of y is left alone.
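
          A rough sketch contrasting the two versions, again assuming Stata's bundled auto dataset (price, weight, and mpg are stand-ins, not the article's variables). One thing to be aware of: with the beta option, regress displays the Beta column in place of the confidence intervals.

          ```stata
          sysuse auto, clear

          * Fully standardized: Beta is the SD change in y per 1-SD change in each x
          regress price weight mpg, beta

          * Semi-standardized: only the predictor of interest is rescaled, y keeps its units
          egen std_weight = std(weight)
          regress price std_weight mpg

          * Link between the two: Beta = (semi-standardized slope) / SD(y)
          summarize price
          display _b[std_weight]/r(sd)
          ```

          The displayed value should reproduce the Beta reported for weight in the first regression, since there are no missing values in these three variables.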

          Not surprised to see it's a Psych journal -- since the scales for traits and a whole lot of what Psychologists study are arbitrary, making them even more abstract by standardizing them is a common thing to do.
          Last edited by ben earnhart; 07 Dec 2014, 08:32.



          • #6
            Dear shrister, besides all the good advice you have been given, here are some rules of this forum to follow:

            - The forum prefers real names (first name and surname). Please use the 'contact us' tab at the bottom right-hand corner of this page to ask the admin to change your name.

            - Please do take time to read the FAQ section. You are advised to share the source of the article you intend to discuss. It would be a lot easier to give an opinion if you could cite the paper. As Ben pointed out, that is a funny presentation: a point estimate that lies outside the range of the CI, yet with a significant p-value! Sharing the article would be a good step too.

            - And officially, it is Stata not STATA.

            Best,
            Roman

