
  • Technical/conceptual question about calculating Cohen's d after -margins-.

    Hello Everyone,

    We are conducting research exploring the difference in recovery trajectories between persons with schizophrenia and persons with bipolar disorder. Over two years, we gathered longitudinal data and then analyzed those differences using -mixed- (in Stata 13). As we hypothesized, there was a main effect for group differences (i.schzbip = schizophrenia/bipolar disorder). As part of the analyses, we ran post-estimation follow-up with a -margins- command (syntax below).

    mixed gas age-sedhyp c.time alcab canab i.schzbip || _all: r.time || id:
    margins i.schzbip, atmeans

    --------------------------------------------------------------------------------
                  |            Delta-method
                  |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+-----------------------------------------------------------------
          schzbip |
    schizophrenia |   51.60243   3.198098    16.14   0.000     45.33427    57.87059
    Bipolar 1 & 2 |   58.45254     3.3876    17.25   0.000     51.81296    65.09211
    --------------------------------------------------------------------------------

    I want to describe the magnitude of the difference between the groups (i.e., effect size), and I set about computing Cohen's d. However, as I went through the process, I realized that I wasn't calculating the raw difference between groups; I was calculating group differences while many other variables were being controlled for. This raised several questions for me:

    1. Is Cohen's d still valid as an index of effect size when controlling for other variables?
    2. If yes, how does one determine the SD post-estimation? Converting SE to SD is straightforward if you have the N, but Stata uses a computation that produces the delta-method SE, and despite reading over its calculation, I'm unsure how I can get the SD from it.
    3. On a slightly different tangent, why does Stata provide z/p values for the marginal mean estimates? I understand how they're calculated (margin/SE = z), and I'm guessing Stata is providing them for a reason, but I can't figure out what that reason is.
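
    For reference, the simple conversion I mean in question 2 is SD = SE * sqrt(N) for an ordinary sample mean. A quick sketch of that identity in Python (made-up numbers; my understanding is this does not apply to the delta-method SE that -margins- reports):

```python
import math

# Hypothetical numbers. For the SE of an ordinary sample mean, SE = SD / sqrt(N),
# so the sample SD can be recovered as SD = SE * sqrt(N).
# This does NOT hold for the delta-method SE that -margins- reports.
def sd_from_se(se, n):
    """Recover the sample SD from the standard error of a simple mean."""
    return se * math.sqrt(n)

print(sd_from_se(0.5, 100))  # 5.0
```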

    Thanks a lot,

    David.

    Please note that Elc Estrera asked a variation of this question previously in an untagged post, but there wasn't any feedback on it.

  • #2
    So, the idea behind Cohen's d is to give the between-group difference a scale-free interpretation. It is somewhat analogous to the use of standardized variables (except that one does not standardize the group indicator variable). In principle, there is no reason why this wouldn't work equally well with adjusted group differences. The trick, as you have noted, is to find an appropriate scale factor that is analogous to the standard deviation. I think the complexity of doing that is the reason you seldom see this done. As you note, there is no simple relationship between the standard error of the adjusted predictions and the standard deviation. You could calculate it yourself by, in effect, hand-coding what -margins- does internally. That is, something along these lines (not tested and may be riddled with typos, as I have yet to get my morning caffeine):

    Code:
    // PRESERVE ORIGINAL DATA
    // AND SET EACH VARIABLE TO ITS SAMPLE MEAN
    foreach v of varlist age-sedhyp time alcab canab {
        gen _`v' = `v' // PRESERVE ORIGINAL VALUES
        summ `v' if e(sample), meanonly
        replace `v' = r(mean) // SET REGRESSORS TO SAMPLE MEANS
    }
    gen _schizbip = schizbip // PRESERVE VALUES OF schizbip
    
    // CALCULATE ADJUSTED PREDICTIONS
    // NOTE: ASSUME schizbip IS 0/1 CODED
    // ADJUST ACCORDINGLY IF NOT
    forvalues i = 0/1 {
        replace schizbip = `i'
        predict outcome`i'
    }
    quietly ttest outcome0 == outcome1
    display "Scaled group difference = " %2.1f (r(mu_2)-r(mu_1))/sqrt(r(sd_1)^2 + r(sd_2)^2)
    
    // RESTORE ORIGINAL DATA VALUES
    foreach v of varlist age-sedhyp time alcab canab schizbip {
        replace `v' = _`v'
        drop _`v'
    }
    This would be analogous to Cohen's d for adjusted values. It would probably be reasonable to classify small, medium, and large differences in the same way you would with Cohen's d for crude differences.
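
    To make the scaling concrete outside of Stata, here is a toy illustration in Python (made-up numbers standing in for the adjusted predictions outcome0/outcome1). It uses the conventional equal-n form, which scales by the root-mean-square of the two group SDs:

```python
import math

def cohens_d(group1, group2):
    """Equal-n Cohen's d: mean difference scaled by the root-mean-square
    of the two group SDs (sample SDs, n-1 denominator)."""
    def mean(xs):
        return sum(xs) / len(xs)
    def sd(xs):
        m = mean(xs)
        return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))
    scale = math.sqrt((sd(group1) ** 2 + sd(group2) ** 2) / 2)
    return (mean(group1) - mean(group2)) / scale

# Toy values standing in for the adjusted predictions outcome0/outcome1:
print(cohens_d([50.0, 52.0, 54.0], [56.0, 58.0, 60.0]))  # -3.0
```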

    All of that said, I don't know what your outcome variable gas is, but I'm guessing it is some psychiatric symptom scale. If the scale is one that is widely known in the field, and one for which clinicians can readily appraise whether a difference between 58.4 and 51.6 is a substantive difference or not, then why bother with a Cohen's d-like transformation? Really it would just be obfuscatory. On the other hand, if this is some esoteric measure that clinicians (or whoever the audience for your research is) are not generally familiar with, then a standardizing approach adds clarity.

    As for your third question, I don't really know why Stata provides the z/p values for these marginal estimates. They are useful only for testing the null hypothesis that the corresponding adjusted estimate is zero, which is hardly ever a hypothesis of interest. By contrast, for marginal effects, these hypotheses are often of great interest. I suspect it was simpler to program -margins- to give z/p all the time than to program in a decision about when it is and isn't relevant. And, then, too, there is the rare situation where somebody really does want to test whether the adjusted estimate is zero. So, in those unusual circumstances it will prove useful. And in normal circumstances it doesn't hurt to provide the extra useless statistics so long as the user is aware that they are best just ignored and doesn't waste time puzzling over them.





    • #3
      On reflection, my calculation of a Cohen's d equivalent in #2 is incorrect. Also, all of that storage of values is unnecessarily cumbersome; it is simpler to just -preserve- the data:

      Code:
      foreach v of varlist age-sedhyp time alcab canab {
          summ `v', meanonly
          local `v'_mean = r(mean)
      }
      preserve
      foreach v of varlist age-sedhyp time alcab canab {
          replace `v' = ``v'_mean'
      }
      forvalues i = 0/1 {
          replace schizbip = `i'
          predict outcome`i'
      }
      keep schizbip outcome0 outcome1
      gen obs_no = _n
      reshape long outcome, i(obs_no) j(_j)
      cohend outcome _j
      restore
      To use this version you need to install -cohend- from SSC if you don't already have it.



      • #4
        And, in #3, I forgot to restrict things to the estimation sample. So it should be:

        Code:
        foreach v of varlist age-sedhyp time alcab canab {
            summ `v' if e(sample), meanonly
            local `v'_mean = r(mean)
        }
        preserve
        keep if e(sample)
        foreach v of varlist age-sedhyp time alcab canab {
            replace `v' = ``v'_mean'
        }
        forvalues i = 0/1 {
            replace schizbip = `i'
            predict outcome`i'
        }
        keep schizbip outcome0 outcome1
        gen obs_no = _n
        reshape long outcome, i(obs_no) j(_j)
        cohend outcome _j
        restore



        • #5
          NOTE: I wrote this prior to seeing Clyde Schechter's addenda.

          Hi Clyde,

          Thank you for your in-depth response; I was blown away by its detail. It took me some time today to go through what you wrote, partly because my formal education only taught SPSS (I've picked up Stata independently because I find Stata to be a better stats program). Consequently, my knowledge of Stata is... uneven(?), so I may make very novice mistakes because of it. If so, I apologize, because I can imagine it's frustrating.

          With those caveats: I couldn't get your complete syntax to work, but I made a slight modification to it and it seems to work fine.
          ----------------------------------------------------------

          Here's what I did; I've added a comment to note something.

          foreach v of varlist age-sedhyp time alcab canab {
              gen d`v' = `v'
              qui sum d`v'
              replace d`v' = d`v' - r(mean) // You originally wrote "replace `v' = r(mean)", but I think that this made all values constant(?)
          }

          qui mixed pospanss dage dsex dcoc dstim dnarc dhall dsedhyp dtime dalcab dcanab schzbip || _all: r.time || id:

          forvalues i = 0/1 {
              replace schzbip = `i'
              predict outcome`i'
          }

          . ttest outcome0=outcome1

          Paired t test
          ------------------------------------------------------------------------------
          Variable |    Obs        Mean    Std. Err.   Std. Dev.  [95% Conf. Interval]
          ---------+--------------------------------------------------------------------
          outcome0 |    651    15.65405    .1824384     4.65486   15.29581    16.01229
          outcome1 |    651    14.94262    .1824384     4.65486   14.58438    15.30086
          ---------+--------------------------------------------------------------------
              diff |    651    .7114311    2.50e-08    6.37e-07   .7114311    .7114312
          ------------------------------------------------------------------------------
               mean(diff) = mean(outcome0 - outcome1)                  t =  2.9e+07
           Ho: mean(diff) = 0                          degrees of freedom =      650

           Ha: mean(diff) < 0          Ha: mean(diff) != 0          Ha: mean(diff) > 0
           Pr(T < t) = 1.0000        Pr(|T| > |t|) = 0.0000         Pr(T > t) = 0.0000

          // The SE of diff is exceptionally low, why is this?

          . display "Scaled group difference = " %2.1f =(r(mu_2)-r(mu_1))/sqrt((r(sd_1)^2 + r(sd_2)^2/2))

          // I added the "/2" at the end of the expression - is my addition correct?

          Scaled group difference = -0.1

          The Cohen's d value the output generates across the different DVs makes a lot of sense. In this particular case, schzbip was non-significant.

          Do the modifications that I made look reasonable?
          Last edited by David Speed; 17 Dec 2016, 18:02.



          • #6
            Ha - I should have refreshed before I responded.



            • #7
              OK, I'm not sure where you stand after #6. Some of what you wrote about in #5 was changed in #4. But I just want to comment on:
              Code:
              foreach v of varlist age-sedhyp time alcab canab {
                  gen d`v' = `v'
                  qui sum d`v'
                  replace d`v' = d`v' - r(mean) // You originally wrote "replace `v' = r(mean)", but I think that this made all values constant(?)
              }
              No, I did mean -replace `v' = r(mean)-, and yes, it does make each variable into a constant. This code was intended to be run after the regression command; it replicates what -margins- does behind the scenes. In particular, when you specify the -atmeans- option in -margins-, all variables not otherwise covered are replaced by their mean values for the calculation of the predicted values that are then used to calculate the adjusted means.

              Anyway, I think if you run the code in #4 after your -mixed- command, you will get what you want.



              • #8
                Apologies Clyde, I misunderstood. I will run the code once I get back to my office. Thanks again!



                • #9
                  @ Clyde RE: #7
                  ---

                  Hi Clyde,

                  I had asked my colleague about the utility of using Cohen's d to describe clinical change (instead of the scale itself), and they agreed that Cohen's d might be unnecessary. But, I'd still like to figure out how to calculate it because I think it may be useful in other arenas of research. I had a question regarding the syntax; is it calculating the effect size for a specific dependent variable? If so, where is that reflected in the syntax? I'm sorry if this question is basic, but I like understanding the nuts and bolts of this stuff.

                  Cheers,

                  David.




                  • #10
                    The entire process begins with your running a regression -mixed gas age-sedhyp c.time alcab canab i.schzbip || _all: r.time || id:-. The dependent variable is gas.

                    Then comes the code in #4 which begins by calculating the expected values of gas conditional on schzbip = 0 and 1, adjusted for the other variables in your model, and stores them in new variables outcome0 and outcome1, respectively. I could have called them something like adjusted_gas0 and adjusted_gas1 instead, and it might have been clearer. Then it goes on to -reshape- the data long, so as to stack the values of outcome0 and outcome1 into a single variable, called outcome, with the distinction of which is which now being carried by the new variable _j (which -reshape- creates). The final step is the -cohend- command using this variable "outcome" (which is the name I used for the adjusted expected values of gas). So -cohend- is being used to estimate the size of the schizbip effect on the adjusted expected values of gas.

                    More generally, if you were to use this code following some other regression with a different outcome variable instead of gas, the result would be the Cohen d for the size of the schizbip effect on the adjusted expected values of whatever the different outcome variable was.




                    • #11
                      Hi Clyde - sorry about the delay in getting back. I referenced this post to one of my colleagues and am getting him to run the syntax, as I won't have access to the data for a while.

                      On a related note, my colleague pointed out that if we standardized the outcome variable, any coefficient value for a dummy-coded group variable would be the equivalent of Cohen's d. Apparently, this is because the coefficient would be describing the change in fractions of an SD, which is all Cohen's d actually is. This seemed accurate to me, but the solution is almost too simple, which makes us suspicious. Is this accurate, or are my colleague and I missing something?



                      • #12
                        On a related note, my colleague pointed out that if we standardized the outcome variable, any coefficient value for a dummy-coded group variable would be the equivalent of Cohen's d. Apparently, this is because the coefficient would be describing the change in fractions of an SD, which is all Cohen's d actually is. This seemed accurate to me, but the solution is almost too simple, which makes us suspicious. Is this accurate, or are my colleague and I missing something?
                        This would be correct if you have only one predictor in the regression. Then it's a simple way of calculating ordinary Cohen's d.

                        But I don't think it does quite what you want when there are multiple variables. I think that the standardized adjusted difference (what my code calculates) may not be the same as the adjusted standardized difference (what your colleague's suggestion calculates). But, honestly, I'm not sure. Why don't you try it both ways and see if you get the same results?
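
                        If you want to see the one-predictor case numerically, here is a small sketch in Python rather than Stata, with fabricated data. (One caveat: standardizing the outcome divides by its overall sample SD, whereas classical Cohen's d divides by the pooled within-group SD, so the two scale factors differ somewhat when the group means differ.)

```python
import math

# Fabricated data: regress a standardized outcome on a single 0/1 group dummy.
# The slope is the group mean difference in units of the outcome's overall SD.
y = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
x = [0, 0, 0, 1, 1, 1]

n = len(y)
my = sum(y) / n
sdy = math.sqrt(sum((v - my) ** 2 for v in y) / (n - 1))  # overall sample SD
z = [(v - my) / sdy for v in y]                           # standardized outcome

# OLS slope of z on x, via the simple-regression formula:
mx = sum(x) / n
slope = sum((xi - mx) * zi for xi, zi in zip(x, z)) / sum((xi - mx) ** 2 for xi in x)

# The same quantity computed directly as a scaled mean difference:
m1 = sum(v for v, g in zip(y, x) if g == 1) / 3
m0 = sum(v for v, g in zip(y, x) if g == 0) / 3
print(slope, (m1 - m0) / sdy)  # the two numbers agree
```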



                        • #13
                          Originally posted by David Speed View Post
                          Hi Clyde - sorry about the delay in getting back. I referenced this post to one of my colleagues and am getting him to run the syntax, as I won't have access to the data for a while.

                          On a related note, my colleague pointed out that if we standardized the outcome variable, any coefficient value for a dummy-coded group variable would be the equivalent of Cohen's d. Apparently, this is because the coefficient would be describing the change in fractions of an SD, which is all Cohen's d actually is. This seemed accurate to me, but the solution is almost too simple, which makes us suspicious. Is this accurate, or are my colleague and I missing something?
                          David and Clyde,

                          I wish I had seen this post earlier, but I was in finals at the time. This is a very interesting discussion.

                          During a statistics class last term, I ran a similar question by my biostats professor. He said that it seemed reasonable that you could interpret an adjusted standardized difference from a regression model in the same framework as a standardized mean difference from an RCT. After all, in an RCT, you can fit a regression model and adjust only for group membership, then standardize that difference. In a mixed model like David is fitting, you're adjusting the difference for a bunch of other characteristics, but you still have a between-group difference. You should be able to standardize that mean difference just like a mean difference from an RCT. So no, I think it's not too simple.

                          Clyde's code was confusing me, then I read the bit about the standardized adjusted difference vs adjusted standardized. Clyde, I believe that what people interested in effect sizes would be looking for is an adjusted SMD; standardizing the rest of the variables in the regression would not matter as much to us, I think.

                          David, I have one technical question about how you coded the command. You ran:

                          Code:
                          mixed gas age-sedhyp c.time alcab canab i.schzbip || _all: r.time || id:
                          From what I understand, this implies that there's a crossed random effect. I have little experience with crossed random effects, but it looks like subjects are crossed with units of time. I'm just going to assume that you made yearly measurements in this discussion. In this model, I think you're assuming that in each year, there was a systematic but random factor that applied equally to all individuals in that year. In the example Stata gave that appears to correspond to your formulation, they were modeling pig weights; in each week of measurement, something systematic but random, like weather or feeding patterns, could have applied equally to all pigs.

                          The code I would normally think of for this problem as I understood it is:

                          Code:
                          mixed gas age-sedhyp c.time alcab canab i.schzbip || id: time
                          Hence, observations are nested within individuals. The model has a population average effect of time on GAS, but it allows each person to have a different slope.

                          I think this corresponds to example 11 (pp. 32-35, as labeled by Stata) of the Stata mixed-effects manual:

                          http://www.stata.com/manuals13/me.pdf

                          Also, you say in your first paragraph that there's a main effect for schizophrenia/bipolar. But your question appears to be about trajectories. You may already know this, but the model you wrote here constrains the trajectories of GAS for those with schizophrenia/bipolar disorder to be the same as the reference group. At every year, the two groups will have a constant difference in GAS. If this is about trajectories, it may be helpful to model an interaction between time and schizophrenia/bipolar as well. Unless, of course, you already did so.
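
                          To illustrate that last point with invented coefficients (a plain-Python sketch, not Stata, and the numbers are fabricated): without a time-by-group interaction the fitted gap between groups is the same at every time point, while adding the interaction lets the trajectories diverge.

```python
# Invented coefficients for illustration only.
b0, b_time, b_group, b_inter = 50.0, 1.5, 6.8, 2.0

def fit_no_interaction(t, g):
    # gas = b0 + b_time*time + b_group*group
    return b0 + b_time * t + b_group * g

def fit_with_interaction(t, g):
    # gas = b0 + b_time*time + b_group*group + b_inter*time*group
    return b0 + b_time * t + b_group * g + b_inter * t * g

gaps_no = [fit_no_interaction(t, 1) - fit_no_interaction(t, 0) for t in range(3)]
gaps_yes = [fit_with_interaction(t, 1) - fit_with_interaction(t, 0) for t in range(3)]
print([round(g, 6) for g in gaps_no])   # [6.8, 6.8, 6.8] -- constant gap
print([round(g, 6) for g in gaps_yes])  # [6.8, 8.8, 10.8] -- diverging trajectories
```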
                          Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

                          When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



                          • #14
                            Hi Weiwen (and Clyde),

                            Thanks for adding your input. From what I can tell, using a standardized outcome variable means that any B coefficient for a binary group (e.g., male/female, voter/non-voter, etc.) can be understood as "how different Group A is from Group B, in terms of units of SD for the outcome variable". In the situation that standardized values have been used, the coefficient functionally becomes an adjusted effect size.

                            Regarding -mixed- specifically:


                            This was my first time working with -mixed-, and I had originally specified the model to be:

                            Code:
                            mixed gas age-sedhyp c.time alcab canab i.schzbip || time: || id:
                            This failed to reach convergence, and our statistics consultant suggested that persons were nested within time:

                            Code:
                            mixed gas age-sedhyp c.time alcab canab i.schzbip || _all: r.time || id:
                            This seemed to be helpful, and appeared to be consistent with Chuck Huber's tutorial on -mixed- (https://www.youtube.com/watch?v=KALxDwwqX1A). However, if you think that there's an error or that we could improve on the model, I'd love to hear it.



                            • #15
                              David,

                              For your initial model:

                              Code:
                              mixed gas age-sedhyp c.time alcab canab i.schzbip || time: || id:
                              You were indeed specifying a random intercept for time, and that individuals were somehow nested within time periods. I don't think that model makes conceptual sense, regardless of whether or not it converged. Typically, you would see this random effects setup with patients nested within clinics, and then individual observations nested within each person. But in your study, each person experiences (or could have experienced; not sure about how much missing data there are) any one of the measurement periods. So, there's definitely no nesting.

                              With the model you actually fit:

                              Code:
                              mixed gas age-sedhyp c.time alcab canab i.schzbip || _all: R.time || id:
                              Again, this is a model with crossed random effects, and I'm not familiar with these. From what I understand from the example I cited (which should be example 10, not example 11), I think your model is assuming that each year has a random effect on each subject, but that it affects all subjects identically. It's like for each unique value of time, there's a shift in everybody's intercept that is modeled as the same for every subject. (Someone please correct me if I am wrong!!!!!) For example, say somehow GAS was strongly influenced by whether or not the economy was in recovery, so in 2010, you expect that everybody reported higher GAS levels (this is a fictitious example!!!!). Mathematically, this would look like the predicted random effect for year = 2010 was, for example, 5 points. Then, the predicted random effect for year = 2008 might have been -8 points.
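
                              To make that reading concrete, here is a toy numeric sketch (plain Python, not Stata, reusing the fictitious 5-point and -8-point effects above): each year gets one predicted random effect that shifts every subject's intercept by the same amount.

```python
# Fabricated effects, echoing the fictitious example in the text.
year_effect = {2008: -8.0, 2009: 0.0, 2010: 5.0}   # shared shift per year (_all: R.time)
subject_effect = {"id1": 2.0, "id2": -1.0}          # per-subject random intercept (id:)
fixed_part = 50.0                                   # stand-in for the fixed portion

def predict(year, subject):
    # Predicted GAS = fixed part + shared year shift + subject's own intercept.
    return fixed_part + year_effect[year] + subject_effect[subject]

# Moving from 2009 to 2010 shifts every subject by the same 5 points:
print(predict(2010, "id1") - predict(2009, "id1"))  # 5.0
print(predict(2010, "id2") - predict(2009, "id2"))  # 5.0
```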

                              If you're expecting something like that, I think that model makes conceptual sense. You have to decide that, but I'm not sure it does. In health outcomes, someone's outcomes tend to depend more on individual factors. I'd expect people to have their own individual trajectories. I wouldn't normally expect that there be major systematic external influences on anxiety (or whatever GAS is) that would need to be captured by allowing time to act as a random factor. A crossed model would likely make sense if you had various clinics where you did the measurements, individuals could visit any clinic (and usually did get measured at more than one), and you expected there to be systematic differences between clinics.

                              Hence, the model I proposed earlier is:

                              Code:
                              mixed gas age-sedhyp c.time alcab canab i.schzbip || id: time 
                              In this one, you're letting the estimated slope for GAS vary by individual.

                              In both cases, the fixed-effect estimate of the coefficient for time should be an estimate of the grand mean change in GAS per unit time for the average subject, controlling for all other characteristics. I know that's the case for the model I stated; it should also be for yours. As for the YouTube link to Chuck Huber's tutorial: I reviewed it, and it looks like he only goes as far as states nested in regions. So that was a three-level random-effects model, with a random intercept for state and another random intercept for region. He was not fitting a model like yours.

                              Edit: Here's a link to some discussion about nested vs crossed:
                              http://www.theanalysisfactor.com/mul...andom-effects/
                              Last edited by Weiwen Ng; 22 Jan 2017, 13:08.

