Mediation for skewed continuous outcomes using khb command

Mahham Humza

Join Date: Feb 2018

Posts: 5
#1


07 Feb 2018, 00:45

Hi all,

I am using the khb command to test the mediation of effects of one categorical nominal variable through two different categorical mediators (one binary and one ordinal) on a continuous outcome variable. Following are the details of my variables:

exposure x: ethnicity (nominal, 5 levels)
mediator 1 z1: SES (ordinal, 3 levels)
mediator 2 z2: Self-reported psychological stress (binary)
outcome y: Triglycerides (positively skewed)

The khb method utilizes the linear probability model for continuous outcomes.

1. I was wondering if I can get reliable results if I use the skewed continuous outcome without transformation (for ease of interpretation)?
2. If I use robust standard errors would that solve the problem of reliability of results? My sample size is quite large (n=6600) and from my understanding the distribution of outcome variables doesn't matter as long as the distribution of residuals is normal.
3. Would standardization of the continuous outcome resolve the issue? If so, can the mediation effects like P-Reduced be interpreted in the same manner (percentage of the total effect mediated)?

I would be grateful for any help or alternative suggestions.
Thank you!

Last edited by Mahham Humza; 07 Feb 2018, 01:00.
Tags: categorical mediators, khb, linear regression, mediation, skewed outcome
daniel klein

Join Date: Mar 2014

Posts: 3862
#2

07 Feb 2018, 01:49

khb is community-contributed software from SJ or SSC (it is advisable to state where you have it from). Conceptually, I believe the most important point is

from my understanding the distribution of outcome variables doesn't matter as long as the distribution of residuals is normal

That is correct, concerning OLS point estimates and standard errors. Given the large sample, I would even say that it probably suffices if the residuals are "normalish", i.e., somehow bell-shaped. In my opinion, the more relevant question is whether a (conditional) mean describes the outcome well, i.e., whether a linear model is a useful approximation of the data-generating process.
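A minimal sketch of how one might check this in Stata. It assumes the variable names from post #1 (y, x, z1, z2) and uses c as a placeholder for the confounders, which the thread does not name; it is one possible set of diagnostics, not a prescribed workflow.

```stata
* Full linear model with robust (Huber/White) standard errors.
* y, x, z1, z2 follow post #1; c is a placeholder for the confounders.
regress y i.x i.z1 i.z2 c, vce(robust)

* Residuals and fitted values from the model just estimated.
predict double res, residuals
predict double yhat, xb

* Are the residuals roughly bell-shaped?
histogram res, normal
qnorm res

* Residuals against fitted values: a clear pattern or a strong funnel
* suggests a linear conditional mean may describe y poorly.
scatter res yhat, yline(0)
```

If the plots look poor, the same commands can be rerun with a transformed outcome (for example, a log of y) to compare, at the cost of interpreting effects on the transformed scale.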

However, there seems to be a misconception regarding

The khb method utilizes the linear probability model for continuous outcomes.

An LPM is just a linear regression model with a binary outcome. Since your outcome is continuous, I do not see that the term LPM applies here.

Best
Daniel
Mahham Humza

Join Date: Feb 2018

Posts: 5
#3

07 Feb 2018, 03:11

Thank you for your prompt and helpful response Daniel!

khb is community-contributed software from SJ or SSC (it is advisable to state where you have it from). Conceptually, I believe the most important point is

The software I am using is from SJ, specifically (SJ13-1: st0236_2)

An LPM is just a linear regression model with a binary outcome. Since your outcome is continuous, I do not see that the term LPM applies here.

Sorry about not being clear earlier. What I meant was that my output states "Decomposition using Linear Probability Models" before the output table. What could this mean in the context of my model then?

The code I am using is as follows:

Code:

khb regress y i.x || z1 z2, concomitant(c) disentangle

(where c refers to the confounding variables)

Also, this might seem like a silly question but just to confirm, there is no point in standardizing the exposure and mediator variables when the outcome is continuous even if the mediators are categorical, correct?
Sorry if my questions are a little basic, but I don't have a stats background so it's more difficult to wrap my head around some concepts.

Thanks!
daniel klein

Join Date: Mar 2014

Posts: 3862
#4

07 Feb 2018, 04:50

What I meant was that my output states "Decomposition using Linear Probability Models" before the output table. What could this mean in the context of my model then?

It does not mean a lot - it means that a linear regression model was used, which is what you want.
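To see what "a linear regression model was used" amounts to, one can compare the key-variable coefficients with and without the mediators by hand. This is a sketch of the classic difference-in-coefficients comparison, not khb's internal code; y, x, z1, z2 are the names from post #1 and c is a placeholder for the confounders.

```stata
* Reduced model: key variable and confounders only.
regress y i.x c
estimates store reduced

* Full model: the two mediators added.
regress y i.x i.z1 i.z2 c
estimates store full

* Side-by-side coefficients. For each level of x, the gap between the two
* columns is the part of that contrast attributable to z1 and z2 together.
estimates table reduced full, b(%9.3f) se
```

What khb adds on top of this comparison is a standard error for the difference and, with disentangle, a split of the difference by mediator.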

Also, this might seem like a silly question but just to confirm, there is no point in standardizing the exposure and mediator variables when the outcome is continuous even if the mediators are categorical, correct?

Not a silly question, but standardizing (I suppose you mean centering at the mean and dividing by SD) does not make a lot of sense for categorical variables in general, as mean and SD are not well suited concepts for describing categorical variables.

Best
Daniel
Mahham Humza

Join Date: Feb 2018

Posts: 5
#5

07 Feb 2018, 04:56

That's true, seems obvious now that you mention it!
Thanks a lot, much obliged.

Regards,
Mahham
Doug Hemken

Join Date: Jul 2014

Posts: 219
#6

07 Feb 2018, 05:12

khb was designed for non-linear models. In this usage, the reference to LPM is misleading.

Doug Hemken
SSCC, Univ. of Wisc.-Madison
Mahham Humza

Join Date: Feb 2018

Posts: 5
#7

08 Feb 2018, 08:32

Hi all,

Is there a way to get the overall indirect effect through a categorical mediator using khb? For example, for the data described above, I want to see the overall mediation through SES on the outcome rather than how much each level of SES contributes. The disentangle option decomposes the mediator's effect by level (the effect mediated by 2.SES, 3.SES, and so on).

I am interested in the percentage mediated (P_Diff and P_Reduced) more than the coefficient itself. Is it correct to add up the percentage mediated by each level of the categorical mediator to get an overall mediated effect for that mediator?

Thanks!
daniel klein

Join Date: Mar 2014

Posts: 3862
#8

08 Feb 2018, 09:22

It is more complicated than you might think. Since your exposure variable (key-variable in khb terminology) has 5 levels, the estimated coefficients are differences relative to a chosen reference category; and so is the estimated proportion of mediation. That means when you summarize the percentages of mediation, the summary can still only be interpreted given the specific reference category. See Karlson (2013) for a solution to the problem. The article includes Stata code that the author has used; but it might be cumbersome to adapt to your needs.

Best
Daniel

Karlson, K. B. 2013. Summarizing primary and secondary effects. Research in Social Stratification and Mobility, 33: 72-82.
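For a linear model, the bookkeeping behind adding up level-specific contributions can be written out explicitly. The notation is introduced here for illustration and is not khb output: $\beta_j^{R}$ and $\beta_j^{F}$ are the coefficients of ethnicity level $j$ (relative to the reference level) in the reduced and full models, $\theta_k$ is the full-model coefficient of mediator dummy $k$, and $\delta_{jk}$ is the coefficient of ethnicity level $j$ in an auxiliary linear regression of mediator dummy $k$ on ethnicity and the confounders.

```latex
\beta_j^{R} - \beta_j^{F} = \sum_{k} \theta_k \, \delta_{jk},
\qquad
\frac{\beta_j^{R} - \beta_j^{F}}{\beta_j^{R}} \times 100
  = \sum_{k} \frac{\theta_k \, \delta_{jk}}{\beta_j^{R}} \times 100 .
```

Because every term shares the denominator $\beta_j^{R}$, the shares for 2.SES and 3.SES can be added to give the share running through SES, but only for one ethnicity contrast $j$ at a time, which is the dependence on the reference category described above. This assumes OLS throughout, and that the quantity being summed is the share of the reduced-model coefficient (P_Reduced, as I read the khb output).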
Mahham Humza

Join Date: Feb 2018

Posts: 5
#9

08 Feb 2018, 13:23

Thank you Daniel for the very interesting and informative reference. However, what I am trying to achieve is slightly different from the author's objective. The author summarizes the indirect effect by creating a latent variable from the 7 levels of the key/exposure variables to facilitate comparison with another cohort, whereas I want to summarize the indirect effects from one of the mediator variables while keeping the distinction between the different classes/levels of my key/exposure variable (ethnicity). I think I might go with his first suggestion of combining the classes for SES...

it might be cumbersome to adapt to your needs

Yes, I think the coding involved is beyond my capabilities and would likely result in errors!

That means when you summarize the percentages of mediation, the summary can still only be interpreted given the specific reference category.

So in the context of my data, is the interpretation of the percentages somewhat like that of multinomial regression for categorical variables, with percentages instead of odds (for the sake of understanding)? For example: the percentage of the indirect effect mediated by stress compared to no stress, for 2.Ethnicity relative to 1b.Ethnicity?

Thanks a lot for your input!