Dear everyone,
I am using three labor force surveys (1994, 2000, and 2012) from one country and running logit regressions with the same variables in all three datasets.
I am now interpreting the results, specifically the marginal effects. However, I am unsure whether I can draw conclusions about whether the effect of a given variable X on the probability of y=1 has grown or shrunk over the years.
I read in Mood (2010, p. 73; reference below) that it is problematic to compare coefficients or odds ratios across different logit models, even when they use the same independent variables. The models need not predict the outcome equally well, so differences in the estimates can reflect differences in unobserved heterogeneity as well as differences in effects.
Does this problem also extend to comparing marginal effects across the three survey years?
Code:
* Same specification, estimated separately on each survey year
logit y i.sex i.age_grp i.sector i.education i.marital i.urban if working==1 [pw=weight], cluster(fsu)
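To make the comparison concrete, here is a minimal sketch of how the average marginal effect of sex could be obtained for each year and contrasted across years in a single pooled model. It assumes the three surveys have been appended into one dataset with a survey-year variable, here given the hypothetical name year, and that fsu codes identify distinct clusters across years. It is one possible approach, not a settled answer to the comparability question.

```stata
* Sketch only: assumes the 1994, 2000, and 2012 files are appended into one
* dataset with a hypothetical survey-year variable named year.
* Fully interacting year with every regressor reproduces the point
* estimates of the three separate logits.
logit y i.year##(i.sex i.age_grp i.sector i.education i.marital i.urban) ///
    if working==1 [pw=weight], vce(cluster fsu)

* Average marginal effect of sex within each survey year's own sample
margins, dydx(sex) over(year)

* Differences in that effect relative to the base year (1994);
* here every observation is assigned each year in turn
margins r.year, dydx(sex)
```

The second margins call works with counterfactual year assignments, so its underlying effects need not match the over(year) results exactly.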
"Even if the models include the same variables, they need not predict the outcome equally well in all the compared categories, so different ORs or LnORs in groups, samples, or points in time can reflect differences in effects, but also differences in unobserved heterogeneity. This is an important point because sociologists frequently compare effects across, e.g. sexes, ethnic groups, nations, surveys, or years."
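As I understand it, the argument in this passage is usually illustrated with the latent-variable form of the logit model. The sketch below uses my own notation: the symbols y*, beta_t, sigma_t, and b_t are assumptions for illustration, not taken from Mood's text, and epsilon is assumed to follow a standard logistic distribution with CDF Lambda.

```latex
\begin{align}
y^{*}_{it} &= x_{it}'\beta_t + \sigma_t\,\varepsilon_{it},
  \qquad y_{it} = 1 \ \text{if}\ y^{*}_{it} > 0, \\
\Pr(y_{it} = 1 \mid x_{it}) &= \Lambda\!\left( x_{it}'\,\frac{\beta_t}{\sigma_t} \right)
  \quad\Longrightarrow\quad b_t = \frac{\beta_t}{\sigma_t}.
\end{align}
```

Under these assumptions the logit for year t estimates b_t rather than beta_t, so a difference between b_1994 and b_2012 could come from a change in the underlying effect beta_t, from a change in the residual scale sigma_t, or from both.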
For example, would it be wrong to make the following statement when interpreting the average marginal effects:
"In 1994, being male rather than female increased the probability of y=1 by 15 percentage points. By 2012, this effect had declined to 7 percentage points. This may imply that gender equality has improved over the years."
Or is it more appropriate to say only:
"In 1994, being male rather than female increased the probability of y=1 by 15 percentage points, whereas in 2012 it increased the probability of y=1 by 7 percentage points."
Thank you very much for your help!
Best,
Kim
Reference:
Mood, C. 2010. Logistic Regression: Why We Cannot Do What We Think We Can Do, and What We Can Do About It. European Sociological Review, 26(1), 67-82.