  • Logistic Regression Output with Factor Variable i.Year - Issues with Interpretation

    Hi all!

    I’m having issues interpreting the coefficients in my regression output. My model has a dichotomous dependent variable, Education (1 = has a bachelor’s degree, 0 otherwise), and a factor variable, i.Year (covering the years 2007 to 2011).
    The command:
    logistic Education i.Year, coef


    Logistic regression
    Education    Coef.    St.Err.   t-value   p-value   [95% Conf   Interval]   Sig
    2007b        0        .         .         .         .           .
    2008         .433     .073      5.95      0         .29         .575        ***
    2009         .602     .075      8.00      0         .454        .749        ***
    2010         .217     .074      2.92      .004      .071        .362        ***
    2011         .225     .077      2.93      .003      .075        .376        ***
    Constant     -.381    .052      -7.40     0         -.483       -.28        ***

    Mean dependent var    0.477       SD dependent var       0.500
    Pseudo r-squared      0.008       Number of obs          7080
    Chi-square            75.687      Prob > chi2            0.000
    Akaike crit. (AIC)    9734.809    Bayesian crit. (BIC)   9769.134
    *** p<.01, ** p<.05, * p<.1

    The coefficient on 2008 is 0.433, which suggests that the log odds of having a bachelor’s degree rose by 0.433 from 2007 to 2008. The standard interpretation of a logistic coefficient is "for a one-unit change in variable X, we expect the log odds of the outcome to change by *coefficient* units, holding all other variables constant". Since my focus is on the trend in Education between 2007 and 2011, I find it difficult to apply that interpretation, and I’m unsure what 0.433 represents in the context of a trend. Would it be correct to interpret the output as "compared with 2007, the log odds of having a bachelor’s degree were 0.433 higher in 2008, 0.602 higher in 2009, and so on"? My data are both nonlinear and non-normal, so the standard trend tests are not suitable. The goal is a table showing the changes in education between 2007 and 2011.

    Thank you for your help!
    Last edited by Aria Mendoza; 27 Aug 2022, 03:01.

  • #2
    Dear Aria,

    This should help you for the interpretation: https://quantifyinghealth.com/interp...-coefficients/.
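
    As a rough sketch of the arithmetic, using the rounded coefficients from the output in #1 (so the results are approximate): exponentiating a year coefficient gives the odds ratio relative to the base year 2007, and invlogit() of the constant plus a year coefficient gives the predicted probability for that year. These lines only do arithmetic and do not need the data in memory.

    ```stata
    * Odds ratio for 2008 relative to the base year 2007
    display exp(.433)

    * Predicted probability of a bachelor's degree in 2007 (constant only)
    display invlogit(-.381)

    * Predicted probability in 2008 (constant + 2008 coefficient)
    display invlogit(-.381 + .433)
    ```

    The first line comes out at roughly 1.54, i.e. the odds of having a bachelor's degree are about 54% higher in 2008 than in 2007. Each year coefficient is a comparison with 2007, not with the year before it.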

    Two further things:

    - Logistic regression is a nonlinear model. In its latent-variable formulation it assumes logistically distributed errors, and the logistic distribution has fatter tails than the normal distribution. I am not sure whether data themselves can be nonlinear (statisticians may correct me), but models, or equivalently functions, can be. Nonlinear means that the (conditional) means do not lie on a straight line, or at least that is my understanding.

    - In nonlinear models, generally speaking, you should be more interested in the marginal effect than the coefficient. In linear models, e.g. OLS, they are often equivalent if you have specified your model in a linear manner.
    In nonlinear models such as logit, they are not the same thing, and you may want to run the
    Code:
    margins
    command postestimation to get the marginal effect.
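
    A minimal sketch of what that could look like here, assuming the variables are named Education and Year as in #1 and the data are in memory. With a factor variable, margins can report the predicted probability for each year, and dydx() reports the change in probability relative to the base year.

    ```stata
    * Refit the model from #1 (coefficients are reported in log odds)
    logistic Education i.Year, coef

    * Predicted probability of a bachelor's degree in each year
    margins Year

    * Optional: plot the predicted probabilities from the margins call above
    marginsplot

    * Discrete change in the probability for each year relative to 2007
    margins, dydx(Year)
    ```

    Because Year is the only regressor, the predicted probabilities from margins Year should match the raw yearly proportions of Education, which may be the most direct way to fill a table of changes across 2007 to 2011.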
