  • APC with control variables

    I've been working off and on on this APC model using the apc_ie module that some of you kindly helped me modify for survey data. It works great. I've done a lot of statistical modeling but I'm a newb at APC analysis, so some of this is a challenge. One thing many authors do is plot predicted probabilities for the age, period and cohort variables. This is done by setting two of the three sets of effects to zero, which seems strange to me, but that's the common practice. So, for example, age is plotted while period and cohort are set to zero. Another thing I'd like to do is include sociodemographic control variables as demonstrated here: http://www.ncbi.nlm.nih.gov/pmc/arti...011.300602.pdf. The last paragraph of the statistical analysis section explains how they did this. They say they held the IVs at their means when computing the probabilities. What I don't understand is that they have numerous categorical variables. How does one hold a categorical variable at its mean? I suppose I can try to track down the authors this week, but if anyone has any insight in the meantime, I'd appreciate it.

  • #2
    I haven't heard back from the authors, but I think there is a relatively easy way of doing this. The apc_ie module doesn't save the estimates needed to run margins or marginsplot, but the probabilities can be computed with nlcom, a strategy I've used in the past. The apc_ie module treats APC effects as categories. It's based on glm and the parameter estimates are saved in r(table). Suppose I have age13, age15, age19, period07, period10, period13, cohort92, cohort95, cohort98. A real example would have a couple more cohorts, but we'll keep it simple for now. For age13 I could write:

    Code:
    nlcom (p: invlogit(_b[_cons]+_b[age13]*1+_b[period10]*1+_b[period13]*1+_b[cohort95]*1+_b[cohort98]*1))
    I like this better than setting period and cohort to zero as others have done. There is an actual adjustment of age for period and cohort effects, just referenced at the first category of each (period07, cohort92).

    I could do the same for age15 and age19:

    Code:
    nlcom (p: invlogit(_b[_cons]+_b[age15]*1+_b[period10]*1+_b[period13]*1+_b[cohort95]*1+_b[cohort98]*1))
    nlcom (p: invlogit(_b[_cons]+_b[age19]*1+_b[period10]*1+_b[period13]*1+_b[cohort95]*1+_b[cohort98]*1))
    Now say I wanted to add race (three categories) and income (continuous).

    Code:
    nlcom (p: invlogit(_b[_cons]+_b[age13]*1+_b[period10]*1+_b[period13]*1+_b[cohort95]*1+_b[cohort98]*1+_b[race2]*1+_b[race3]*1+_b[income]*mean))
    This would be referenced at race1. I would substitute the mean income for mean. This approach would not be as good as marginal probabilities, but seeing that there are issues with interpretation of the marginal effects of continuous variables anyway, maybe that's the best that can be done. It's at least consistent with the paper I cited. Any comments or thoughts? Sound reasonable?
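
    On holding a categorical variable "at its mean": one common reading is that each dummy is set to its sample mean, which is simply the share of observations in that category, rather than to 0 or 1. Whether the cited paper did exactly this is an assumption on my part. Below is a minimal sketch of that calculation on Stata's shipped nlsw88 data, using a plain logit instead of apc_ie; the dummies race2 and race3 and the local macros m_race2, m_race3 and m_grade are made up for illustration, with grade standing in for a continuous control such as income.

    ```stata
    * Toy example on a shipped dataset; variable names are illustrative only
    sysuse nlsw88, clear

    * Build the race dummies by hand so the coefficient names are explicit
    generate byte race2 = (race == 2)
    generate byte race3 = (race == 3)

    logit union race2 race3 grade

    * Means over the estimation sample; for a dummy this is the category share
    foreach v in race2 race3 grade {
        quietly summarize `v' if e(sample), meanonly
        local m_`v' = r(mean)
    }

    * Predicted probability with every covariate held at its sample mean
    nlcom (p: invlogit(_b[_cons] + _b[race2]*`m_race2' + _b[race3]*`m_race3' + _b[grade]*`m_grade'))

    * Cross-check against the built-in at-means prediction
    margins, atmeans
    ```

    In this toy model the nlcom estimate should agree closely with what margins, atmeans reports. After apc_ie, where margins is not available, the same pattern would presumably carry over by replacing the *1 multipliers on the control dummies with the stored means.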
