
  • Adjusting for multicollinearity when dealing with mlogit and survey weights

    Hello!

    I am writing a paper regarding the relationship between sexual orientation and risk behaviors based on the 2015 Youth Risk Behavior Survey (CDC).
    I successfully created a table based on the interaction of each of my dependent variables and the sexual orientation identifying response, but when attempting to run mlogit, I ran into two variables that are omitted due to multicollinearity:

       sexid |      Coef.  Std. Err.       t   P>|t|    [95% Conf. Interval]
    ---------+--------------------------------------------------------------
        qn11 |   -.126644   .0739262   -1.71   0.087   -.2716207    .0183326
        qn12 |   .0182417   .0329616    0.55   0.580   -.0463992    .0828826
        qn29 |  -.2315899   .0690746   -3.35   0.001   -.3670522   -.0961276
        qn59 |  -.1794645   .0439278   -4.09   0.000   -.2656113   -.0933177
         q60 |          0  (omitted)
        qn63 |          0  (omitted)
        qn89 |   .0556139   .0397618    1.40   0.162    -.022363    .1335908
    qnothhpl |  -.0052826   .0420503   -0.13   0.900   -.0877475    .0771823
    qndualbc |  -.0796317    .066723   -1.19   0.233   -.2104822    .0512188
    qnbcnone |  -.0852495   .0521206   -1.64   0.102   -.1874633    .0169643

    This output was created with the command:
    Code:
    regress sexid age male qn9 qn11 qn12 qn29 qn59 q60 qn63 qn89 qnothhpl qndualbc qnbcnone black hisplat other
    I understand where the issue is, as q60 is "ever having sexual intercourse" and qn63 is "currently sexually active." Naturally, one of these is perfectly 0, which I believe is causing the problem. However, they both need to be present in the model. How do I adjust my tests for this? Should I be using a different type of test? It obviously needs to be accounted for, as those variables are not insignificant with regard to the outcomes of interest.

    I have tried using estat vif after a svy: regression for these and, as I've seen in other posts, added a [pw] weight specification, but it didn't change the outcome.

    How should I proceed?
    Last edited by Max Souders; 26 Jul 2017, 00:41.

  • #2
    I don't think that is the problem: in that case only one of them would be omitted, not both. Also, you say you want a multinomial logit, but you estimated a linear regression.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
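
    A minimal sketch of what the multinomial logit could look like with the survey design declared. The design variable names psu, stratum, and weight are assumptions and should be checked against the YRBS codebook; the regressor list is copied from the regress command in the first post.

    ```stata
    * Declare the survey design once (psu, stratum, weight are assumed names)
    svyset psu [pweight = weight], strata(stratum)

    * Multinomial logit for the categorical outcome, with design-based
    * standard errors; by default the most frequent category is the base
    svy: mlogit sexid age male qn9 qn11 qn12 qn29 qn59 q60 qn63 qn89 ///
        qnothhpl qndualbc qnbcnone black hisplat other
    ```

    If q60 and qn63 are still reported as omitted here, the cause lies in the data rather than in the choice of estimator.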


    • #3
      To add to Maarten's comment: you say "one of these is perfectly 0." What does this mean? If the variable is zero for everyone in the sample, it cannot be used. If everyone has had sex, then having had sex cannot tell you anything about risk behaviors. It's like having a sample of all men and wanting to use sex as a variable: it just doesn't work. There is no variance in the variable to associate with variance in the DV. In theory, you could omit the constant, but then you have real problems with interpretation.
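
      One way to check whether either variable actually varies is to tabulate both within the estimation sample. This is a sketch that assumes it is run immediately after the regress command from the first post, so that e(sample) marks the observations used.

      ```stata
      * Distribution of each variable among the observations used in the model
      tabulate q60 if e(sample), missing
      tabulate qn63 if e(sample), missing

      * Joint distribution: a single non-empty row or column means no variation
      tabulate q60 qn63 if e(sample), missing
      ```

      A variable that takes only one value in the estimation sample is collinear with the constant and will be omitted.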

      You don't say, but I assume estat vif indicated problematic levels of collinearity (estat vif doesn't fix anything; it is strictly diagnostic). I'd also look at the correlation between the two variables.
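
      A sketch of both diagnostics. The weight variable name weight is an assumption. As far as I know, estat vif is not available after svy: regress, so the usual workaround is to rerun regress with analytic weights, which reproduces the weighted point estimates, and request the VIFs from that fit.

      ```stata
      * Unweighted pairwise correlation between the two indicators
      correlate q60 qn63

      * VIFs depend only on the regressors, so aweights are enough here
      * (weight is an assumed variable name)
      regress sexid age male qn9 qn11 qn12 qn29 qn59 q60 qn63 qn89 ///
          qnothhpl qndualbc qnbcnone black hisplat other [aweight = weight]
      estat vif
      ```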

      There is no adjustment of tests per se. The problem with collinearity is that the variables are so similar that you have trouble accurately identifying their individual effects, but (as long as you get estimates, which is not the case in your data) this should show up as high standard errors. As long as you can get estimates, the assumptions are (generally) not violated, so you can use the tests normally. They will just tend to be insignificant because of the high standard errors.
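
      For a linear regression, the standard textbook expression for the sampling variance of a single coefficient shows where the high standard errors come from. This is a sketch under classical OLS assumptions that ignores the survey design; R_j^2, the R-squared from regressing x_j on the other regressors, is notation introduced here.

      ```latex
      \operatorname{Var}(\hat\beta_j)
        = \frac{\sigma^2}{\left(1 - R_j^2\right)\sum_{i}\left(x_{ij} - \bar{x}_j\right)^2}
      ```

      As R_j^2 approaches 1 the variance grows without bound. When R_j^2 equals 1, or when x_j has no variance at all, the denominator is zero and no estimate exists, which is the situation Stata reports as (omitted).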

      There are very few great solutions to collinearity. Some like techniques that make the x's orthogonal, but I haven't studied this enough to know its real implications.
