
  • Calculating residuals in a multinomial regression as part of 2SRI

    I am attempting to do a 2 stage residual inclusion (2SRI) model where the first stage is estimating a multinomial regression. The problem I am having is figuring out how to calculate the Pearson residual to enter into my second stage. I have seen one other unanswered post related to this question.

    Let's say my data looks like this:

    ID, provider, copay, opioid
    1, MD1, 50, yes
    2, MD2, 10, no
    3, MD1, 25, no
    4, MD3, 10, no
    5, MD2, 30, yes
    6, MD3, 14, yes

    Using MD1 as the base comparison, the first stage estimation would ideally tell me how a dollar increase in copay influences the likelihood that you see MD1 versus MD2, MD1 versus MD3, etc. In Stata speak:

    mlogit provider copay
    predict res, rstandard

    Ideally I would be able to predict the residual as above but it seems that I need to manually calculate this residual rather than rely on a postestimation option to predict the residual. And I'm pretty sure I need the Pearson residual though I am open to the fact I may be wrong on this. If someone could help me figure out the most efficient approach to the calculation, I would greatly appreciate it.

    And just to complete the thought, in the second stage I want to know how provider choice influences opioid outcomes. Let's assume that provider choice is influenced by copay, hence the need for the first stage.

    logit opioid provider res

    Thank you!
    Bianca

  • #2
    Hi all,

    I have the exact same doubt as above:
    I am attempting to do a 2 stage residual inclusion (2SRI) model where the first stage is a multinomial regression and the second stage is a binary logistic regression. I am having trouble figuring out how to estimate the residual (v) to enter into my second stage.
    So, let's say my Stage 1 regression looks like: mlogit xe x1 i.x2 z, baseoutcome(1) [where, xe = categorical endogenous variable with 4 categories, x1 = continuous control variable, x2 = categorical control variable, z = instrumental variable]
    & my Stage 2 regression looks like: logistic y x1 i.x2 i.xe i.v [where y = categorical dependent variable with 2 categories, v = residual]

    Any leads on the Stata commands to create predicted values of xe (say, xeHat) and develop residual (v = xe - xeHat) would be highly appreciated.

    Many thanks,
    Arpita
    Last edited by Arpita Biswas; 30 Sep 2020, 15:06.



    • #3
      In a typical situation I have the actual Y in my data set alongside Yhat, and residuals are computed in one step. After mlogit, on the other hand, I predict phat, but I don't have p in my data set; I only have some observed value, W. The relevant residual then seems to be based on What, computed using the phats. My guess is that that's why there's no residual option for predict following mlogit.
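
      To make the point above concrete, one common way to write a residual for a multinomial outcome is per category, against an indicator of the observed category. This is a sketch under assumed notation, none of which appears in the thread: J is the number of categories, d_ij equals 1 if observation i is in category j and 0 otherwise, and \hat{p}_{ij} is the predicted probability from mlogit.

      ```latex
      % Raw (response) residual for observation i and category j
      \hat{v}_{ij} = d_{ij} - \hat{p}_{ij}, \qquad j = 1, \dots, J

      % Pearson residual: the raw residual scaled by the binomial standard deviation
      r_{ij} = \frac{d_{ij} - \hat{p}_{ij}}{\sqrt{\hat{p}_{ij}\,(1 - \hat{p}_{ij})}}

      % The raw residuals sum to zero within each observation
      \sum_{j=1}^{J} \hat{v}_{ij} = \sum_{j=1}^{J} d_{ij} - \sum_{j=1}^{J} \hat{p}_{ij} = 1 - 1 = 0
      ```

      Because the raw residuals sum to zero within each observation, at most J - 1 of them can enter a second stage without being collinear; dropping the base category's residual is one natural choice. Whether the raw or the Pearson version is the right one for a given 2SRI application is a modeling decision that this sketch does not settle.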



      • #4
        OK, your stage one xe has 4 categories, but my guess is that each observation falls in only one category. The predicted probabilities from your mlogit would then be used to compute an expectation for each observation. Thus each observation has an expectation so computed and also an actual value, depicted above as i.xe. For a given observation, the residual would be the difference.
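
        The steps described above might be sketched in Stata roughly as follows, using the variable names from #2. This is a minimal sketch, not a definitive 2SRI implementation. It assumes xe is coded 1 to 4 and that y is coded 0/1; the names p1-p4 and v2-v4 are hypothetical. The residuals enter the second stage as continuous regressors rather than with the i. prefix used in #2, since factor-variable notation requires nonnegative integer values.

        ```stata
        * Stage 1: multinomial logit for the endogenous variable xe
        * (assumes xe takes the values 1, 2, 3, 4)
        mlogit xe x1 i.x2 z, baseoutcome(1)

        * Predicted probability of each category, one new variable per outcome
        * (p1-p4 are hypothetical names)
        predict p1 p2 p3 p4, pr

        * Raw residual for each non-base category: indicator minus probability.
        * The base category's residual is omitted because the four sum to zero.
        forvalues j = 2/4 {
            generate double v`j' = (xe == `j') - p`j' if !missing(xe)
        }

        * Stage 2: include the residuals as continuous controls
        logit y x1 i.x2 i.xe v2 v3 v4
        ```

        The standard errors reported by the second-stage logit treat v2-v4 as known rather than estimated, so in practice both stages are often wrapped in a program and bootstrapped together. A Pearson version could be built by dividing each residual by sqrt(p`j' * (1 - p`j')), if that scaling is what the application calls for.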

