
  • How Do I Calculate Post-Estimation Margins/Predicted Probabilities After Running a Subpopulation Analysis?

    Can someone help me calculate predicted probabilities after running a subpopulation analysis using logistic regression?

    For example, I just ran the following:

    svy, subpop (hispanicorigin): logistic mammogramadherent i.legalstatus
    svy, subpop (hispanicorigin): logistic mammogramadherent i.legalstatus i.age i.education i.employment i.married i.income
    svy, subpop (hispanicorigin): logistic mammogramadherent i.legalstatus i.age i.education i.married i.employment i.income i.healthcare i.insurance


    margins legalstatus, at (age =(0 1 2 3 4 5 )) vsquish post vce(uncond)
    marginsplot, ///
    legend(rows(1) symxsize (5) region(fcolor(none)lcolor(none)) position(6)) ///
    ytitle (Predicted probabilities with 95% CI, size(medlarge)justification(center) color (black)) ///
    title (Citizenship Status , size(large)justification(center) color (black))

    However, I got the following error message:


    . margins legalstatus, at (age =(0 1 2 3 4 5 )) vsquish post vce(uncond)
    at values for factor age do not sum to 1
    r(198);

    Can someone please tell me the proper syntax?


  • #2
    It would be very helpful to see the output of the regression itself. If my guess at an answer is wrong, please post back with that output, as well as example data (use -dataex-).

    My guess is that, within the subset of the data where hispanicorigin is true, and after the observations with missing values on any of the regression variables have been removed, not all of the ages 0, 1, 2, 3, 4, and 5 are instantiated. This would show up in the regression output: Stata will tell you that it has dropped one or more of the corresponding indicators because of collinearity, and the corresponding row in the coefficient table will show that coefficient as "(empty)."
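
    One way to check this guess directly is to tabulate age over the observations that actually entered the estimation. This is only a sketch: it assumes the last svy: logistic model from #1 has just been run, and that hispanicorigin is coded 0/1, which the thread does not confirm.

    ```stata
    * Assumes the final svy, subpop(hispanicorigin): logistic model was just run
    * and that hispanicorigin is coded 0/1 (an assumption).

    * Which age levels occur among the subpopulation observations used in estimation?
    tabulate age if e(sample) & hispanicorigin == 1

    * Are any legalstatus-by-age cells empty?
    tabulate legalstatus age if e(sample) & hispanicorigin == 1
    ```

    If any of the levels 0 through 5 is absent from the first table, that level cannot be requested in the at() option of -margins-.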



    • #3
      Hi, Clyde:

      Thanks for your response. Here is an example of my data.

      dataex mammogramadherent legalstatus in 1/15

      ---------------------- copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte mammogramadherent float legalstatus byte hispanicorigin
      0 2 1
      0 . 0
      1 . 0
      1 . 0
      0 . 0
      1 . 0
      0 . 0
      0 . 0
      0 . 0
      0 . 0
      1 . 0
      1 . 0
      0 . 0
      0 . 0
      1 . 0
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 15 out of 8143 observations



      • #4
        Well, if this really resembles your data, it's hopeless. You have almost no information on the variable legalstatus, so your regression will be reduced to a pathetic handful of observations. Many combinations of hispanicorigin and age will be unattested in the data, and, unsurprisingly, you will not have any meaningful results.

        I should also point out that even if there were no missing data at all in this example, it doesn't shed any light on the problem you originally raised since it doesn't include the age variable.
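
        A more informative example would count the usable observations and include age in the -dataex- listing. The lines below are a sketch only: the 0/1 coding of hispanicorigin and the choice of 20 observations are assumptions, and count() is used here on the understanding that it sets how many observations -dataex- lists.

        ```stata
        * How many subpopulation observations have the key variables non-missing?
        count if hispanicorigin == 1 & !missing(mammogramadherent, legalstatus, age)

        * Example data restricted to observations that could enter the regression
        dataex mammogramadherent legalstatus age hispanicorigin if hispanicorigin == 1 & !missing(legalstatus, age), count(20)
        ```

        The remaining covariates from the full model could be added to both commands in the same way.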
