
  • Correct interpretation about Average Adjusted Predictions (margins)

    Hi all,

    I am using the margins command after a logistic regression to get predicted probabilities. Unfortunately, I get different results from margins for Adjusted Predictions at the Means and for Average Adjusted
    Predictions.

    What is the difference between these commands:

    margins black , atmeans
    and
    margins black
    and
    margins, over(black)

    Could you give me a good definition of the last two commands? My native language is Spanish, which is probably why I have trouble understanding the differences. Are the last two an incorrect way to get probabilities?

    Here is an example of my problem:

    webuse nhanes2f, clear

    logit diabetes i.black i.female age, nolog

    margins black , atmeans
    margins black
    margins, over(black)

    Thanks in advance
    Regards
    Last edited by Rodrigo Badilla; 04 Sep 2016, 18:17.

  • #2
    Let's start with -margins black-. First, Stata goes through the data, setting black = 0 in every observation in the data set, and uses -predict, pr- to calculate a predicted probability for each observation as if it were non-black but otherwise the same as it originally was. Then it averages those predictions and outputs the average. Then it goes back and sets black = 1 in every observation, and re-calculates all the individual predicted probabilities, averages all of those, and outputs that average. So the outputs of this command are the average predicted probabilities that would be observed in your study population if everyone were black, or if everyone were not black, all else being left as it was in the data.
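
    The steps just described can be reproduced by hand. This is only a minimal sketch, assuming the nhanes2f example from the original post; the names black_obs, p0, p1, aap0, and aap1 are made up for illustration and are not part of -margins-.

    ```stata
    webuse nhanes2f, clear
    logit diabetes i.black i.female age, nolog
    margins black

    * Keep a copy of the observed values so they can be put back afterwards
    clonevar black_obs = black

    * Everyone treated as non-black, all else as observed
    replace black = 0
    predict double p0 if e(sample), pr

    * Everyone treated as black, all else as observed
    replace black = 1
    predict double p1 if e(sample), pr

    * Put the data back as they were
    replace black = black_obs
    drop black_obs

    * Average each set of predictions over the estimation sample
    summarize p0, meanonly
    scalar aap0 = r(mean)
    summarize p1, meanonly
    scalar aap1 = r(mean)
    display "black = 0: " aap0 "   black = 1: " aap1
    ```

    The two displayed averages should agree with the Margin column from -margins black-. The hand calculation gives no standard errors; those are what -margins- adds.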

    Now let's look at -margins black, atmeans-. This time, Stata starts by calculating the mean values of every variable in the model except black (in this case, female and age), and sets all of those variables equal to their means in every observation. Then once again it sets black = 0 in every observation, calculates predicted probabilities for each individual, and averages those (since they are all the same, the average is equal to every one of the individual predicted values), and gives that as the output. Then it goes back and resets black to 1 in every observation, again calculates predicted probabilities for each individual (again, they are, in this case, all the same), and outputs the average of those. So these outputs can be interpreted as the predicted probabilities for black and non-black individuals if they were all average in all other respects. This can be a little strange for categorical variables like female: the calculations treat each person as if he/she were 52% female and 48% male. Anyway, these are the probabilities you would expect to observe if everyone were exactly average in all other respects, but were either all black or all non-black.
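
    Because every observation gets the same covariate values here, the calculation reduces to a single prediction per level of black. A rough sketch, again assuming the nhanes2f example from the original post; the scalar names m_female, m_age, xb0, apm0, and apm1 are hypothetical.

    ```stata
    webuse nhanes2f, clear
    logit diabetes i.black i.female age, nolog

    * Means of the other covariates over the estimation sample
    summarize female if e(sample), meanonly
    scalar m_female = r(mean)
    summarize age if e(sample), meanonly
    scalar m_age = r(mean)

    * Linear predictor for a non-black person at the means;
    * the indicator for female is set to the sample proportion female
    scalar xb0 = _b[_cons] + _b[1.female]*m_female + _b[age]*m_age

    * Convert to probabilities; adding _b[1.black] switches black to 1
    scalar apm0 = invlogit(xb0)
    scalar apm1 = invlogit(xb0 + _b[1.black])
    display "black = 0: " apm0 "   black = 1: " apm1

    margins black, atmeans
    ```

    The displayed values should match the Margin column of -margins black, atmeans-. The term _b[1.female]*m_female is where the "partly female" person enters the calculation.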

    Finally -margins, over(black)- is different. Stata goes through the data set and calculates the predicted probabilities of each person in the data set using their data exactly as is: nothing is changed. Then it averages the predicted probabilities for blacks, and averages the predicted probabilities for non-blacks and outputs those averages. So in this case, the black and non-black outputs are based only on the data for blacks and only on the data for non-blacks respectively.
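
    This one needs no changes to the data at all, so a hand version is short. A minimal sketch under the same assumptions as the original post's example; phat, ov0, and ov1 are illustrative names only.

    ```stata
    webuse nhanes2f, clear
    logit diabetes i.black i.female age, nolog

    * Predictions from the data exactly as observed
    predict double phat if e(sample), pr

    * Average the predictions separately within each group
    summarize phat if black == 0, meanonly
    scalar ov0 = r(mean)
    summarize phat if black == 1, meanonly
    scalar ov1 = r(mean)
    display "non-blacks: " ov0 "   blacks: " ov1

    margins, over(black)
    ```

    The two group averages should match the output of -margins, over(black)-. Since the groups also differ in age and sex, the gap between them mixes the effect of black with those differences, unlike the -margins black- results.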

    FWIW, at least in my field, epidemiology, we mostly use the -margins black- version, which gives us expected probabilities for a population under different conditions. Sometimes, but less often, we are interested in predicted probabilities for a single person who is, in all respects, average, but under different conditions. For that, the -margins black, atmeans- command serves. I have yet to encounter any uses of -margins, over(black)- in practice, though I'm sure that it does come up in some situations.

    I hope this helps.


    • #3
      Thanks Clyde for your detailed reply...

      I just want to use the margins command in the epidemiology field. Unfortunately, the literature is not clear on this topic, and some sources do not recommend using Average Adjusted Predictions to explain a logistic model.

      I appreciate the perspective from your field; this is a good guide for me...

      Regards
      Rodrigo
