  • Average marginal effects for factor variables using margins

    I've read the manual, but I'm still unclear how the dydx() option of margins works for factor variables (e.g., after logistic regression). Consider a factor variable with 2 levels, with the first as the reference. My understanding is that, for each case, the margins command calculates the predicted probability with the second-level dummy set to 1 and the predicted probability with it set to 0, takes the difference, and then averages these differences over all cases.

    How does this work with 3-level factor variables? If we are interested in the effect of level 2 relative to level 1, does it also do this calculation for cases that are observed at level 3? In other words, are the marginal effects for level 2 defined to be zero for cases where level 3 is true? Or does it simply do the counterfactual calculation in the paragraph above, ignoring the observed values of the other levels?

  • #2
    If your factor variable has 3 levels, the output of -margins, dydx(factor_variable)- will report two effects. One is the marginal effect of level 2 vs level 1, and the other is the marginal effect of level 3 vs level 1.


    • #3
      But how is the average marginal effect of level 2 vs 1 calculated? Is it also calculated for cases where the observed level is 3?


      • #4
        "But how is the average marginal effect of level 2 vs 1 calculated?"

        In effect (perhaps literally; I have not reviewed the code in -margins-), the factor variable is set to 1 in every observation in the estimation sample (yes, including those where originally it was 3), and -predict- is applied. The average result of -predict- is the average margin for level 1. Then the factor variable is set to 2 in every observation in the estimation sample (yes, including those where originally it was 3), and -predict- is applied again. The average result of -predict- is the average margin for level 2. The average marginal effect is then calculated as the difference between those two average margins.
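The procedure described in #4 can be sketched outside Stata. The following is a minimal Python illustration, not the -margins- source code: the coefficients, the covariate x, and the six-observation sample are all made up, and a logit model with level 1 as the base is assumed. It also checks that the difference of the two average margins equals the average of the case-by-case differences, so the description in #1 and the one in #4 give the same number.

```python
import math

# Hypothetical fitted logit (all numbers invented for illustration):
# intercept, level effects with level 1 as the base, and one covariate x.
B0 = -1.0
B_LEVEL = {1: 0.0, 2: 0.8, 3: 1.5}
B_X = 0.05

# Made-up estimation sample: (observed level, x).
sample = [(1, 10.0), (2, 20.0), (3, 30.0), (3, 5.0), (1, 25.0), (2, 15.0)]


def predict(level, x):
    """Predicted probability from the hypothetical logit model."""
    xb = B0 + B_LEVEL[level] + B_X * x
    return 1.0 / (1.0 + math.exp(-xb))


def average_margin(level):
    """Set the factor to `level` in EVERY observation (whatever its
    observed level was), keep x as observed, and average the predictions."""
    return sum(predict(level, x) for _, x in sample) / len(sample)


def ame(level, base=1):
    """Average marginal effect: difference between two average margins."""
    return average_margin(level) - average_margin(base)


def ame_casewise(level, base=1):
    """Same quantity computed as the average of per-case differences."""
    diffs = [predict(level, x) - predict(base, x) for _, x in sample]
    return sum(diffs) / len(diffs)


print("AME, level 2 vs 1:", ame(2))
print("AME, level 3 vs 1:", ame(3))
```

Note that the observed level of each case is never used in `average_margin`; cases observed at level 3 contribute to the 2-vs-1 effect exactly like all the others.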


        • #5
          OK. This makes sense, but it is more complicated than I thought. I'm used to thinking of the factor variable as a set of dummies (v1 v2 v3). So when you say the factor variable is set to 1, I presume you mean that v1=1, v2=0, v3=0. When you say that the factor variable is set to 2, this means that v1=0, v2=1, v3=0. So all three level dummies are being adjusted to counterfactual values - even the dummies that are not part of the comparison.


          • #6
            "So when you say the factor variable is set to 1, I presume you mean that v1=1, v2=0, v3=0. When you say that the factor variable is set to 2, this means that v1=0, v2=1, v3=0. So all three level variables are being adjusted to counterfactual values - even the dummies that are not part of the comparison."

            That is correct.
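To make the dummy-variable reading in #5 concrete, here is a small Python sketch (not Stata code). The names v1, v2, v3 follow the poster's notation; the helper functions and the three data rows are hypothetical. It shows that setting the factor to level 2 rewrites all three indicators in every row, including a row observed at level 3, while other covariates are left as observed.

```python
DUMMY_NAMES = ("v1", "v2", "v3")


def dummies(level):
    """Indicator values (v1, v2, v3) implied by a factor level."""
    if level not in (1, 2, 3):
        raise ValueError(f"unknown level: {level}")
    return {name: int(level == k) for k, name in enumerate(DUMMY_NAMES, start=1)}


def set_factor(rows, level):
    """Counterfactual copy of the data: every row gets the dummies for
    `level`; all other variables keep their observed values."""
    return [{**row, **dummies(level)} for row in rows]


# Made-up rows observed at levels 1, 2 and 3, with one other covariate x.
rows = [
    {"v1": 1, "v2": 0, "v3": 0, "x": 10.0},
    {"v1": 0, "v2": 1, "v3": 0, "x": 20.0},
    {"v1": 0, "v2": 0, "v3": 1, "x": 30.0},
]

at_level_2 = set_factor(rows, 2)
print(at_level_2[2])  # {'v1': 0, 'v2': 1, 'v3': 0, 'x': 30.0}
```

Switching on v2 alone while leaving v3=1 would describe a case that is at level 2 and level 3 at once, which cannot occur; resetting the whole set of dummies avoids that.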


            • #7
              You might want to look at

              https://www3.nd.edu/~rwilliam/stats/Margins01.pdf

              The description of model 4 on slide 18 is especially relevant for you.
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              StataNow Version: 19.5 MP (2 processor)

              WWW: https://www3.nd.edu/~rwilliam


              • #8
                Thanks for this. A very useful introduction.
