
  • Marginal Effects after Ordered probit regression

    Hello!

    I am running an ordered probit model and would like to obtain marginal effects afterwards. My outcome variable has 5 levels (strongly disagree to strongly agree) and my predictor variable measures the distance to the closest park in km.
    I am not sure how to interpret the marginal effects after my regression:

    Code:
    oprobit community_meeting distParks i.education i.gender age i.living_condition degUrban i.province_dummy if excl_analysis == 0, robust cluster(district_dummy)

    margins, dydx(distParks)

    Please see attached a screenshot of the estimated marginal effects table.
    I do not understand what my reference group is, since, as far as I understand, the numbers represent a change in probability.

    Any help is very much appreciated! Thank you so much in advance.

  • #2
    You are calculating changes in probability when distance increases by 1 km from its current value, keeping all other covariates fixed. Then you take the average across the estimation sample. You do that for all 5 options.

    The comparison is not to some reference group, but to the current, slightly closer scenario. It looks like added distance makes "agree" a tiny bit more likely, 0.0006 on a [0, 1] scale. There are insignificant drops in the disagree categories and a non-significant increase in "strongly agree". All these changes add up to 1.

    I would suggest dividing distance by 10 before fitting the model, so these effects would be in terms of a 10 km change. This will make the effects less tiny since they would correspond to a bigger change. You can also do that with margins, but it's more of a pain.
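
    A minimal sketch of that rescaling idea, reusing the variable and option names from your command in #1 (illustrative only, not a prescription):

    Code:
    * express distance in units of 10 km so a one-unit change is a 10 km change
    gen distParks10 = distParks/10
    oprobit community_meeting distParks10 i.education i.gender age i.living_condition degUrban i.province_dummy if excl_analysis == 0, robust cluster(district_dummy)
    margins, dydx(distParks10)
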
    Last edited by Dimitriy V. Masterov; 29 Jul 2023, 10:29.



    • #3
      Double post.



      • #4
        I'm guessing the outcome variable is an answer to something like "How likely are you to attend a community meeting?", with outcome 1 indicating the highest confidence? As you move farther from the park, the probabilities of categories 1, 2, and 3 fall and those of 4 and 5 increase. Dimitriy provided the interpretation. Note that those margins add to zero, as they must. So categories 4 and 5 become more likely as distance increases.

        I might try using log(distance) if distance >0 always.
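
        A minimal sketch of that log(distance) suggestion, again borrowing the names from #1 and assuming distParks is strictly positive; a one-unit change in the log corresponds to multiplying distance by e:

        Code:
        * log transform so effects are per proportional change in distance
        gen ln_distParks = ln(distParks)
        oprobit community_meeting ln_distParks i.education i.gender age i.living_condition degUrban i.province_dummy if excl_analysis == 0, robust cluster(district_dummy)
        margins, dydx(ln_distParks)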



        • #5
          Thank you so much for your help, that makes a lot of sense! How do I know that the distance increases by 1 km from its current value / how is this distance chosen? I also ran the regression with the degree of urbanisation as a predictor variable, which is continuous and ranges from 1 to 3. I was wondering what the change in the level of urbanisation is for each probability change in the outcome variable? And how exactly do these changes add up to 1?

          Thank you so much for your help in advance!!



          • #6
            "How do I know that the distance increases by 1 km from its current value / how is this distance chosen?"
            You do not know that the distance increases by 1 km from its current value. What the marginal effect tells you is a hypothetical situation: if you were to compare the probabilities of each response level for two people, one of whom represents the "average" responder in the data and the other of whom lives 1 km farther from the closest park, the differences in the corresponding probabilities are expected to be what you see in the -margins- output. The reason it is 1 km is that, according to what you said, the predictor variable is in units of km. A marginal effect is always per unit change in the predictor variable. So the unit is whatever the unit of the predictor variable is.

            "And how exactly do these changes add up to 1?"
            They don't. The response in #2 to that effect is incorrect, and it was corrected in #4. They add up to zero. They have to, because the response probabilities themselves must add up to one in all situations. So if something changes (e.g. a 1 km change in distance to closest park) the response probabilities must sum to 1 both before and after. The sum of the changes in the individual probabilities must equal the change in the sum of the probabilities. Since the sum is 1 in both situations, the change in the sum of the probabilities is 0. So the sum of the changes to the individual probabilities is 0.
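
            A quick way to check this numerically, assuming the model in #1 has just been fit (this relies on -margins- leaving its estimates in r(b)):

            Code:
            margins, dydx(distParks)
            matrix b = r(b)
            * the five average marginal effects should sum to (approximately) zero
            mata: sum(st_matrix("b"))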



            • #7
              As others already pointed out, the marginal effects add up to zero and the probabilities add up to one. I mangled that sentence, but too late to edit now.



              • #8
                For ordered regression models like oprobit or ologit one additional point may be noteworthy...

                As the dependent variable increases from its smallest to its largest value (0 to 4 in your example) the sign of the marginal effect must change once and only once. See p. 656 in Jeff Wooldridge's textbook for related discussion:
                https://mitpress.mit.edu/97802622325...nd-panel-data/

                This restriction may or may not be important in any particular application. It does imply, however, that a change in the x-variable cannot be "spread-increasing" (e.g. can't increase the conditional probability of both the smallest and the largest values of the outcome variable) as this would imply two changes in sign.

                Note that this result applies to the marginal effect for any single observation. It does not mean that the average marginal effect reported by Stata will follow this pattern, as shown by this example:

                Code:
                * simulate data in which x2 has a much larger effect than x1
                set seed 2345
                drop _all
                set obs 1000
                gen x1 = runiform() > .5
                gen x2 = runiform() > .5
                gen ys = x1 + 5*x2 + rnormal()
                * discretize the latent index into 8 ordered categories over [-3, 9]
                gen y = autocode(ys, 8, -3, 9)
                oprobit y x1 x2
                margins, dydx(*)
                Last edited by John Mullahy; 30 Jul 2023, 07:39.



                • #9
                  "A marginal effect is always per unit change in the predictor variable. So the unit is whatever the unit of the predictor variable is."

                  So I am wondering, since my predictor variable is continuous (it is measured in km but with decimal places), what a one-unit change is.
                  Thank you so much for your help already!



                  • #10
                    A unit change always means a change by 1, regardless of the unit or the number of decimal places. A change from, say, 6.75 km to 7.75 km is a unit change: the numerical value of the variable changes by 1.
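
                    If it helps to see this concretely, the discrete change can also be computed directly with -margins, at()-; a sketch using hypothetical distance values and the model from #1:

                    Code:
                    * predicted probabilities at 6.75 km and at 7.75 km, a one-unit (1 km) change
                    margins, at(distParks=6.75) at(distParks=7.75)
                    * pairwise differences between the two scenarios
                    margins, at(distParks=6.75) at(distParks=7.75) pwcompare
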
                    Last edited by Clyde Schechter; 30 Jul 2023, 10:13.



                    • #11
                      An addendum/clarification to #8…

                      The sample average marginal effects for the smallest (say 0) and largest (say M) values of the outcome will necessarily have opposite signs even if there is more than one sign change of the sample average marginal effects over 0 to M.

                      The reason is that, for any conditioning x,


                      DPr(y=0|x)/Dxj > 0 > DPr(y=M|x)/Dxj   if bj < 0, and

                      DPr(y=0|x)/Dxj < 0 < DPr(y=M|x)/Dxj   if bj > 0,

                      where D denotes either a derivative or a discrete change.

                      As such, the sample average of these terms, which are negative for every x or positive for every x, must correspondingly be negative or positive.

                      Unlike for the outcomes 0 and M, the sign of DPr(y=m|x)/Dxj for m in {1,...,M-1} depends on the conditioning value of x.

                      So in this sense, for ordered probit/logit models, a dxj or ∆xj change cannot result in a change in "spread" (in this context meaning a simultaneous increase or a simultaneous decrease in both Pr(y=0|x) and Pr(y=M|x)), either conditionally on x, DPr(y=m|x)/Dxj, or marginally over x, Ex[DPr(y=m|x)/Dxj].

                      Apologies for not being clearer about this when I composed #8 earlier today.
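
                      A small sketch of how one might eyeball the endpoint-sign claim using the simulated example from #8 (the column order of r(b) is assumed to follow the outcome levels):

                      Code:
                      quietly oprobit y x1 x2
                      quietly margins, dydx(x1)
                      matrix b = r(b)
                      * first and last entries are the AMEs for the lowest and highest outcomes;
                      * by the argument above they should have opposite signs
                      display "lowest: " el(b,1,1) "   highest: " el(b,1,colsof(b))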

