  • marginal effects for different groups

    Hi all,

    I would like to estimate some marginal effects for different income levels. I was wondering which is the most appropriate Stata code for doing this.

    (a) Run a separate regression for each group (with an if condition) and estimate the marginal effects

    Code:
    regress dv iv if group==1
    margins, dydx(iv) atmeans
    and so on

    (b) Use an interaction term, like this:
    Code:
    regress dv iv##group
    margins group, dydx(iv) atmeans
    or

    (c)
    Code:
    regress dv iv i.group
    margins, dydx(iv) at(group=(1 2 3))
    Many thanks

    Nikos

  • #2
    (b) and (c) will do the same thing; in my view, (c) is preferable for reasons of clarity and brevity.

    Next, I assume that your regression example is oversimplified compared to your real situation. If all you have is the one predictor variable iv, then the -atmeans- adjustments do nothing at all, and, as it is a linear regression with no interaction (except with group if you use (b)), the marginal effect is just the coefficient of iv. So I assume you actually have several covariates in your model besides iv (and group).
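    A minimal sketch of this point, assuming Stata's shipped auto dataset with price, mpg, and foreign standing in for dv, iv, and group, and weight as an extra covariate (none of these names come from the thread):

    ```stata
    * Stand-ins (assumptions): price = dv, mpg = iv, foreign = group, weight = covariate
    sysuse auto, clear
    regress price mpg weight i.foreign

    * mpg enters linearly with no interaction, so each of these
    * reports the same number: the coefficient on mpg
    margins, dydx(mpg)
    margins, dydx(mpg) atmeans
    margins, dydx(mpg) at(foreign=(0 1))
    display _b[mpg]
    ```

    The options only start to matter once iv appears in an interaction or a nonlinear term.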

    As between (a) and (b), it depends on what you are trying to estimate. They will (almost certainly) produce different results, which mean different things.

    If you follow path (a), you are adjusting each marginal effect separately to the means of all the covariates (except iv) in each group. Consequently, the marginal effect you calculate for any group cannot be compared to the marginal effect you calculate for the other groups because they are all adjusted to different covariate levels. These are conditional marginal effects, with each group's conditioning being different.

    If you follow path (b), then all of the groups' marginal effects are adjusted in the same way (namely, to the means of the entire estimation sample's covariate values.) The 3 group marginal effects can therefore be compared with each other.

    Depending on your research goals, you have to choose the one that fulfills them. If your research goals don't seem to identify which approach is correct, then your research goals (or your understanding of them) are too vague and require clarification.
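    The two paths can be put side by side in a short sketch. It assumes Stata's shipped auto dataset, with price, mpg, weight, and foreign as hypothetical stand-ins for dv, iv, a covariate, and group. It also writes c.mpg, because Stata treats a variable inside ## as categorical unless it carries the c. prefix; whether iv is continuous in the original data is an assumption here.

    ```stata
    sysuse auto, clear

    * (a) separate regressions: every coefficient, including weight's, is group-specific,
    *     and -atmeans- uses the means within each group's estimation sample
    regress price mpg weight if foreign == 0
    margins, dydx(mpg) atmeans
    regress price mpg weight if foreign == 1
    margins, dydx(mpg) atmeans

    * (b) one pooled model: the slope of mpg varies by group, weight has a single
    *     coefficient, and -atmeans- uses the means of the whole estimation sample
    regress price c.mpg##i.foreign weight
    margins foreign, dydx(mpg) atmeans
    ```

    If the model contains nothing besides iv and group, the pooled model in (b) is fully interacted and its group slopes match the point estimates from (a); only the standard errors can differ.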



    • #3
      Thank you Clyde. Your advice is very useful, as usual.
      By using (c), could I argue that I estimate marginal effects for each subpopulation?


      To be honest, I cannot exactly understand the difference between (b) and (c).
      Many thanks again.



      • #4
        "To be honest, I cannot exactly understand the difference between (b) and (c)."
        Good! There is no difference between them. Both commands do the same thing. I said I prefer (c) simply because it is shorter and, to my eyes, easier to understand. But Stata will give you the same response to either one.

        "By using (c), could I argue that I estimate marginal effects for each subpopulation?"
        We are all guilty, much of the time, of throwing around the term "marginal effects" without specifying which of the infinitely many different marginal effects are associated with any model (except simple linear models with no interaction terms--for these models, all of the infinitely many marginal effects are the same and are all equal to the regression coefficient). But when presenting your results you need to be clear about which marginal effects you are talking about. If you use the code in (c) you will be presenting the marginal effects of iv in each group, at the means in the entire sample of all the other variables in the model.
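        To illustrate how several different "marginal effects" can come out of one model, here is a hedged sketch with a nonlinear model. It assumes Stata's shipped auto dataset; the logit of foreign on mpg and weight is a hypothetical example, not the poster's model.

        ```stata
        sysuse auto, clear
        logit foreign mpg weight

        * average marginal effect: dydx evaluated at each observation, then averaged
        margins, dydx(mpg)

        * marginal effect at the means: dydx evaluated once, at the sample means of mpg and weight
        margins, dydx(mpg) atmeans

        * marginal effect at one chosen point
        margins, dydx(mpg) at(mpg=20 weight=3000)
        ```

        The three commands generally report different numbers, which is why a write-up should say which one is being presented.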



        • #5
          Thank you Clyde. I guess I could do the following if I want to estimate the marginal effects at different levels of iv?


          Code:
           
           regress dv iv i.group margins, dydx(iv) at(iv=(1(10)1000))



          • #6
            So, first I assume you meant for these commands to be on two separate lines:
            Code:
            regress dv iv i.group
            margins, dydx(iv) at(iv=(1(10)1000))
            Again, if your entire model is -regress dv iv i.group-, then there is no need for any of the complications of -margins-. In a linear model with no interaction terms, all marginal effects (average, adjusted, conditional, whatever) are the same and they are just the coefficient of iv.

            Assuming that your real model is more complicated than that (non-linear, or linear but with interaction terms involving iv) then the above code will give you marginal effects of iv at 100 different values of iv (1, 11, 21, 31,...,991), and averaged over the observed distributions of all the other variables in the data set. These would be referred to as "average marginal effects of iv at 1, 11, 21, 31,...991."
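            As a sketch of a model where the at() grid does change the answer, assume Stata's shipped auto dataset, with price, mpg, and foreign as hypothetical stand-ins for dv, iv, and group, and a squared term in mpg added purely for illustration:

            ```stata
            sysuse auto, clear

            * c.mpg##c.mpg adds mpg and mpg squared, so the slope of mpg depends on mpg itself
            regress price c.mpg##c.mpg i.foreign

            * average marginal effects of mpg at mpg = 15, 20, ..., 40,
            * averaging over the observed distribution of foreign
            margins, dydx(mpg) at(mpg=(15(5)40))
            ```

            Because the model knows the squared term is built from mpg, -margins- differentiates through it; a hand-generated squared variable would not be handled this way.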



            • #7
              Many thanks Clyde.



              • #8
                Hi all,

                I was just asking myself the same question as Nikos did in 2017, i.e. how to best estimate effects for different income groups, and I'm confused by the answer given here. Looking at the initial post #1, it seems to me that (a) and (b) should do the same thing (with (b) being shorter). When replicating Nikos' example with randomly generated values for the three variables, results for the margins are the same for a given group when estimating model (a) and (b), while (c) does something completely different.



                • #9
                  Lukas, welcome to Statalist.

                  Showing your code and output, using code tags, would help us to better address your question. Otherwise we can’t assess whether you are doing something wrong or whether (b) and (c) really are different. See the Statalist FAQ for tips on asking questions effectively.
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology
                  Stata Version: 17.0 MP (2 processor)

                  WWW: https://www3.nd.edu/~rwilliam



                  • #10
                    Looking at the original post more carefully, I don’t think (c) is the same as (a) and (b). (c) lets each group have a different intercept, but the slope for iv is the same for each group. In (a) and (b), both the slopes and intercepts can vary by group.
                    -------------------------------------------
                    Richard Williams, Notre Dame Dept of Sociology
                    Stata Version: 17.0 MP (2 processor)

                    WWW: https://www3.nd.edu/~rwilliam
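                    The difference between (c) and (b) can be checked with a short sketch. It assumes Stata's shipped auto dataset, with price, mpg, and foreign as hypothetical stand-ins for dv, iv, and group, and it treats iv as continuous (hence the c. prefix in the interaction).

                    ```stata
                    sysuse auto, clear

                    * (c) one common slope for mpg, separate intercepts by group
                    regress price mpg i.foreign
                    margins, dydx(mpg) at(foreign=(0 1))    // same slope reported for both groups

                    * (b) slope of mpg allowed to differ by group
                    regress price c.mpg##i.foreign
                    margins foreign, dydx(mpg)              // one slope per group
                    ```

                    In the first model both rows of the -margins- output repeat the coefficient on mpg; in the second they can differ.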



                    • #11
                      I also usually prefer interaction effects, partly because it is rare that you want every effect to be free to differ across groups, e.g. you may think that the effect of education differs by gender but the effects of other variables are the same. You might also be interested in different groups, possibly leading you to have some interactions based on gender while others involve race. Some handouts on this are at

                      https://www3.nd.edu/~rwilliam/stats2/l51.pdf

                      https://www3.nd.edu/~rwilliam/stats2/l52.pdf

                      https://www3.nd.edu/~rwilliam/stats2/l53.pdf
                      -------------------------------------------
                      Richard Williams, Notre Dame Dept of Sociology
                      Stata Version: 17.0 MP (2 processor)

                      WWW: https://www3.nd.edu/~rwilliam



                      • #12
                        Thank you a lot Richard. This is very helpful.
