  • Using predictive margins in difference-in-differences linear regression

    I am running a difference-in-differences analysis using a linear regression model in which the outcome is an ordinal variable (income categories 1 through 15, where 1 is <5k/year and 15 is >150k/year). My model shows a significant difference between the treated and untreated groups for this outcome, with a coefficient of ~.25 (representing around 1/4 of one income category). I am trying to figure out how to better capture the meaning of this difference (i.e., what does this coefficient translate to in terms of dollars of income?). Is this something I can use predictive margins for?

  • #2
    Your outcome, as described, is ordinal. So having used a linear regression, all you can say is that the expected difference between treated and untreated groups is about 1/4 of one income category. If your categories are all of about the same width, say about 10k per year, then you could, very loosely and as an abuse of language, say that this corresponds roughly to an expected income difference of 2,500 per year, sort of. Even this is stretching it a lot.

    But if your income categories are not all of about the same width, then you cannot say even that. Indeed, you can't really interpret it at all in dollar terms.

    Either way, predictive margins will not rescue this problem. This problem is due to the use of linear regression with a variable that is only ordinal.
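
A minimal sketch of the width arithmetic described in this reply. Every number except the 0.25 coefficient and the 10k width is hypothetical and chosen only for illustration; the region names and unequal widths are assumptions, not values from the poster's data.

```python
# Illustration only: the unequal widths below are hypothetical.
coef = 0.25  # reported coefficient, in units of "income categories"

# Case 1: every category is assumed to be about 10,000 wide.
equal_width = 10_000
rough_dollar_effect = coef * equal_width

# Case 2: hypothetical unequal widths. The same coefficient now maps to
# a different dollar amount depending on where in the scale it falls,
# so no single dollar figure can be attached to it.
unequal_widths = {"low": 5_000, "middle": 10_000, "high": 50_000}
dollar_effect_by_region = {
    name: coef * width for name, width in unequal_widths.items()
}

print(rough_dollar_effect)
print(dollar_effect_by_region)
```

With equal widths there is one loose conversion; with unequal widths the "conversion" depends on the part of the scale, which is why the coefficient has no dollar interpretation in that case.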


    • #3
      Thanks, Clyde! This is as I feared, and your response is very helpful. Could I use predictive margins to predict an income category for the treated and untreated groups, separate from my difference-in-differences estimate?


      • #4
        Well, you can calculate predictive margins, in the sense that you can run the -margins- command and it will show you some numbers. The question is what, if anything, those numbers mean in your context.

        The problem is that predictive margins are means; they are calculated the way you would calculate the mean of any interval or ratio level variable. But the very notion of the mean of an ordinal variable is undefined. For example, if you just had two study subjects and one was in income category 5 and the other was in income category 9, the mean of those income categories is 7. But does income category 7 actually reflect being midway between income category 5 and income category 9 in any real sense? If your categories are all of equal width and evenly spaced in terms of dollars, then, yes, income category 7 is, in some sense, midway between categories 5 and 9, and so this kind of calculation of means has some interpretability. But if your categories are irregular, then it really makes no sense.
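
The two-subject example above can be sketched as follows. The dollar midpoints are hypothetical (the thread gives no category boundaries); they are picked only to contrast evenly spaced categories with irregular ones.

```python
from statistics import mean

# Two study subjects, in income categories 5 and 9 (from the example above).
categories = [5, 9]
mean_category = mean(categories)  # the "mean category"

# Hypothetical dollar midpoints for categories 5, 7 and 9.
equal_midpoints = {5: 45_000, 7: 65_000, 9: 85_000}      # evenly spaced
irregular_midpoints = {5: 22_500, 7: 37_500, 9: 87_500}  # uneven widths


def mean_dollars(midpoints):
    """Average the dollar midpoints of the two subjects' categories."""
    return mean(midpoints[c] for c in categories)


# Evenly spaced: the mean category's midpoint equals the mean of the dollars.
print(mean_dollars(equal_midpoints), equal_midpoints[mean_category])

# Irregular: the mean category no longer sits midway in dollar terms.
print(mean_dollars(irregular_midpoints), irregular_midpoints[mean_category])
```

Under the evenly spaced assumption the two printed values agree; under the irregular one they do not, which is the sense in which the mean of an ordinal variable loses its interpretation.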


        • #5
          Thanks. Makes sense. The income categories are not of equal width, and unfortunately I don't have actual income, just income category. Based on what you are saying, it seems like there isn't a good way to translate a change in mean income category to an actual dollar change in income. Does that sound right?


          • #6
            That sounds right. This kind of categorized income variable can be used effectively as an independent variable in regression models (treated as a polytomous categorical variable), but it is hard to use as a dependent variable. I don't know if it will be suitable for your purposes, but have you looked into ordinal logistic regression (-ologit-, in Stata)? It still won't give you dollar amounts, but presenting your results as probabilities of being in particular income categories might be suitable.
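
A rough sketch of what "probabilities of being in particular income categories" means under an ordered logit, where P(y <= k) is the logistic function of (cutpoint k minus the linear predictor). The cutpoints, the four-category scale, and the linear-predictor values are all hypothetical; in practice they would come from a fitted model such as -ologit-, and this is not output from the poster's data.

```python
import math


def category_probabilities(xb, cutpoints):
    """Return P(y = k) for each category under an ordered logit.

    xb is the linear predictor; cutpoints must be in increasing order.
    """
    # Cumulative probabilities P(y <= k) at each cutpoint.
    cdf = [1.0 / (1.0 + math.exp(-(c - xb))) for c in cutpoints]
    upper = cdf + [1.0]
    lower = [0.0] + cdf
    # Category probabilities are differences of adjacent cumulative values.
    return [u - l for u, l in zip(upper, lower)]


# Hypothetical cutpoints for a 4-category outcome (not the 15 in the thread).
cutpoints = [-1.0, 0.0, 1.0]

# Hypothetical linear predictors for an untreated and a treated subject.
p_untreated = category_probabilities(0.0, cutpoints)
p_treated = category_probabilities(0.5, cutpoints)

for k, (pu, pt) in enumerate(zip(p_untreated, p_treated), start=1):
    print(f"category {k}: untreated {pu:.3f}, treated {pt:.3f}")
```

The results are stated as shifts in the probability of each category rather than as a change in a "mean category", so no assumption about category widths is needed.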


            • #7
              Hi Clyde - just coming back to this. Thanks very much for the tip re: ordinal logistic regression. Sounds like that might be the best way to meaningfully interpret the changes.
