  • Fixed effects probit and logit models with marginal coefficients evaluated at means


    I have some survey data and a dummy outcome variable. The data include the region in which each individual lives, and I'd like to use this for region fixed effects. Ideally, I'd like to use a fixed effects probit model with marginal effects evaluated at the means, because this would make my results directly comparable to those in another paper.

    However, I've read on this forum that fixed effects with probit is biased. I then planned on using a fixed effects logit as a robustness check but I've read that these results can't be presented as marginal coefficients evaluated at their means. I still intend on using a fixed effect probit if possible so that I can compare my results. As I don't think Stata allows for FE probit, I've used dummy region variables. Does this have the same effect as a fixed effect probit?

    Is it correct that I cannot use a fixed effects logit with marginal coefficients, or is there a code that is suitable for this? Would using dummy variables for each region be a suitable way around using the fixed effects command, and therefore being able to evaluate as marginal coefficients at means?

    Below is the code I was planning to use. As there is no FE probit available, I've just included dummies for regions.

    Probit code:

    Code:
    probit TRUST rage rage2 female married fulltime parttime highinc i.regions
    * the command is -margins-; the factor variable is listed by name in dydx()
    margins, dydx(rage rage2 female married fulltime parttime highinc regions) atmeans
    Logit code:

    Code:
    clogit TRUST rage rage2 female married fulltime parttime highinc, group(regions)
    * after -clogit-, -margins- by default uses predict(pu0), which sets the fixed effect to zero
    margins, dydx(rage rage2 female married fulltime parttime highinc) atmeans
    Thanks in advance.

  • #2
    Dear Sally Kennedy,

    How many observations do you have for each region? More precisely, what is the minimum number of observations in a region?

    Best wishes,

    Joao


    • #3
      Joao Santos Silva, thanks for your quick response. The minimum number of observations in a region is 190. There are 12 regions with a total of 5780 observations.

      To be precise:
      Observations in each region are 190, 291, 313, 320, 449, 491, 530, 539, 541, 542, 735, 839.
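
      Counts like these can be obtained along the following lines. This is only a sketch: it assumes the region identifier is the variable regions used in post #1, and n_region is a hypothetical name introduced here for illustration.

      ```stata
      * Sketch only: assumes the survey data are in memory and that
      * -regions- identifies the region of each respondent.
      tabulate regions, sort

      * n_region (hypothetical name) holds the size of each respondent's region;
      * the minimum reported by -summarize- is the smallest region.
      bysort regions: generate n_region = _N
      summarize n_region
      ```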

      I'm also increasing the control variables in different regressions to a maximum of 15.

      Thanks,
      Sally


      • #4
        As Joao was alluding to, there is no problem with including regional dummies when those are the sample sizes. Your confusion stems, regrettably, from sloppiness in how the phrase "fixed effects" is used these days. The term has a very specific meaning in panel data contexts: it means allowing unobserved heterogeneity at the unit level. So, putting in a dummy variable for every individual in a probit or logit is a bad idea (unless T is pretty large) because of the incidental parameters problem. But putting in 12 regional dummies causes no problem with those sample sizes.

        I've been lamenting the use of terms like "regional fixed effects," "occupational fixed effects," and even "race fixed effects" for some time. These are not "fixed effects" in the usual sense. We used to just call them dummy variables.
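
        A minimal sketch of the regional-dummy approach, offered as an illustration rather than as anyone's definitive specification. It reuses the variable names from post #1 and assumes that rage2 is the square of rage and that the remaining regressors are coded 0/1; if either assumption fails, the factor-variable notation below would need adjusting.

        ```stata
        * Sketch only: assumes the survey data from post #1 are in memory,
        * that rage2 = rage^2, and that the other regressors are coded 0/1.
        probit TRUST c.rage##c.rage i.female i.married i.fulltime i.parttime i.highinc i.regions

        * Joint test of the 11 region indicators (one region is the base category).
        testparm i.regions

        * Marginal effects evaluated at the sample means of the covariates.
        margins, dydx(*) atmeans
        ```

        Writing age as c.rage##c.rage lets margins account for the squared term when it computes the effect of age. With rage and rage2 entered as two separate variables, margins treats them as unrelated regressors.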


        • #5
          Dear Sally,

          Thank you for providing the additional information. I hope all is clear now.

          Best wishes,

          Joao


          • #6
            Thanks for your help.

            I'm not sure if I was clear in my first post but I have cross-sectional data. In cross-sectional data, are you saying that what I have wrongly called fixed effect probit is the same as a probit with dummy variables for regions?

            Additionally, clogit produces slightly different results than the logit with regional dummy variables. Is this expected?

            Thanks.


            • #7
              Sally: Yes, in my view, the name "fixed effects" is being regularly misused. The name has positive connotations because of its robustness in panel data and clustering contexts, and so it has been used in other situations. My assumption that you were using panel data is a good case in point.

              Where does it end? Should we say "gender fixed effects" when we put in an indicator for gender? If we put income into different bins and define dummy variables, have we used "income fixed effects"? Sorry, it's a pet peeve of mine.

              Another case in point about the confusion caused by the name "fixed effects": clogit is not the same as putting in dummy variables. It is a conditional MLE that removes the heterogeneity using a conditioning argument. If you had many regions and few people per region, you would not want to simply put in the region dummies. That would be the incidental parameters problem. You would use clogit or a correlated random effects probit. For your setting, just put the dummy variables into probit or logit; there is no reason to use the more restrictive assumptions underlying clogit.
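
              The difference between the two estimators can be seen in a small simulation. The sketch below is purely illustrative: the data are simulated, and the variable names (x, alpha, y), the seed, and the parameter values are all made up for the example and have nothing to do with the survey data in this thread.

              ```stata
              * Sketch only: simulated data with 12 large groups, as in this thread.
              clear
              set seed 12345
              set obs 6000
              generate regions = ceil(12 * runiform())    // 12 groups, roughly 500 obs each
              generate x = rnormal()
              generate alpha = regions / 12                // made-up region effect
              generate y = runiform() < invlogit(0.5 * x + alpha)

              * Unconditional logit with a dummy variable for each region.
              logit y x i.regions
              estimates store dummies

              * Conditional logit: the region effects are conditioned out, not estimated.
              clogit y x, group(regions)
              estimates store conditional

              * Compare the slope on x from the two estimators.
              estimates table dummies conditional, keep(x) b se
              ```

              With a few large groups, the two slope estimates should be close but not identical, because the two commands maximize different likelihoods. Note also that clogit reports no region coefficients at all.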

              Jeff


              • #8
                Dear all, thank you for a very helpful discussion. I just wanted to confirm that the same reasoning applies to including year dummies in probit models on cross-sectional survey data (the unit of observation is one individual). In those cases, assuming a sufficient number of individuals per country/year, including country or year dummies in a probit model should not be a problem, correct?
