
  • Bootstrapping standard errors for logit interaction term

    Dear all,

    For a robustness check in a logit model, I want to test a specification that would require clustering with a low number of clusters (~30). Although I found a suitable Stata package for calculating standard errors in such cases (boottest), I am currently a bit stuck.

    The (simplified) model with a dummy-continuous variable interaction term looks like this:
    Code:
    logit funding population tax_revenue i.party##c.vote_share i.region i.year, vce(cluster municipality_ID)
    I want to assess whether the interaction is statistically significant (i.e. whether the differences between the groups are statistically significant, or at least make a statement about the average marginal effects for one of the groups in the interaction). As I understand it, unlike in OLS, simply testing the interaction term (i.party#c.vote_share) in a logit model is not sufficient for any statement about significance. Therefore I guess the following command would not produce the desired information:

    Code:
    boottest 1.party#vote_share, cluster(region)
    Does anyone have an idea (with boottest or any other package) of how to address this issue?

  • #2
    ...unlike in OLS, simply testing the interaction term (i.party#c.vote_share) in a logit model is not sufficient...
    What does this mean?

    If you think your -logit- model is reasonable, I don't get why you don't just use the significance test for the interaction term that comes on the -logit- output. If for some reason you don't like using Wald tests, you could also just re-run the -logit- leaving out the interaction term and then run -lrtest- to get the real deal, the likelihood-ratio test.
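
    A minimal sketch of both tests, reusing the variable names from #1 (untested; note that Stata will not run -lrtest- after models fit with a cluster-robust VCE unless forced, so both models are refit with the default VCE for that comparison):

    Code:
    * Wald test of the interaction term, directly after the original model
    quietly logit funding population tax_revenue i.party##c.vote_share i.region i.year, vce(cluster municipality_ID)
    testparm i.party#c.vote_share

    * Likelihood-ratio test: refit both models without vce(cluster)
    quietly logit funding population tax_revenue i.party##c.vote_share i.region i.year
    estimates store full
    quietly logit funding population tax_revenue i.party c.vote_share i.region i.year
    estimates store restricted
    lrtest full restricted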

    Finally, if you really want a bootstrap standard error, -logit- does allow the -vce(bootstrap)- option.
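
    For example (a sketch only; reps(500) is arbitrary, and this is an ordinary pairs bootstrap that resamples whole municipality_ID clusters):

    Code:
    logit funding population tax_revenue i.party##c.vote_share i.region i.year, vce(bootstrap, reps(500) cluster(municipality_ID))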

    I don't know the -boottest- command, as it is not part of official Stata and not one I have installed or used. Perhaps your question relates in some specific way to that command and I am missing your point?



    • #3
      Dear Clyde,

      thank you very much for your reply.

      I got a bit confused by Ai/Norton (2003: 124), who write that "the statistical significance of the interaction effect [in logit/probit models] cannot be tested with a simple t-test on the coefficient of the interaction term b12". My understanding was therefore that I cannot simply look at i.party#c.vote_share, or am I overcomplicating things?

      I was relying on boottest instead of vce(bootstrap) because, after reading a lot of the literature, my understanding was that this is necessary given the small number of clusters (~30). For nonlinear models the boottest command employs the score bootstrap, which builds on the wild bootstrap and is better suited to a small number of clusters.



      • #4
        Florian: I think part of your confusion might be with respect to "interaction effect" vs "interaction term." In a linear model they are the same; in a nonlinear model (e.g. logit) they generally will differ. In the nonlinear setting neither the interaction effect nor its significance can be assessed simply by looking at the interaction term (b12) and its significance. The most straightforward way to understand the interaction effect and its significance is to use margins after logit; in general there would be no need to bootstrap if you do so.
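
        A minimal sketch of that margins-based approach, reusing the variable names from #1 (untested; pwcompare(effects) is one way to test the difference between the two marginal effects):

        Code:
        quietly logit funding population tax_revenue i.party##c.vote_share i.region i.year, vce(cluster municipality_ID)
        * average marginal effect of vote_share within each party group
        margins, dydx(vote_share) over(party) vce(unconditional)
        * pairwise difference between those marginal effects, with a test
        margins, dydx(vote_share) over(party) vce(unconditional) pwcompare(effects)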



        • #5
          Dear John,
          thanks for your reply. Maybe I was a bit imprecise about interaction effect vs. interaction term and presented the problem in an overly simplified and shortened way. What I would normally do is run the logit and then use the margins command to investigate the interaction effect:
          Code:
          logit funding population tax_revenue i.party##c.vote_share i.region i.year, vce(cluster municipality_ID)
          margins, dydx(vote_share) over(party) vce(unconditional)
          The problem I have is that when clustering at a different level (region), the number of clusters becomes too small, and the standard errors are therefore at risk of being downward-biased. For this reason I aimed to use the score bootstrap, which is more suitable for a small number of clusters.

          My naive understanding is that one would need to correct the standard errors either before calculating margins or afterwards, as the standard errors produced by margins itself would then be incorrect. Would it be necessary to correct the standard errors prior to using margins, or by bootstrapping the calculated average marginal effects? I also tried the latter but do not manage to get boottest working with the posted margins results.
          Code:
          quietly logit funding population tax_revenue i.party##c.vote_share i.region i.year, vce(cluster municipality_ID)
          quietly margins, dydx(vote_share) over(party) vce(unconditional) post
          boottest 1.party#vote_share
          (note: constraint vote_share caused error r(111))
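
          One possible workaround would be to bootstrap the contrast of the marginal effects directly with the -bootstrap- prefix instead of -boottest- (an untested sketch: reps(500) and the seed are arbitrary, a pairs cluster bootstrap over only ~30 regions shares the small-cluster caveats above, and since i.region is both in the model and the resampling cluster, -bootstrap-'s idcluster() option may also be worth a look):

          Code:
          capture program drop ame_diff
          program define ame_diff, rclass
              * contrast of the average marginal effect of vote_share between the party groups
              quietly logit funding population tax_revenue i.party##c.vote_share i.region i.year
              quietly margins r.party, dydx(vote_share)
              matrix b = r(b)
              return scalar diff = b[1,1]
          end
          bootstrap diff=r(diff), reps(500) seed(12345) cluster(region): ame_diff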



          • #6
            Re #4. I think that the handling of interactions in non-linear models is controversial. I happen to agree with John Mullahy's advice that this is best handled using -margins- and then contrasting the different predicted margins. But you can find authors who assert strongly that this is misguided and that the "true" test of the interaction is based on the t-test of the interaction term in the regression output. I think one has to acknowledge both sides of this disputed area.

            This seems to be an area where people have some pretty strong, and diametrically opposed opinions. For my part, I think the difference is semantic, and that the choice of approach depends on the specific question being posed. My personal experience is that the question being posed is typically the one answered by using -margins-, though there could be real-world circumstances where the opposite is true.
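
            For concreteness, a sketch of what contrasting the predicted margins could look like here (untested; the vote_share evaluation points 0.25/0.50/0.75 are purely illustrative):

            Code:
            quietly logit funding population tax_revenue i.party##c.vote_share i.region i.year, vce(cluster municipality_ID)
            * predictive margins for each party group at selected values of vote_share
            margins party, at(vote_share=(0.25 0.50 0.75))
            * contrasts of those predicted margins across party, with tests
            margins r.party, at(vote_share=(0.25 0.50 0.75))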



            • #7
              Dear Clyde,
              so if I understand your post correctly, you would suggest just relying on margins (and, as a consequence, clustering at a level that is still considered reasonable, i.e. making sure that there are at least ~50 clusters)?



              • #8
                Yes. I wouldn't make 50 a rigid boundary: there is neither theoretical nor extensive simulation-based research, nor even expert consensus, on the minimum number of clusters needed for vce(cluster). I think nearly everyone would agree that 50 is sufficient. Some people would be happy with 30 or 20. Below that, fewer people find it acceptable.



                • #9
                  Dear Clyde, thank you very much - this is very good to hear!

