
  • xtprobit and margins command

    Hi,

    I have an unbalanced panel dataset (N=2976, T=13), using survey responses.
    My dependent variable is the household's ability to save (saving=1 if able to save, 0 otherwise).
    hhid is the Household's unique identifier, and the data is yearly.

    I am computing the AMEs for my model in Stata.
    I am struggling to understand the difference between -margins, dydx(*)- and -margins, dydx(*) predict(pu0)-.
    I see that the latter assumes "Pr(saving=1 | u_i=0)" in Stata but I am unsure of what this implies and which margins method I should be using.

    I would greatly appreciate it if someone could help me understand the difference. Many thanks!

    Code:
    . xtprobit saving age, nolog
    
    Random-effects probit regression                Number of obs     =     12,951
    Group variable: hhid                            Number of groups  =      2,930
    
    Random effects u_i ~ Gaussian                   Obs per group:
                                                                  min =          1
                                                                  avg =        4.4
                                                                  max =         13
    
    Integration method: mvaghermite                 Integration pts.  =         12
    
                                                    Wald chi2(1)      =       0.07
    Log likelihood  = -6746.3674                    Prob > chi2       =     0.7969
    
    ------------------------------------------------------------------------------
          saving |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0005127    .001992     0.26   0.797    -.0033914    .0044169
           _cons |  -.5697777    .111846    -5.09   0.000    -.7889918   -.3505635
    -------------+----------------------------------------------------------------
        /lnsig2u |   .7562628   .0613049                      .6361074    .8764183
    -------------+----------------------------------------------------------------
         sigma_u |   1.459555    .044739                       1.37445    1.549929
             rho |   .6805418    .013328                       .653873    .7060795
    ------------------------------------------------------------------------------
    LR test of rho=0: chibar2(01) = 3740.02                Prob >= chibar2 = 0.000
    
    . margins, dydx(age)
    
    Average marginal effects                        Number of obs     =     12,951
    Model VCE    : OIM
    
    Expression   : Pr(saving=1), predict(pr)
    dy/dx w.r.t. : age
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0001103   .0004288     0.26   0.797    -.0007302    .0009508
    ------------------------------------------------------------------------------
    
    . margins, dydx(age) predict(pu0)
    
    Average marginal effects                        Number of obs     =     12,951
    Model VCE    : OIM
    
    Expression   : Pr(saving=1 | u_i=0), predict(pu0)
    dy/dx w.r.t. : age
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0001767    .000687     0.26   0.797    -.0011697    .0015231
    ------------------------------------------------------------------------------

  • #2
    When you use -margins- after -xtprobit- without specifying a -predict()- option, Stata gives you its default output, which is (the marginal effect of age on) the predicted probability of saving = 1. If you specify -predict(pu0)-, then Stata gives you (the marginal effect of age on) the predicted probability of saving = 1, calculated as if the random effect were always 0.

    Remember that -xtprobit- is a random effects model. One way of thinking about that is that every household in your sample has its own, household-specific intercept (_cons) in the model, and those household-specific intercepts are assumed to be sampled from a normal distribution with mean zero (and variance that is estimated from the data). So pu0 can be thought of, loosely speaking, not as the predicted probability for a particular household, but as the median predicted probability across households with the same value of age.

    Another way of thinking of it is that the default behavior gives you the marginal effect of age on the predicted probability of saving = 1 for each household, taking into account the observed behavior of that household (which influences the random intercept). But if you specify -predict(pu0)- you are getting the marginal effect of age on a predicted probability that is predicted ignoring the actual observed saving behavior of that household and using only the fixed predictors in the model (in your case age) and no other information.
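    To see the two quantities numerically, one can plug the coefficients from the output in #1 into the probit formulas. This is a rough sketch in Python, not Stata's actual computation (-margins- averages over every observation in the estimation sample, whereas a single illustrative age of 45 is used here), and it relies on the standard result that E[Phi(xb + u)] = Phi(xb / sqrt(1 + sigma_u^2)) when u ~ N(0, sigma_u^2):

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Coefficients taken from the xtprobit output in #1.
b_age, b_cons, sigma_u = 0.0005127, -0.5697777, 1.459555
s = math.sqrt(1.0 + sigma_u ** 2)   # scale factor from integrating out u_i

age = 45                            # illustrative value, not the sample mean
xb = b_cons + b_age * age

# pu0: probability and marginal effect with the random effect set to zero.
p_u0  = Phi(xb)
me_u0 = b_age * phi(xb)

# Default predict(pr): probability averaged over u_i ~ N(0, sigma_u^2),
# which for a probit collapses to Phi(xb / s); the ME is scaled down by s.
p_pr  = Phi(xb / s)
me_pr = (b_age / s) * phi(xb / s)

print(p_u0, me_u0)   # me_u0 is close to the .0001767 reported for pu0
print(p_pr, me_pr)   # me_pr is close to the .0001103 reported by default
```

    At this single age the two marginal effects already come out close to the two -margins- results above; the exact reported values differ slightly because -margins- averages over the whole sample rather than evaluating at one age.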



    • #3
      Many thanks for your detailed explanation, Clyde Schechter.

      Originally posted by Clyde Schechter:
      When you use -margins- after -xtprobit- without specifying a -predict()- option, Stata gives you its default output, which is (the marginal effect of age on) the predicted probability of saving = 1. If you specify -predict(pu0)-, then Stata gives you (the marginal effect of age on) the predicted probability of saving = 1, calculated as if the random effect were always 0.
      I see, so does the -predict(pu0)- option add an additional constraint that RE=0? Is there a way to test whether this assumption can be upheld in the dataset?

      Remember that -xtprobit- is a random effects model. One way of thinking about that is that every household in your sample has its own, household-specific intercept (_cons) in the model, and those household-specific intercepts are assumed to be sampled from a normal distribution with mean zero (and variance that is estimated from the data). So pu0 can be thought of, loosely speaking, not as the predicted probability for a particular household, but as the median predicted probability across households with the same value of age.
      Is there an instance where -predict(pu0)- is preferred over the default option? Why would the median predicted probability be useful in analysis?

      Another way of thinking of it is that the default behavior gives you the marginal effect of age on the predicted probability of saving = 1 for each household, taking into account the observed behavior of that household (which influences the random intercept). But if you specify -predict(pu0)- you are getting the marginal effect of age on a predicted probability that is predicted ignoring the actual observed saving behavior of that household and using only the fixed predictors in the model (in your case age) and no other information.
      I think it might be better to use the actual observed saving behaviour and not use -predict(pu0)-, as the data are available (I hardly have any missing values for -saving-), so would it be better to use this rather than the median?

      Thanks



      • #4
        I see, so does the -predict(pu0)- option add an additional constraint that RE=0? Is there a way to test whether this assumption can be upheld in the dataset?
        No, I wouldn't think of it that way. The way to think of it is that the pu0 option ignores the RE. It is not necessary to test that "constraint" separately: you already have that test in your regression output where it says:
        LR test of rho=0: chibar2(01) = 3740.02 Prob >= chibar2 = 0.000
        That is the test of all RE = 0, and it is resoundingly rejected.
        Is there an instance where -predict(pu0)- is preferred over the default option? Why would the median predicted probability be useful in analysis?
        Good question! Suppose you want to apply your model prospectively to people who are not in your data set already. And suppose that all you are given is their age (or the other predictors) and you want to predict their probability of saving. Then you don't know what the value of the random effect is: you can only estimate that once you actually have data on their saving behavior--which means that you are not predicting, you are retro-dicting, to coin a word. So pu0 would be based on the only information available to you if you wanted to apply this model to "newcomers."
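        To make the "newcomer" point concrete, here is a small sketch (Python, using the coefficients from the output in #1; the ages are hypothetical) of the two predictions that are available when a household's random effect is unknown:

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Coefficients from the xtprobit output in #1.
b_age, b_cons, sigma_u = 0.0005127, -0.5697777, 1.459555

def predict_newcomer_pu0(age):
    # For a household not in the sample, u_i is unknown, so the pu0
    # prediction conditions on the median household (u_i = 0).
    return Phi(b_cons + b_age * age)

def predict_newcomer_pa(age):
    # Alternatively, average over the estimated distribution of u_i
    # (the population-averaged probability).
    return Phi((b_cons + b_age * age) / math.sqrt(1.0 + sigma_u ** 2))

for age in (30, 50, 70):   # hypothetical newcomers with known age only
    print(age, round(predict_newcomer_pu0(age), 3),
               round(predict_newcomer_pa(age), 3))
```

        Either way, no observed saving history enters the prediction; that is exactly the information set available for a newcomer.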

        I think it might be better to use the actual observed saving behaviour and not use -predict(pu0)-, as the data is available (I hardly have any missing values for saving-, so would it be better to use this rather than the median?)
        Again, it boils down to what you are trying to do. If you are trying to explain the effects of age in your sample, then you want to include the random effects. If you are trying to predict the average effect of age on saving among people not in your sample, then -predict(pu0)- is the fullest information available to you for that purpose.



        • #5
          Originally posted by Clyde Schechter:
          No, I wouldn't think of it that way. The way to think of it is that the pu0 option ignores the RE. It is not necessary to test that "constraint" separately: you already have that test in your regression output where it says:

          That is the test of all RE = 0, and it is resoundingly rejected.
          Thank you for directing me to this. I see that the null hypothesis is that RE=0, and the significant p-value means that it is decisively rejected.
          So, if the RE are not all equal to 0, would pu0 yield incorrect or biased estimates, given that it assumes RE=0, which has been rejected?

          Good question! Suppose you want to apply your model prospectively to people who are not in your data set already. And suppose that all you are given is their age (or the other predictors) and you want to predict their probability of saving. Then you don't know what the value of the random effect is: you can only estimate that once you actually have data on their saving behavior--which means that you are not predicting, you are retro-dicting, to coin a word. So pu0 would be based on the only information available to you if you wanted to apply this model to "newcomers."
          Again, it boils down to what you are trying to do. If you are trying to explain the effects of age in your sample, then you want to include the random effects. If you are trying to predict the average effect of age on saving among people not in your sample, then -predict(pu0)- is the fullest information available to you for that purpose.
          This makes a lot more sense now, thank you very much for explaining it. As I would like to explain the effects of age in my sample (I am trying to establish a causal relationship), and am not predicting for people not in the sample (newcomers), then I should use the default and not pu0.

          Thanks



          • #6
            I also have an additional question - what is the difference between using -xtprobit, re- and -probit, vce(cluster id)-, i.e. a pooled probit model with standard errors clustered by id?
            Can the latter be used with a panel dataset?
            Many thanks



            • #7
              -xtprobit, re- explicitly models random intercepts at the hhid level. While the coefficients of the predictor variables are the same for all households, each household is allowed its own constant term. This allows for different households to have different "baseline" probabilities of saving--which seems realistic because the available predictor variables are unlikely to be able to completely predict something as complicated as propensity to save. When you run -probit, vce(cluster id)-, you get a one-size-fits-all model: households are assumed to all have the same baseline propensity to save, and differences between households in savings probability are accounted for only to the extent that they are exactly determined by the predictor variables in your regression. The -vce(cluster id)- part would adjust the standard errors to account for non-independence of observations due to clustering in households, but it does not otherwise change the results of just -probit-.

              The pooled model is appropriate when you believe that all u_i are 0 (or, if not really zero, at least small enough to ignore for practical purposes). But not only have you rejected the hypothesis that all u_i are 0 with a strongly statistically significant result, the estimated sigma_u in your -xtprobit- output is about 1.46, a variance component of about 2.13, which is actually very large on the scale of a probit model (where, by definition, the lowest-level variance component is 1.0), and it even dwarfs the largest possible contribution of age (which would be 0.0005127*120 = about 0.06, assuming you had someone in your dataset as old as 120, which I doubt). So, in fact, in your model, the random effects u_i are the biggest piece of the variation in sight. I think a pooled probit model would be a really terrible choice here.
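              As a sanity check on those magnitudes, the reported rho can be reproduced directly from sigma_u, and the usual back-of-the-envelope attenuation of pooled (population-averaged) probit coefficients relative to the RE (subject-specific) coefficients can be quantified. This is a sketch using the coefficients from #1, with the standard 1/sqrt(1 + sigma_u^2) approximation:

```python
import math

sigma_u = 1.459555              # household-level SD from the xtprobit output
var_u = sigma_u ** 2            # variance component, about 2.13

# rho is the share of latent variance at the household level; the
# idiosyncratic probit error variance is fixed at 1 by construction.
rho = var_u / (var_u + 1.0)
print(round(rho, 4))            # reproduces the reported rho of .6805

# A pooled probit targets population-averaged effects, roughly the
# subject-specific (RE) coefficients shrunk by sqrt(1 + sigma_u^2).
b_age_re = 0.0005127
attenuation = 1.0 / math.sqrt(1.0 + var_u)
print(round(attenuation, 3), b_age_re * attenuation)
```

              So with this much household-level variance, a pooled probit would also report a substantially smaller age coefficient, because it answers a different (population-averaged) question.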



              • #8
                Thank you very much for the detailed explanation, Clyde Schechter - it is now clear that I should not use a pooled probit.
