  • Why are the Mundlak and fixed-effects regression coefficients not exactly the same?

    Dear Statalist,

    I have an issue comparing the fixed-effects model with the Mundlak model, which adds the means of the time-varying variables as additional regressors.
    Here are the results. I am wondering why the fixed-effects and Mundlak (random-effects) coefficients are slightly different. Might it be due to missing data?

    Thanks and regards,

    Code:
    xtreg lremit $xlist0 i.year, vce(cluster pairid) fe
    note: comlang_off omitted because of collinearity
    note: colony omitted because of collinearity
    note: contig omitted because of collinearity

    Fixed-effects (within) regression               Number of obs      =      1102
    Group variable: pairid                          Number of groups   =       271

    R-sq:  within  = 0.1197                         Obs per group: min =         1
           between = 0.5477                                        avg =       4.1
           overall = 0.5723                                        max =         7

                                                    F(10,270)          =      5.71
    corr(u_i, Xb)  = 0.0948                         Prob > F           =    0.0000

                                   (Std. Err. adjusted for 271 clusters in pairid)
    ------------------------------------------------------------------------------
                 |               Robust
          lremit |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           lgdpc |   .4199601   .1413403     2.97   0.003      .141691    .6982293
       lgdpc_hos |   .2482984   .3412649     0.73   0.467    -.4235802    .9201771
          lcost2 |  -.1912492   .0978837    -1.95   0.052    -.3839617    .0014632
         lmig_st |   .3740764   .1592843     2.35   0.020     .0604792    .6876735
     comlang_off |          0  (omitted)
          colony |          0  (omitted)
          contig |          0  (omitted)
                 |
            year |
           2012  |   .0043988   .0190377     0.23   0.817    -.0330823      .04188
           2013  |  -.0265595   .0471652    -0.56   0.574    -.1194178    .0662988
           2014  |  -.0332107   .0542274    -0.61   0.541    -.1399731    .0735516
           2015  |   .1402847    .055213     2.54   0.012     .0315819    .2489875
           2016  |   .0661839   .0598106     1.11   0.269    -.0515705    .1839384
           2017  |   .0920051   .0592864     1.55   0.122    -.0247174    .2087276
                 |
           _cons |  -6.884898   5.217741    -1.32   0.188    -17.15753    3.387733
    -------------+----------------------------------------------------------------
         sigma_u |  1.2156288
         sigma_e |  .34090766
             rho |  .92708901   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    Mundlak Effect results

    Code:
    xtreg lremit $xlist0 $vlist0 i.year, vce(cluster pairid)

    Random-effects GLS regression                   Number of obs      =      1102
    Group variable: pairid                          Number of groups   =       271

    R-sq:  within  = 0.1197                         Obs per group: min =         1
           between = 0.6579                                        avg =       4.1
           overall = 0.6785                                        max =         7

                                                    Wald chi2(17)      =    847.77
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                                     (Std. Err. adjusted for 271 clusters in pairid)
    --------------------------------------------------------------------------------
                   |               Robust
            lremit |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
             lgdpc |   .4439019   .1406352     3.16   0.002     .1682619    .7195418
         lgdpc_hos |   .2524156   .3360553     0.75   0.453    -.4062407    .9110718
            lcost2 |  -.2028169   .0977656    -2.07   0.038    -.3944339   -.0111998
           lmig_st |   .3689689   .1583542     2.33   0.020     .0586004    .6793375
       comlang_off |   .0681932   .1503237     0.45   0.650    -.2264358    .3628223
            colony |   -.135961   .1806898    -0.75   0.452    -.4901066    .2181845
            contig |   .0056654   .3514274     0.02   0.987    -.6831196    .6944504
        lgdpc_mean |  -.1943218   .1454022    -1.34   0.181    -.4793049    .0906614
    lgdpc_hos_mean |  -.0928574     .34678    -0.27   0.789    -.7725336    .5868189
       lcost2_mean |  -.3658746   .2326023    -1.57   0.116    -.8217667    .0900174
      lmig_st_mean |    .399029   .1679837     2.38   0.018     .0697869     .728271
                   |
              year |
             2012  |   .0038119   .0192994     0.20   0.843    -.0340142    .0416381
             2013  |    -.02782   .0471031    -0.59   0.555    -.1201403    .0645004
             2014  |  -.0348137   .0541059    -0.64   0.520    -.1408593    .0712318
             2015  |   .1384236   .0552425     2.51   0.012     .0301503     .246697
             2016  |   .0623224   .0595451     1.05   0.295    -.0543839    .1790286
             2017  |    .089855   .0592783     1.52   0.130    -.0263284    .2060383
                   |
             _cons |  -7.467168   .8463628    -8.82   0.000    -9.126009   -5.808328
    ---------------+----------------------------------------------------------------
           sigma_u |  1.0303308
           sigma_e |  .34090766
               rho |  .90132614   (fraction of variance due to u_i)
    --------------------------------------------------------------------------------

  • #2
    Junaid:
    the differences arose because you estimated two different models.
    Missing values cannot be the reason, as you have the same number of observations (1102) in both regression models.
    As an aside, please note that the easiest way to share what you typed and what Stata gave you back (as you laudably did) is via CODE delimiters:
    Code:
    #toggle available from the Advanced editor bar
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Dear Carlo Lazzaro,

      Thank you. However, in principle, the coefficients on the time-varying regressors should be the same in the Mundlak and fixed-effects models. In gravity models we are usually interested in the effects of time-invariant variables such as distance and common language, so alongside FE we also estimate the Mundlak model. Please advise.
      Last edited by Junaid Ahmed; 24 Sep 2019, 01:24.


      • #4
        Please provide the code for how you calculated the means.


        • #5
          Dear Eric de Souza

          bysort home: egen mean_lgdpc=mean(lgdpc)
          bysort host: egen mean_gdpc_lhos=mean(lgdpc_hos)
          bysort host home: egen mean_lcost2=mean(lcost2)
          bysort host home: egen mean_lbtrade=mean(lbtrade)
          bysort host home: egen mean_lmig_st=mean(lmig_st)

          lgdpc is a home-specific variable (home being the recipient of remittances), lgdpc_hos is a host-specific variable (host being the sending country), and the other variables (remittance cost, trade, and migrant stock) are bilateral.


          • #6
            I think that the difference is arising from the way you calculate the means. The first two egens have a different bysort compared with the last three. I see no other reason.
            I have always used the egen command for the Mundlak or CRE model with non-gravity data in the following way:
            . egen experbar = mean(exper), by(nr)
            . egen unionbar = mean(union), by(nr)
            . egen marriedbar = mean(married), by(nr)

            I will be away the rest of the day.


            • #7
              It’s almost certainly due to missing data. I explain this in my 2019 Journal of Econometrics paper. You should only use the complete cases when generating the time averages. Also, the time averages of the year dummies must be included in the unbalanced case.


              • #8
                Dear Eric de Souza,

                The first two are country-specific variables, for home and host respectively, and the last three are bilateral variables; that is why they are sorted by both host and home.
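
                One way to see whether the grouping matters is to compute the same mean both ways and compare the results. This is only a minimal sketch: it assumes pairid identifies the host-home pair (as in the xtreg output in post #1), and the names lgdpc_home_mean and lgdpc_pair_mean are made up for illustration.

                ```stata
                * Mean of lgdpc over every row of the home country (all hosts pooled)
                egen lgdpc_home_mean = mean(lgdpc), by(home)
                * Mean of lgdpc over the years observed for each host-home pair
                egen lgdpc_pair_mean = mean(lgdpc), by(pairid)
                * Report how often the two versions agree or differ
                compare lgdpc_home_mean lgdpc_pair_mean
                ```

                In an unbalanced panel the two generally differ, because pairs observed in different years contribute different years to the pair-level average.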


                • #9
                  Dear Jeff Wooldridge,


                  So you mean dropping the countries with missing information? Do you not think that will cause problems? Also, could you please elaborate on "You should only use the complete cases when generating the time averages. Also, the time averages of the year dummies must be included in the unbalanced case"? What I understood is that we could also use the averages of the year dummies when working with unbalanced data. Could you please provide a Stata command for generating the time averages of the year dummies?

                  Thanks


                  • #10
                    No, you don't drop the country. You drop any observation (indexed by country and time) where any of the variables -- dependent variable or explanatory variable -- is missing. This is what fixed effects -- or any Stata command -- does. In other words, you must use the same time periods when you compute the time averages, even if you have different time periods available for some variables. It helps to start by creating a selection indicator that is one if and only if you have a complete set of cases. Then, use it in egen.

                    Code:
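                    xtset id year
                    * s = 1 only when y and every explanatory variable are observed
                    gen s = (y != .) & (x1 != .) & ... & (xK != .)
                    * time averages of the regressors, complete cases only
                    egen x1bar = mean(x1) if s, by(id)
                    egen x2bar = mean(x2) if s, by(id)
                    egen xKbar = mean(xK) if s, by(id)
                    * time averages of the year dummies (needed when the panel is unbalanced)
                    egen year2bar = mean(year2) if s, by(id)
                    egen yearTbar = mean(yearT) if s, by(id)
                    xtreg y x1 ... xK year2 ... yearT x1bar ... xKbar year2bar ... yearTbar, re vce(cluster id)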
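
                    Applied to the variables in post #1, the recipe above might look like the following sketch. It rests on several assumptions that are not stated in the thread: that pairid is the panel identifier, that $xlist0 holds exactly the regressors shown in the output, that year takes seven values (so that tabulate creates yr1 to yr7, with yr1 as the base year), and that no other variable in the data ends in _bar. The names s, yr1 to yr7, and the _bar suffix are hypothetical.

                    ```stata
                    * Assumed panel structure: host-home pair by year
                    xtset pairid year

                    * Selection indicator: 1 only if the outcome and all regressors are observed
                    gen byte s = !missing(lremit, lgdpc, lgdpc_hos, lcost2, lmig_st, ///
                        comlang_off, colony, contig)

                    * Year dummies yr1, yr2, ... (hypothetical names); yr1 is the base year
                    tabulate year, generate(yr)

                    * Time averages by pair, computed on complete cases only
                    foreach v of varlist lgdpc lgdpc_hos lcost2 lmig_st yr2-yr7 {
                        egen `v'_bar = mean(`v') if s, by(pairid)
                    }

                    * Fixed effects on the complete-case sample
                    xtreg lremit lgdpc lgdpc_hos lcost2 lmig_st yr2-yr7 if s, ///
                        fe vce(cluster pairid)

                    * Mundlak / CRE: regressors, time-invariant variables, and the averages
                    xtreg lremit lgdpc lgdpc_hos lcost2 lmig_st comlang_off colony contig ///
                        yr2-yr7 *_bar if s, re vce(cluster pairid)
                    ```

                    If the averages are built this way, the coefficients on the time-varying regressors in the two xtreg outputs can be compared directly.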


                    • #11
                      @Jeff Wooldridge.
                      I had completely forgotten about your paper. I do have a copy of the 2010 version.


                      • #12
                        Jeff Wooldridge, thanks a lot, it works perfectly.
                        Junaid


                        • #13
                          Junaid - you might look at xthybrid and the Mundlak estimator that do this automatically.
