Why Mundlak and fixed-effects regression coefficients are not exactly the same

Junaid Ahmed

Join Date: Apr 2019

Posts: 39
#1


24 Sep 2019, 00:03

Dear Statalist,

I have an issue comparing the fixed-effects model with the Mundlak model, which adds the means of the time-varying variables as additional regressors.
Here are the results. I am wondering why the coefficients from the fixed-effects model and the random-effects (Mundlak) model are slightly different. Might it be due to missing data?

Thanks and regards,

Code:

xtreg lremit $xlist0 i.year, vce(cluster pairid) fe
note: comlang_off omitted because of collinearity
note: colony omitted because of collinearity
note: contig omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =       1102
Group variable: pairid                          Number of groups  =        271

R-sq:                                           Obs per group:
     within  = 0.1197                                         min =          1
     between = 0.5477                                         avg =        4.1
     overall = 0.5723                                         max =          7

                                                F(10,270)         =       5.71
corr(u_i, Xb)  = 0.0948                         Prob > F          =     0.0000

                               (Std. Err. adjusted for 271 clusters in pairid)
------------------------------------------------------------------------------
             |               Robust
      lremit |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lgdpc |   .4199601   .1413403     2.97   0.003      .141691    .6982293
   lgdpc_hos |   .2482984   .3412649     0.73   0.467    -.4235802    .9201771
      lcost2 |  -.1912492   .0978837    -1.95   0.052    -.3839617    .0014632
     lmig_st |   .3740764   .1592843     2.35   0.020     .0604792    .6876735
 comlang_off |          0  (omitted)
      colony |          0  (omitted)
      contig |          0  (omitted)
             |
        year |
       2012  |   .0043988   .0190377     0.23   0.817    -.0330823      .04188
       2013  |  -.0265595   .0471652    -0.56   0.574    -.1194178    .0662988
       2014  |  -.0332107   .0542274    -0.61   0.541    -.1399731    .0735516
       2015  |   .1402847    .055213     2.54   0.012     .0315819    .2489875
       2016  |   .0661839   .0598106     1.11   0.269    -.0515705    .1839384
       2017  |   .0920051   .0592864     1.55   0.122    -.0247174    .2087276
             |
       _cons |  -6.884898   5.217741    -1.32   0.188    -17.15753    3.387733
-------------+----------------------------------------------------------------
     sigma_u |  1.2156288
     sigma_e |  .34090766
         rho |  .92708901   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Mundlak model results:

Code:

xtreg lremit $xlist0 $vlist0 i.year, vce(cluster pairid)

Random-effects GLS regression                   Number of obs     =       1102
Group variable: pairid                          Number of groups  =        271

R-sq:                                           Obs per group:
     within  = 0.1197                                         min =          1
     between = 0.6579                                         avg =        4.1
     overall = 0.6785                                         max =          7

                                                Wald chi2(17)     =     847.77
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                 (Std. Err. adjusted for 271 clusters in pairid)
--------------------------------------------------------------------------------
               |               Robust
        lremit |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         lgdpc |   .4439019   .1406352     3.16   0.002     .1682619    .7195418
     lgdpc_hos |   .2524156   .3360553     0.75   0.453    -.4062407    .9110718
        lcost2 |  -.2028169   .0977656    -2.07   0.038    -.3944339   -.0111998
       lmig_st |   .3689689   .1583542     2.33   0.020     .0586004    .6793375
   comlang_off |   .0681932   .1503237     0.45   0.650    -.2264358    .3628223
        colony |   -.135961   .1806898    -0.75   0.452    -.4901066    .2181845
        contig |   .0056654   .3514274     0.02   0.987    -.6831196    .6944504
    lgdpc_mean |  -.1943218   .1454022    -1.34   0.181    -.4793049    .0906614
lgdpc_hos_mean |  -.0928574     .34678    -0.27   0.789    -.7725336    .5868189
   lcost2_mean |  -.3658746   .2326023    -1.57   0.116    -.8217667    .0900174
  lmig_st_mean |    .399029   .1679837     2.38   0.018     .0697869     .728271
               |
          year |
         2012  |   .0038119   .0192994     0.20   0.843    -.0340142    .0416381
         2013  |    -.02782   .0471031    -0.59   0.555    -.1201403    .0645004
         2014  |  -.0348137   .0541059    -0.64   0.520    -.1408593    .0712318
         2015  |   .1384236   .0552425     2.51   0.012     .0301503     .246697
         2016  |   .0623224   .0595451     1.05   0.295    -.0543839    .1790286
         2017  |    .089855   .0592783     1.52   0.130    -.0263284    .2060383
               |
         _cons |  -7.467168   .8463628    -8.82   0.000    -9.126009   -5.808328
---------------+----------------------------------------------------------------
       sigma_u |  1.0303308
       sigma_e |  .34090766
           rho |  .90132614   (fraction of variance due to u_i)
--------------------------------------------------------------------------------
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17724
#2

24 Sep 2019, 00:34

Junaid:
the differences arise because you estimated two different models.
Missing values cannot be the reason, as you have the same number of observations (1102) in both regression models.
As an aside, please note that the easiest way to share what you typed and what Stata gave you back (as you laudably did) is via CODE delimiters:

Code:

#toggle available from the Advanced editor bar

Kind regards,
Carlo
(Stata 19.0)
Junaid Ahmed

Join Date: Apr 2019

Posts: 39
#3

24 Sep 2019, 01:22

Dear Carlo Lazzaro,

Thanks for this. However, in principle, the coefficients must be the same for the Mundlak and fixed-effects models. In the gravity model we are always interested in the effect of time-invariant variables such as distance and common language, so alongside FE we also use the Mundlak model. Please guide.

Last edited by Junaid Ahmed; 24 Sep 2019, 01:24.
Eric de Souza

Join Date: Mar 2014

Posts: 587
#4

24 Sep 2019, 01:34

Please provide the code for how you calculated the means.
Junaid Ahmed

Join Date: Apr 2019

Posts: 39
#5

24 Sep 2019, 01:49

Dear Eric de Souza,

bysort home: egen mean_lgdpc=mean(lgdpc)
bysort host: egen mean_gdpc_lhos=mean(lgdpc_hos)
bysort host home: egen mean_lcost2=mean(lcost2)
bysort host home: egen mean_lbtrade=mean(lbtrade)
bysort host home: egen mean_lmig_st=mean(lmig_st)

lgdpc is a home-specific variable (home is the recipient of remittances), lgdpc_hos is a host-specific variable (host is the sending country), and the other variables (remittance cost, trade, and migrant stock) are bilateral.
Eric de Souza

Join Date: Mar 2014

Posts: 587
#6

24 Sep 2019, 02:15

I think that the difference arises from the way you calculate the means. The first two egen commands use a different bysort grouping from the last three. I see no other reason.
I have always used the egen command for the Mundlak or CRE model with non-gravity data in the following way:
. egen experbar = mean(exper), by(nr)
. egen unionbar = mean(union), by(nr)
. egen marriedbar = mean(married), by(nr)

I will be away the rest of the day.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2188
#7

24 Sep 2019, 02:53

It’s almost certainly due to missing data. I explain this in my 2019 Journal of Econometrics paper. You should only use the complete cases when generating the time averages. Also, the time averages of the year dummies must be included in the unbalanced case.
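
A sketch of the algebra behind this point, in notation assumed here for illustration (the symbols $s_{it}$, $T_i$, $d_t$, $z_i$ and the Greek coefficients do not appear in the thread). Here $x_{it}$ stands for the time-varying regressors, $d_t$ for the year dummies, and $z_i$ for the time-invariant regressors such as comlang_off, colony and contig:

```latex
% s_{it} = 1 if y_{it} and every regressor are observed for pair i in year t
\begin{align*}
T_i &= \sum_{t=1}^{T} s_{it}, &
\bar{x}_i &= \frac{1}{T_i}\sum_{t=1}^{T} s_{it}\, x_{it}, &
\bar{d}_i &= \frac{1}{T_i}\sum_{t=1}^{T} s_{it}\, d_t \\[4pt]
% Mundlak (CRE) equation, estimated on the observations with s_{it} = 1
y_{it} &= x_{it}\beta + d_t\theta + z_i\gamma
          + \bar{x}_i\xi + \bar{d}_i\lambda + a_i + u_{it}
\end{align*}
```

With $\bar{x}_i$ and $\bar{d}_i$ built this way, the random-effects estimates of $\beta$ and $\theta$ should coincide with the fixed-effects estimates on the same sample. If $\bar{x}_i$ is instead averaged over every period in which $x_{it}$ alone is observed, it differs from the complete-case mean whenever another variable is missing in some period, and the match breaks even though both regressions report the same number of observations. In a balanced panel $\bar{d}_i$ is identical for every $i$ and is absorbed by the constant, which is why it only matters in the unbalanced case. The output in post #1 shows an unbalanced panel (1 to 7 observations per pair) with no year-dummy means among the regressors.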
Junaid Ahmed

Join Date: Apr 2019

Posts: 39
#8

24 Sep 2019, 03:15

Dear Eric de Souza,

The first two are country-specific variables, for home and host respectively, and the last three are bilateral variables, which is why they are sorted by both host and home.
Junaid Ahmed

Join Date: Apr 2019

Posts: 39
#9

24 Sep 2019, 03:21

Dear Jeff Wooldridge,

So you mean I should drop the countries with missing information? Do you not think that will cause problems? Also, could you please elaborate on "You should only use the complete cases when generating the time averages. Also, the time averages of the year dummies must be included in the unbalanced case"? What I understood is that we could also use the averages of the year dummies when working with unbalanced data. Could you please provide a Stata command for generating the time averages of the year dummies?

Thanks
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2188
#10

24 Sep 2019, 05:23

No, you don't drop the country. You drop any observation (indexed by country and time) where any of the variables -- dependent variable or explanatory variable -- is missing. This is what fixed effects -- or any Stata command -- does. In other words, you must use the same time periods when you compute the time averages, even if you have different time periods available for some variables. It helps to start by creating a selection indicator that is one if and only if you have a complete set of cases. Then, use it in egen.

Code:

xtset id year
gen s = (y != .) & (x1 != .) & ... & (xK != .)
egen x1bar = mean(x1) if s, by(id)
egen x2bar = mean(x2) if s, by(id)
egen xKbar = mean(xK) if s, by(id)
egen year2bar = mean(year2) if s, by(id)
egen yearTbar = mean(yearT) if s, by(id)
xtreg y x1 ... xK year2 ... yearT x1bar ... xKbar year2bar ... yearTbar, re vce(cluster id)
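
The template above can be sketched for the variables shown in post #1. This is only an illustration under stated assumptions: the regressor list is inferred from the posted output (the contents of $xlist0 are not shown in the thread), and the names s, yr* and the cc_ prefix are hypothetical and assumed not to clash with existing variables in the dataset.

```stata
* Sketch only: variable names come from the output in post #1;
* s, yr* and the cc_ prefix are hypothetical names introduced here.
xtset pairid year

* 1 if every variable in the model is observed for this pair-year
generate byte s = !missing(lremit, lgdpc, lgdpc_hos, lcost2, lmig_st, ///
    comlang_off, colony, contig, year)

* One dummy per year present in the complete-case sample; drop the base year
tabulate year if s, generate(yr)
drop yr1

* Pair-level means over complete cases only, for regressors and year dummies
foreach v of varlist lgdpc lgdpc_hos lcost2 lmig_st yr* {
    egen double cc_`v' = mean(`v') if s, by(pairid)
}

* Mundlak (CRE) regression on the complete-case sample
xtreg lremit lgdpc lgdpc_hos lcost2 lmig_st comlang_off colony contig ///
    yr* cc_* if s, re vce(cluster pairid)

* Fixed effects on the same sample, for comparison
xtreg lremit lgdpc lgdpc_hos lcost2 lmig_st yr* if s, fe vce(cluster pairid)
```

If the means are built this way, the coefficients on lgdpc, lgdpc_hos, lcost2, lmig_st and the year dummies should agree across the two regressions. Generating the year dummies by hand, rather than using i.year, is what makes it possible to average them within each pair.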
Eric de Souza

Join Date: Mar 2014

Posts: 587
#11

24 Sep 2019, 07:39

@Jeff Wooldridge.
I had completely forgotten about your paper. I do have a copy of the 2010 version.
Junaid Ahmed

Join Date: Apr 2019

Posts: 39
#12

25 Sep 2019, 01:09

Jeff Wooldridge, thanks a lot, it works perfectly.
Junaid
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#13

25 Sep 2019, 08:26

Junaid - you might look at xthybrid and the Mundlak estimator that do this automatically.