I have some questions regarding a PPML estimation of a gravity model of trade.
The dataset contains nearly 290,000 bilateral observations over 50 years and is provided by Rose (2005):
http://faculty.haas.berkeley.edu/arose/RecRes.htm
Unfortunately, I only have the log values for most of the variables. Thus, the advantage of “ppml” regarding the treatment of zero values might disappear. However, I am very concerned about heteroskedasticity in the data.
Reading the paper by Santos Silva and Tenreyro (2006) (The Log of Gravity: http://personal.lse.ac.uk/tenreyro/LGW.html) was an eye-opener for me. As an undergraduate, I have to admit that the implementation in Stata seems a little tricky to me. I would be very interested to see whether the results of the paper change when I apply the Poisson pseudo-maximum likelihood estimator. I use Stata 13.
The description of the variables:
. * Summary of the dataset
. sum
Variable | Obs | Mean | Std. Dev. | Min | Max
cty1 | 219573 | 292.6153 | 186.4372 | 111 | 964
cty2 | 219573 | 565.7396 | 220.612 | 112 | 968
year | 219573 | 1979.758 | 11.98733 | 1948 | 1997
ctyname1 | 0 | | | |
ctyname2 | 0 | | | |
pairid | 219573 | 11150.04 | 8554.216 | 765 | 32585
ltrade | 219573 | 14.64697 | 3.35878 | -11.4853 | 25.31005
ltrade1to2 | 192720 | 14.80027 | 3.36609 | -16.47211 | 25.19833
ltrade2to1 | 182644 | 14.68049 | 3.482428 | -13.54052 | 25.41054
ldist | 219573 | 8.167161 | .8075762 | 3.782556 | 9.421514
lrgdp | 219573 | 47.85111 | 2.665963 | 35.3876 | 58.01698
lrgdppc | 219573 | 16.03824 | 1.449853 | 10.1211 | 20.89841
regional | 219573 | .012292 | .1101862 | 0 | 1
border | 219573 | .0308371 | .1728766 | 0 | 1
comlang | 219573 | .2266627 | .4186735 | 0 | 1
comcol | 219573 | .1015653 | .3020765 | 0 | 1
comctry | 219573 | .0003051 | .0174656 | 0 | 1
colony | 219573 | .0209953 | .1433687 | 0 | 1
curcol | 219573 | .0020494 | .0452243 | 0 | 1
custrict | 219573 | .0144326 | .1192658 | 0 | 1
landl | 219573 | .2388955 | .4596647 | 0 | 2
island | 219573 | .3444595 | .5413812 | 0 | 2
lareap | 219573 | 24.21759 | 3.289929 | 9.638662 | 32.19601
amount | 6775 | 1050.271 | 2834.907 | 4 | 29871
defby1 | 6775 | .0727675 | .2597737 | 0 | 1
paris | 219573 | .0075009 | .0862826 | 0 | 1
imf | 219573 | .2911332 | .5029937 | 0 | 2
First, I transformed the variables back with exp().
Second, I re-scaled them because of the warnings from the first regression:
gen trade_0 = exp(ltrade)/(1e12)
gen trade1to2_0 = exp(ltrade1to2)/(1e12)
gen trade2to1_0 = exp(ltrade2to1)/(1e12)
gen dist_0 = exp(ldist)/(1e12)
gen rgdp_0 = exp(lrgdp)/(1e12)
gen rgdppc_0 = exp(lrgdppc)/(1e12)
gen amount_0 = exp(amount)/(1e12)
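A minimal sketch of the same back-transformation written as a loop, assuming the variable names from the summary table. The `_lvl` suffix is a hypothetical naming choice, and storing the results as `double` is an assumption meant to limit rounding for values as large as exp(58). According to the summary, amount is not logged (min 4, max 29871), so the sketch leaves it as it is instead of applying exp() to it.

```stata
* Sketch only: back-transform the logged variables without re-scaling.
* The _lvl suffix is hypothetical; the l-prefixed names come from the summary table.
foreach v in trade trade1to2 trade2to1 dist rgdp rgdppc {
    gen double `v'_lvl = exp(l`v')
}

* amount is already in levels, so no exp() is applied to it.
* Check the ranges of the new variables before using them.
summarize *_lvl amount
```

Looking at the output of `summarize` should show quickly whether any of the new variables contains unexpected missing values or extreme magnitudes.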
Third, I generated the dummy variables by hand and dropped the xi: prefix after working through this thread: http://www.statalist.org/forums/foru...rgence-problem
gen island_0 = (island == 0)
gen island_1 = (island == 1)
gen island_2 = (island == 2)
gen landl_0 = (landl == 0)
gen landl_1 = (landl == 1)
gen landl_2 = (landl == 2)
gen imf_none = (imf == 0)
gen imf_one = (imf == 1)
gen imf_both = (imf == 2)
Using:
ppml trade_0 paris custrict dist_0 comlang border regional ///
rgdp_0 rgdppc_0 comcol curcol colony comctry island_0 island_1 ///
island_2 landl_0 landl_1 landl_2 imf_none imf_one imf_both, cluster(pairid)
I get these results:
(1) Dependent variable: trade_0
Variable | Coefficient | (test statistic)
paris | 0.957*** | (12.67)
custrict | 0.133 | (1.41)
dist_0 | -98447200.1*** | (-17.40)
comlang | -0.0975* | (-2.30)
border | 1.387*** | (22.54)
regional | 1.656*** | (25.14)
rgdp_0 | 4.84e-13*** | (20.65)
rgdppc_0 | 8608.7*** | (33.25)
comcol | -3.257*** | (-10.38)
curcol | 0.237** | (2.63)
colony | 0.991*** | (17.76)
comctry | -1.445*** | (-7.24)
island_0 | -0.00664 | (-0.12)
island_2 | 0.0830 | (0.90)
landl_0 | 0.918*** | (22.49)
landl_2 | -0.384*** | (-4.32)
imf_none | 1.419*** | (13.43)
imf_one | 0.772*** | (7.29)
_cons | -11.34*** | (-91.22)
N | 219558 |
I am worried about the strange estimate for dist_0.
Without the re-scaling, using:
ppml trade paris amount custrict dist comlang border regional ///
rgdp rgdppc comcol curcol colony comctry island_0 island_1 ///
island_2 landl_0 landl_1 landl_2 imf_none imf_one imf_both, cluster(pairid)
I get these results (which look reasonable to me):
(1) Dependent variable: trade
Variable | Coefficient | (test statistic)
paris | 1.246*** | (12.85)
amount | 0.0000560*** | (7.14)
custrict | 0.693*** | (4.84)
dist | -0.0000687 | (-1.84)
comlang | -0.0886 | (-0.45)
border | 1.514*** | (7.70)
regional | 1.593*** | (4.36)
rgdp | 1.30e-24*** | (11.94)
rgdppc | 1.33e-08*** | (7.23)
comcol | -1.363** | (-2.74)
colony | 1.240*** | (4.08)
island_0 | -0.0737 | (-0.35)
island_2 | 0.994 | (1.28)
landl_0 | 2.802*** | (12.87)
landl_1 | 1.788*** | (7.92)
imf_one | -0.0568 | (-1.10)
imf_both | -0.200 | (-1.17)
_cons | 14.41*** | (40.54)
N | 6760 |
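As a rough cross-check on the two sets of results, here is the arithmetic of re-scaling in an exponential-mean model, assuming ppml fits E[y|x] = exp(b0 + b x). Dividing a regressor by c multiplies its coefficient by c, and dividing the dependent variable by c subtracts ln(c) from the constant. The two regressions use different samples (N = 219558 versus 6760) and only the second one includes amount, so only the orders of magnitude can be compared.

```latex
\begin{align*}
\exp(\beta_0 + \beta x) &= \exp\!\Bigl(\beta_0 + (c\,\beta)\,\frac{x}{c}\Bigr), \qquad c = 10^{12} \\[4pt]
\text{dist:}\quad -6.87\times10^{-5}\times 10^{12} &= -6.87\times10^{7}
  && (\text{reported for dist\_0: } -9.84\times10^{7}) \\
\text{rgdp:}\quad 1.30\times10^{-24}\times 10^{12} &= 1.30\times10^{-12}
  && (\text{reported for rgdp\_0: } 4.84\times10^{-13}) \\
\text{rgdppc:}\quad 1.33\times10^{-8}\times 10^{12} &= 1.33\times10^{4}
  && (\text{reported for rgdppc\_0: } 8.61\times10^{3}) \\
\text{constant:}\quad 14.41 - \ln 10^{12} \approx 14.41 - 27.63 &= -13.22
  && (\text{reported for \_cons: } -11.34)
\end{align*}
```

If this reasoning holds, the very large coefficient on dist_0 is of the order that the division by 1e12 alone would produce, rather than a sign of a substantively different estimate.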
Any help would be appreciated.
Edit: I am trying to make the output more readable; so far I have attached pictures.