PPML, panel data - Statalist

Dilshat Obul

Join Date: Jun 2017

Posts: 8
#166

06 Nov 2017, 05:49

Dear Joao

Thank you, I will look at the related literature.

Best regards,
Dilshat
Comment
Jan Skreb

Join Date: Oct 2017

Posts: 12
#167

07 Nov 2017, 11:35

Hello everyone,
I have a question regarding specification in Stata using PPML.
I am writing my master's thesis using the gravity model of trade, and since I am using this model (and Stata) for the first time, I have run into some problems.
I tried to provide a sample of my data using -dataex-, but I get error r(1000) with the message "input statement exceeds linesize limit. Try specifying fewer variables", so I will describe the data briefly here. (I have Stata 12 SE.)
I have a balanced panel data set of 153 exporter and partner countries over 21 years (1995-2015). I am trying to estimate the effect of an RTA on trade using the classic dummy variable approach.
I am trying to estimate an equation with country-pair fixed effects and country time-varying fixed effects, as recommended in "An Advanced Guide to Trade Policy Analysis: The Structural Gravity Model" by Yotov, Piermartini, Monteiro and Larch (2016).
Following the aforementioned source, I would estimate this equation with PPML (and OLS) using the command:
-ppml tradevalue lgdp_exporter lgdp_partner ldist contig comlang_off colony comcol landlockedex landlockedpar rta cafta onein_cafta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta usa_cafta lxr_ex lxr_par lpop_ex lpop_par pair_* exportertime_* partnertime_*, cluster(dist)-
where "pair_*" would be country-pair dummies and "exportertime_*"/"partnertime_*" would be country time-varying dummies.
However, when I try to create the country-pair dummies using the command -quietly tab pairid, gen(pair_)- [pairid is a panel identifier based on country pairs] and include them in the regression as stated above, I get error 134, "too many values".
I increased both matsize and maxvar to their maximums (the country-pair dummies would add 153*152 = 23256 new variables, which is within the limits of Stata 12 SE), but I still get the same error.
I could not find the answer to this problem on the forum.
Does someone perhaps have a suggestion on how to solve the issue and create the dummies?
Thanks a lot!
Kind regards,
Jan
Comment
Tom Zylkin

Join Date: Nov 2016

Posts: 188
#168

07 Nov 2017, 12:13

Hi Jan,
For the issue with the number of fixed effects there are various commands that can help you get around the data size limit you are encountering, depending on which types of fixed effects you would like to use. For the case it sounds like you are describing with exporter-time, importer-time, and exporter-importer ("pair") fixed effects, there is a command I have written that specifically works well with this specification called "ppml_panel_sg".

To install ppml_panel_sg in Stata, you can install it directly from SSC by typing "ssc install ppml_panel_sg, replace". You can find a help file and some example .do files on my website: www.tomzylkin.com. There is also a companion paper by Larch, Wanner, Yotov, and Zylkin (2017) that I would appreciate you citing if you use the command.

Lastly, note that with the exporter-time and partner-time FEs, you will not be able to estimate the effects of GDPs or populations. Hope all this is helpful!
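
As a rough sketch of how the specification from the question might look with this command: the variable names are taken from the question, the abbreviated option names follow the usage that appears later in this thread, and the choice to keep only the agreement dummies is an assumption made for illustration.

```stata
* Sketch only; variable names are taken from the question above.
* exporter, partner, and year are assumed to identify the panel dimensions.
ssc install ppml_panel_sg, replace

* Exporter-time, importer-time, and pair fixed effects are handled by the
* command, so country-level regressors (GDP, population) are left out here.
ppml_panel_sg tradevalue rta cafta, ex(exporter) im(partner) y(year) cluster(pairid)
```

No pair_*, exportertime_*, or partnertime_* dummies need to be generated for this call, which is what sidesteps the "too many values" error.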

Regards,
Tom
Comment
Jan Skreb

Join Date: Oct 2017

Posts: 12
#169

08 Nov 2017, 04:57

Dear Mr. Zylkin,
thank you very much for your reply. If the command works, I will certainly cite it properly.
I have run the regression and it is still iterating. However, since it might take quite a long time, I would like to ask for your help in the meantime, as there is a slight problem.
I used this command: -ppml_panel_sg tradevalue lgdp_exporter lgdp_partner ldist contig comlang_off colony comcol landlockedex landlockedpar rta cafta onein_cafta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta usa_cafta lxr_ex lxr_par lpop_ex lpop_par, ex(exporter) im(partner) y(year) sym robust cluster(pairid)-
I am trying to estimate the effect of the CAFTA trade agreement on individual countries' trade, and I have therefore constructed a dummy for each member country (equal to 1 if the country of interest is the exporter and another CAFTA member is the partner). However, along with the GDP variables, distance, and all the other "gravity variables" (whose results I am not directly interested in), three of the individual country dummies have been omitted, and the results for these variables are important for my research.
The notes I get look like this: "note: usa_cafta omitted because of collinearity over lhs>0 (creates possible existence issue)".
Do you perhaps have an idea why this happened and how I can fix it?
Thank you very much for your help.
Kind regards,
Jan
Comment
Tom Zylkin

Join Date: Nov 2016

Posts: 188
#170

08 Nov 2017, 05:55

Hi Jan,
Note that your individual CAFTA dummies for each CAFTA member "span" all CAFTA-related trade flows. As a result, the other variables you have for CAFTA as a whole (cafta and, if I am interpreting it correctly, onein_cafta) cannot be separately identified, and the command is going to drop some of your CAFTA variables, working from right to left (which is why it drops usa_cafta). Try without these variables and let me know if it fixes your problem.

Another suggestion: it looks like your CAFTA variables are intended to be asymmetric, in that "slv_cafta" (for example) refers only to El Salvador's exports to other CAFTA countries, rather than its imports from CAFTA countries. In that case, I do not recommend using the "sym" option, as this is really only appropriate for identification if all main variables are symmetric with respect to the direction of trade.
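
One way to see the "spanning" problem directly is to check whether the seven country dummies add up to the overall cafta dummy in every row. This is only a sketch: it assumes all of these variables are coded 0/1 under the names used in the posts above, and cafta_sum is a hypothetical helper variable.

```stata
* Sketch only: assumes the seven country dummies and cafta are 0/1 variables
* named as in the posts above. cafta_sum is a hypothetical helper variable.
egen cafta_sum = rowtotal(cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta usa_cafta)

* If this count is 0, the country dummies add up to cafta in every row,
* so cafta is an exact linear combination of them.
count if cafta_sum != cafta
```

If the count is zero, at least one of the eight variables has to be dropped regardless of which estimator is used.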

Regards,
Tom
Comment
Jan Skreb

Join Date: Oct 2017

Posts: 12
#171

08 Nov 2017, 07:15

Dear Mr. Zylkin,
I have run the regression as you suggested. Using the same command as before but without the -sym- option, the same variables (the three individual CAFTA dummies) were omitted. Then I ran the same command (without -sym-) and also dropped the general CAFTA dummies (cafta = 1 if trade is between CAFTA members; onein_cafta = 1 if only one country in the pair is a CAFTA member). This was better, since only the usa_cafta dummy was omitted. (Ideally, however, I would like estimates for all 7 countries, given my research.) I also tried the regression from my previous post with -trend- instead of -sym-, and the same three CAFTA dummies were omitted.
Therefore, I will not use the -sym- option in further attempts. However, do you perhaps have any other suggestions on how to estimate effects for both of the general CAFTA dummies and the 7 individual CAFTA dummies?
Thank you for your help!
Kind regards,
Jan
Comment
Tom Zylkin

Join Date: Nov 2016

Posts: 188
#172

08 Nov 2017, 08:12

Hi Jan,
It is interesting that the usa_cafta dummy is still omitted. That does not sound right but it is hard to say without a close look at your data to see where the collinearity is coming from. Try running the following diagnostics:

Code:

reghdfe usa_cafta rta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta , noabsorb

reghdfe usa_cafta rta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta if tradevalue>0 , noabsorb

reghdfe usa_cafta rta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta , absorb(exporter#year partner#year exporter#partner)

reghdfe usa_cafta rta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta if tradevalue>0, absorb(exporter#year partner#year exporter#partner)

Are the R^2's of any of the above regressions equal to 1? If so, usa_cafta is perfectly predicted by your other dummy variables for some reason.
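
For the two diagnostics without fixed effects, the same check can be sketched with plain -regress-, which stores the R-squared in e(r2). This is an illustrative alternative rather than part of the suggested diagnostics, and it does not cover the two versions with absorbed fixed effects.

```stata
* Sketch only: plain OLS versions of the two diagnostics without fixed effects.
* Variable names are taken from the commands above.
regress usa_cafta rta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta
display "R-squared, all observations: " e(r2)

regress usa_cafta rta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta if tradevalue > 0
display "R-squared, positive trade only: " e(r2)
```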

Regards,
Tom

Last edited by Tom Zylkin; 08 Nov 2017, 08:14.
Comment
Jan Skreb

Join Date: Oct 2017

Posts: 12
#173

08 Nov 2017, 08:19

Dear Mr. Zylkin,
I tried to run the regressions you suggested.
Unfortunately, for the first two I get error 198: "option absorb() required", and for the second two I get error 109: "exporter: string variables may not be used as factor variables".
Do you have any other suggestions on how to find out whether usa_cafta is perfectly predicted by my other dummy variables?
Thank you for your help regardless!
Kind regards,
Jan
Comment

Tom Zylkin

Join Date: Nov 2016
Posts: 188

#174

08 Nov 2017, 08:26

Hi Jan,
My mistake. Try the following:

Code:

gen constant = 1
egen exp_id = group(exporter)
egen imp_id = group(partner)

reghdfe usa_cafta rta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta , absorb(constant)

reghdfe usa_cafta rta  cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta if tradevalue>0 , absorb(constant)

reghdfe usa_cafta rta cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta , absorb(exp_id#year imp_id#year exp_id#imp_id)

reghdfe usa_cafta rta  cri_cafta dom_cafta gtm_cafta hnd_cafta nic_cafta slv_cafta if tradevalue>0, absorb(exp_id#year imp_id#year exp_id#imp_id)

Comment

Jan Skreb

Join Date: Oct 2017

Posts: 12
#175

08 Nov 2017, 08:37

Dear Mr. Zylkin,
no, R^2 was not equal to 1 in any of the regressions. For the first two it was 0.0007 and for the second two it was 0.5014.
Comment
Tom Zylkin

Join Date: Nov 2016

Posts: 188
#176

08 Nov 2017, 09:43

Hi Jan,
OK, that's interesting. Feel free to email me at [email protected]. That may make it easier to get on the same page and figure out what's going on.
Regards,
Tom
Comment
Jan Skreb

Join Date: Oct 2017

Posts: 12
#177

08 Nov 2017, 09:45

Dear Mr. Zylkin,
thank you very much. I will e-mail you.
Kind regards,
Jan
Comment
Cengiz Tunc

Join Date: Nov 2017

Posts: 5
#178

29 Nov 2017, 01:12

Hi

I have a question about ppml. I have 17 years of exporter data at the HS 2-digit (h2) industry level with known destinations. There are no zero values in the export data. I have estimated my model in the following three ways, but they give me different results. Can anyone tell me why? In addition, the coefficients on the GDPs are not smaller than 1.

xi: ppml export_value ln_gdp_origin ln_gdp_dest ln_rcpi ln_exc_rate ln_vol_a av_bil_tw_vol_a i.origin_id i.dest_id i.h2 i.y, cluster(panelid)

---------------------------------------------------------------------------------
                |               Robust
   export_value |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
  ln_gdp_origin |   1.536049   .2289425     6.71   0.000      1.08733    1.984768
    ln_gdp_dest |   2.407872   .2354777    10.23   0.000     1.946344      2.8694
        ln_rcpi |   .0639252   .0911668     0.70   0.483    -.1147584    .2426088
    ln_exc_rate |   .1379881   .0539549     2.56   0.011     .0322386    .2437377
       ln_vol_a |  -.2204741   .0687946    -3.20   0.001     -.355309   -.0856393
av_bil_tw_vol_a |   -.080184   .0201617    -3.98   0.000    -.1197002   -.0406678

xtpoisson export_value ln_gdp_origin ln_gdp_dest ln_rcpi ln_exc_rate ln_vol_a av_bil_tw_vol_a, fe vce(robust)

                |               Robust
   export_value |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
  ln_gdp_origin |   1.445177   .1493351     9.68   0.000     1.152486    1.737869
    ln_gdp_dest |   1.731971   .1290483    13.42   0.000     1.479041    1.984901
        ln_rcpi |  -.0035827   .0796969    -0.04   0.964    -.1597858    .1526205
    ln_exc_rate |   .0329676   .0250206     1.32   0.188    -.0160718     .082007
       ln_vol_a |   .0001654   .0057755     0.03   0.977    -.0111543    .0114852
av_bil_tw_vol_a |   .0086021   .0059754     1.44   0.150    -.0031094    .0203136

xi: xtpqml export_value ln_gdp_origin ln_gdp_dest ln_rcpi ln_exc_rate ln_vol_a av_bil_tw_vol_a, fe

   export_value |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
export_value    |
  ln_gdp_origin |   1.445177   .0384129    37.62   0.000      1.36989    1.520465
    ln_gdp_dest |   1.731971     .01654   104.71   0.000     1.699554    1.764389
        ln_rcpi |  -.0035827   .0353225    -0.10   0.919    -.0728134    .0656481
    ln_exc_rate |   .0329676   .0033537     9.83   0.000     .0263944    .0395408
       ln_vol_a |   .0001654   .0009509     0.17   0.862    -.0016982    .0020291
av_bil_tw_vol_a |   .0086021   .0014416     5.97   0.000     .0057766    .0114276
Comment
Said Jafar

Join Date: Feb 2015

Posts: 109
#179

29 Nov 2017, 04:10

Dear Cengiz,

Welcome to the forum.

First, in commands starting with xt, the fe option uses fixed effects for the panel unit (here the bilateral pair), not individual (importer, exporter) fixed effects. So the results are bound to differ.

Second, year and HS 2-digit fixed effects are missing in the xtpoisson and xtpqml versions.

Third, you are not clustering in the xtpqml version. If you cluster and do not include year and HS 2-digit fixed effects, you will get the same results as in xtpoisson.

I would also like to say that your model is a little unusual. Your dependent variable is trade, but most of the independent variables seem to be related to international finance. We usually find those variables in gravity models for FDI. Your GDP coefficients will possibly decrease if you account for important determinants of international trade, such as distance, common border, common language, colonial ties, and so on.
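
To make the second point concrete, a minimal sketch of the xtpoisson specification with year dummies added might look like the following. It assumes the data have been xtset on panelid and y, as the variable names in the question suggest, and that y is the year variable; it is an illustration, not a recommended final specification.

```stata
* Sketch only: variable names are taken from the question above.
* panelid is assumed to be the panel identifier and y the year variable.
xtset panelid y

* Same regressors as in the question, with year dummies added via i.y.
xtpoisson export_value ln_gdp_origin ln_gdp_dest ln_rcpi ln_exc_rate ///
    ln_vol_a av_bil_tw_vol_a i.y, fe vce(robust)
```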

Best,
Comment
Cengiz Tunc

Join Date: Nov 2017

Posts: 5
#180

29 Nov 2017, 05:50

Thank you Dias for your response.

Glad to be here.

I determined the panel settings

egen panelid = group(origin_id dest_id h2)

xtset panelid y

. xtset
panel variable: panelid (unbalanced)
time variable: y, 1997 to 2014, but with gaps
delta: 1 unit
Isn't the HS 2-digit level already accounted for by the panel setting? Do I still need to include those fixed effects explicitly?

Last edited by Cengiz Tunc; 29 Nov 2017, 06:44.
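
Since panelid is built from origin_id, dest_id, and h2, h2 cannot vary within a panel, so panel fixed effects would already absorb anything that is constant within an origin-destination-product group, including product-level effects. Year effects vary within a panel and are not absorbed. A quick way to confirm the first part, sketched with the names from the post above:

```stata
* Sketch only: confirms that h2 is constant within each panelid group.
* The assertion passes silently when that is the case.
bysort panelid (h2): assert h2[1] == h2[_N]

* Year effects are not absorbed by the panel fixed effects, so they would
* still need to be added explicitly, for example with i.y.
```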
Comment
