Thank you for sending the data. Those variables are dropped because other variables have missing values when they are equal to 1. So, after dropping the missings, those dummies have no variation. In a next update of -ppml- I'll try to find a way of providing more helpful warnings in these cases.
Best wishes,
Joao
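One way to see this in your own data is to tabulate the dummy over only those observations that have no missing values in any model variable. The sketch below uses a small made-up dataset; the names y, x1 and d are hypothetical stand-ins for the dependent variable, a regressor with missing values, and the dummy that gets dropped.

```stata
* Toy data (hypothetical): x1 is missing exactly where the dummy d equals 1
clear
input y x1 d
1 2 0
3 . 1
5 4 0
2 . 1
end

* Count missing values across all model variables, row by row
egen byte n_miss = rowmiss(y x1 d)

* Tabulate the dummy in the usable sample only;
* a single row in this table means the dummy has no variation left
tabulate d if n_miss == 0
```

If the table shows only one value of the dummy, its coefficient cannot be estimated in that sample.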
Dear Prof Joao,
I'm having the same problem as Majid. Could you advise me on what I should do to include this dummy (BHP)? It is an important variable for my study.
. ppml ExportUSDmil imp_time lGDPmj lGDPCmj ldist ler lhc CNTG BHP BMP, clu(DIST)
note: checking the existence of the estimates
WARNING: imp_time has very large values, consider rescaling or recentering
WARNING: lGDPmj has very large values, consider rescaling or recentering
WARNING: lGDPCmj has very large values, consider rescaling or recentering
note: starting ppml estimation
note: ExportUSDmil has noninteger values
Number of parameters: 9
Number of observations: 144
Number of observations dropped: 0
Pseudo log-likelihood: -22240.056
R-squared: .9268023
(Std. Err. adjusted for 18 clusters in DIST)
Robust
ExportUSDmil Coef. Std. Err. z P>z [95% Conf. Interval]
I have followed many of your comments and advice on PPML estimation of gravity models, and thank you for such a useful contribution.
Yet I still have the problem of excluded regressors, similar to Majid. Let me briefly describe my study and the problem; I would appreciate your advice.
I am conducting research on the "impacts of human capital on value-added trade for East Asia economies". I employ a gravity model to estimate the coefficients. I have 11 East Asia economies as exporters and 54 partner countries for the 2005-2015 period. I treat it as panel data by using xtset year. I use two estimation methods, OLS with fixed effects and PPML with fixed effects, and include exporter-time and importer-time fixed effects in both. My dependent variable is value-added exports (vae) and the key independent variables are mean years of schooling (mys) and quality of education (edu_qual). Below are the Stata commands for both estimation methods:
I conducted the RESET test as suggested on your 'The Log of Gravity' page, and the result favors the PPML estimator. The big problem, though, is that my key regressors (mys and edu_qual) are excluded along with ln_infra. Below is the output from Stata:
Number of regressors excluded to ensure that the estimates exist: 535
Excluded regressors: ln_year_schl ln_pisa_score ln_infra
I tried a collinearity diagnosis and dropped ln_infra from the estimation, yet the estimation still excludes my main regressors (mys and edu_qual).
My questions are as follows:
(1) Is there any solution that can fix this issue?
(2) Am I using the appropriate fixed effects and the right Stata command for PPML?
As advised, I ran the same specification with the ppmlhdfe command, as below:
ppmlhdfe dva ln_output_o ln_output_d ln_distw contig comlang_off fta_wto ln_year_schl ln_pisa_score ///
ln_inc_gap ln_tariff_face ln_infra ln_export_time, a(EXPORTER_TIME_FE* IMPORTER_TIME_FE*) cluster (pair_id)
The result is that many more regressors are excluded, as shown below:
(warning: absorbing 704 dimensions of fixed effects; check that you really want that)
note: 7 variables omitted because of collinearity: ln_output_o ln_output_d ln_year_schl ln_pisa_score ln_tariff_face ln_infra ln_export_time
I still want to stick with this specification, as it is well motivated by theory, yet I relaxed the fixed effects a bit.
I changed the exporter-time and importer-time fixed effects to just exporter and importer fixed effects and ran the following estimation, and it works: no regressors are dropped.
My concern is that I am not sure whether accounting for only exporter and importer fixed effects can capture the real effect.
I would appreciate your comments and advice on this.
Those variables are dropped because they are collinear with the fixed effects. So, it is up to you to decide whether you include time-varying fixed effects and do not estimate the coefficients on those variables, or include just Exp and Imp fixed effects and estimate those coefficients.
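To check whether a given regressor is absorbed by the exporter-time fixed effects, you can test whether it ever varies within an exporter-year cell. This is only a sketch on made-up data: exporter, importer, year and schooling are hypothetical names, so substitute your own identifiers and the variable you care about (for example ln_year_schl).

```stata
* Toy data (hypothetical): schooling varies by exporter and year, not by importer
clear
input exporter year importer schooling
1 2005 10 8.1
1 2005 11 8.1
1 2006 10 8.3
1 2006 11 8.3
2 2005 10 9.0
2 2005 11 9.0
2 2006 10 9.2
2 2006 11 9.2
end

* One ID per exporter-year cell
egen exp_time = group(exporter year)

* Within each cell, compare the smallest and largest value of the regressor
bysort exp_time (schooling): generate byte varies = schooling[1] != schooling[_N]

* If varies is 0 everywhere, the regressor is collinear with the exporter-time fixed effects
tabulate varies
```

The same check with importer-year cells applies to variables that describe the importer only.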
The "absorbing 704 dimensions of fixed effects; check that you really want that" message suggests that you may be specifying the fixed effects in a way other than intended. It sounds like you have 704 variables in your data set that start with either "EXPORTER_TIME_FE" or "IMPORTER_TIME_FE". But really, all you need here is two variables, one with a unique ID for each exporter-year and one with a unique ID for each importer-year.
Here is a simple example you may be able to follow:
egen exp_time = group(exporter year)
egen imp_time = group(importer year)
where "exporter" and "importer" should be replaced by whatever variables you are using to identify the exporter and importer. However, as Joao rightly says, the variables "ln_output_o" and "ln_output_d" look like they should be collinear with your exporter-time and importer-time fixed effects. If you instead want to have exporter and importer fixed effects only, you should be able to type
Again, this is assuming you have two numerical ID variables called "exporter" and "importer" that respectively identify the exporter and importer countries.
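For reference, here is a sketch of what the two kinds of call might look like with the variable names from the earlier post. It is not necessarily the exact command intended above. It assumes that exp_time and imp_time were created with the egen lines shown earlier, that "exporter" and "importer" are the placeholder ID variables, and that ppmlhdfe is installed. The first call keeps only the regressors that ppmlhdfe did not report as omitted in the log above.

```stata
* Exporter-time and importer-time fixed effects:
* only regressors that vary across pairs are kept
ppmlhdfe dva ln_distw contig comlang_off fta_wto ln_inc_gap, ///
    absorb(exp_time imp_time) cluster(pair_id)

* Exporter and importer fixed effects only:
* country-time regressors can stay in the model
ppmlhdfe dva ln_output_o ln_output_d ln_distw contig comlang_off fta_wto ///
    ln_year_schl ln_pisa_score ln_inc_gap ln_tariff_face ln_infra ln_export_time, ///
    absorb(exporter importer) cluster(pair_id)
```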
Thank you so much for the precise elaboration and suggestions on my puzzle, on top of Joao's advice.
I will closely follow your suggested commands and will report back later on the results.
I am currently completing my Master's dissertation on the impact of the reform of the rules of origin on EU imports from African EBA countries. I am using panel data, with imports from 34 African countries to the EU, over the course of 17 years - each country has around 26 000 products per year, because I am using imports at the HS6 level.
I am attempting to use the new command ppmlhdfe to conduct my robustness check, but I keep getting the following error:
my regression:
ppmlhdfe ln_imports after_reform, a(id)
my country fixed effect:
egen id=group(origin)
the error:
remove_collinears(): 3499 selectindex() not found
GLM::init_variables(): - function returned error
<istmt>: - function returned error
Apparently, the "specified variable or function could not be found".
Could someone please help me figure out this issue?
The error you're receiving refers to a Mata function called "selectindex" that has only been available since Stata 13. If you are using an older version of Stata, you will need to follow the procedure outlined in this post.
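A quick way to confirm this is to check the Stata version you are running and call the function directly from Mata. This is just a diagnostic sketch; the vector passed to selectindex() is arbitrary.

```stata
* Show the version of Stata that is running
display c(stata_version)

* Call the Mata function directly; on Stata 13 or later this lists
* the positions of the nonzero elements, on older versions it fails
* with the same "not found" error
mata: selectindex((0, 1, 0, 1))
```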