Hi all,
In a dataset with one observation per firm, year, and destination, I try two different ways of predicting trade flows.
The dataset has a lot of zeros. Therefore, I started by estimating a PPML model:
Approach 1: PPML
Code:
* PPML on all observations, zeros included; d stores the sum of the absorbed fixed effects
ppmlhdfe exports log(distance) ... , vce(r) d absorb(industry firm_id)
predict pred_exps1 if year == 2005, mu
An alternative approach is to take only strictly positive trade flows into account and estimate the model in logs. After this, I predict trade flows in levels:
Approach 2: Log-model
Code:
* OLS in logs on strictly positive flows; resid is needed for predict, xbd
reghdfe log_exports log(distance) ... , vce(r) absorb(industry firm_id) resid
predict pred_exps2_biased if year == 2005, xbd
* retransform from logs to levels with the log-normal correction factor
gen pred_exps2 = exp(pred_exps2_biased) * exp(0.5 * e(rmse)^2) if year == 2005
(The dataset runs from 2005 to 2019.)
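For reference, a sketch of what the two predictions compute, assuming the log-model error is normal and homoskedastic with variance \sigma^2 (an assumption that is not stated in the post). The symbols x_i, d_i (sum of the absorbed fixed effects) and \sigma are notation introduced here for illustration, not taken from the commands' documentation.

```latex
\begin{align*}
% Approach 1 (PPML, all observations including zeros): fitted conditional mean
\hat{y}^{(1)}_i &= \exp\!\left(x_i'\hat{\beta}^{\mathrm{PPML}} + \hat{d}^{\mathrm{PPML}}_i\right) \\[4pt]
% Approach 2 (log model, y_i > 0 only), assuming \varepsilon_i \sim N(0, \sigma^2)
\log y_i &= x_i'\beta + d_i + \varepsilon_i
\;\Longrightarrow\;
E\!\left[y_i \mid x_i, d_i, y_i > 0\right] = \exp\!\left(x_i'\beta + d_i\right)\exp\!\left(\sigma^2/2\right) \\[4pt]
% Sample analogue used for pred_exps2, with \hat{\sigma} taken from e(rmse)
\hat{y}^{(2)}_i &= \exp\!\left(x_i'\hat{\beta}^{\mathrm{OLS}} + \hat{d}^{\mathrm{OLS}}_i\right)\exp\!\left(\hat{\sigma}^2/2\right)
\end{align*}
```

Because \exp(\hat{\sigma}^2/2) is a single positive constant, it rescales pred_exps2 without changing its correlation with pred_exps1.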
How do I get the right standard errors in the PPML case, so that the two predictions become comparable? Should I not expect the same number of predicted observations and a correlation of 1 between pred_exps1 and pred_exps2?
Best,
Kathrin