PPML, panel data - Statalist

John Alice

Join Date: Feb 2019

Posts: 28
#286

28 Jul 2019, 07:52

Dear Mr Silva
Hello! When I run the ppml command, Stata reports the error "dependent variable xij has negative values" r(499). My dependent variable is the value of bilateral services trade between countries. I assumed that the "negative values" referred to the unreported trade flows between countries, so I recoded the unreported amounts as missing. However, the same error appeared when I reran the ppml command. How should I interpret "negative values"?
Many thanks!
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3027
#287

28 Jul 2019, 10:08

Dear John Alice,

The negative values are just that: negative values. I do not see how you can have negative values with trade data; please check your data.

Best wishes,

Joao
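
Editor's note: a minimal sketch of how one might look for the negative values the error refers to. The toy data below are hypothetical and only make the example self-contained; with a real dataset in memory, skip the input block and run the last three commands on your own dependent variable (here assumed to be called xij, as in the error message).

```stata
* Hypothetical toy data standing in for the real trade variable
clear
input xij
120
0
-1
.
45
end

* Summary statistics: a negative minimum confirms the problem
summarize xij

* Number of observations with a negative dependent variable
* (Stata treats missing values as larger than any number,
*  so missing observations are not counted here)
count if xij < 0

* Inspect the offending observations
list if xij < 0
```

Recoding unreported flows as missing does not remove such observations, which would be consistent with the error persisting.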
Comment
JIA JIA LIN

Join Date: Jun 2019

Posts: 16
#288

30 Jul 2019, 00:11

Good morning Mr. Santos Silva,
Sorry to bother you; I have a question about PPML.
I am using a gravity model to study internal migration in China, and there are no zero flows. Compared with the standard Poisson model, is it correct that the advantage of the PPML method is that the dependent variable does not need to be Poisson distributed, and that PPML can deal with heteroskedasticity while the standard Poisson model cannot?

with many thanks and regards,
JIA JIA LIN
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3027
#289

30 Jul 2019, 01:19

Dear JIA JIA LIN,

PPML is just Poisson with valid standard errors; the estimates are exactly the same.

Best wishes,

Joao
Comment
JIA JIA LIN

Join Date: Jun 2019

Posts: 16
#290

30 Jul 2019, 01:49

Dear Mr. Santos Silva,

I am a little confused: the standard errors from ppml are always larger than those from OLS and the standard Poisson model. Why are the larger standard errors from ppml the valid ones?

with many thanks and regards,
JIA JIA LIN
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3027
#291

30 Jul 2019, 03:09

The other standard errors are too optimistic and therefore invalid.

Joao
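
Editor's note: a minimal sketch, on simulated data, of why the default Poisson standard errors can be too optimistic. All variable names and parameter values are assumptions chosen for illustration; the outcome is deliberately overdispersed, so the Poisson variance assumption fails while the exponential conditional mean is still correct.

```stata
clear
set seed 12345
set obs 2000

* Hypothetical regressor
generate x = rnormal()

* Multiplicative gamma heterogeneity with mean 1 (shape 2, scale 0.5)
* makes y overdispersed relative to the Poisson distribution
generate u = rgamma(2, 0.5)
generate y = rpoisson(exp(1 + 0.5*x) * u)

* Default standard errors rely on the Poisson variance assumption
poisson y x
estimates store default_se

* Robust standard errors do not rely on that assumption
poisson y x, vce(robust)
estimates store robust_se

* Side-by-side comparison of coefficients and standard errors
estimates table default_se robust_se, b(%9.4f) se(%9.4f)
```

The coefficients should be identical in the two columns; only the standard errors should differ, with the robust ones expected to be larger in this overdispersed setting.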
1 like
Comment
JIA JIA LIN

Join Date: Jun 2019

Posts: 16
#292

30 Jul 2019, 03:53

Dear Mr. Santos Silva,

I am really grateful that you answered my questions; it is very helpful for my dissertation.

with many thanks and regards,
LIN
Comment
JIA JIA LIN

Join Date: Jun 2019

Posts: 16
#293

01 Aug 2019, 04:03

Good morning Mr. Santos Silva,

Sorry to bother you again; I have a new question about PPML.
I am using a gravity model to analyze migration and I have no zero flows. I find that if I cluster the standard errors, the POISSON and PPML results are the same, and the standard errors are also the same.

The Stata commands are:
1. poisson y lnx1 lnx2 lnx3 lnx4 lnx5 ......, cluster (lnx3) where y is the number of migrants, x1 and x2 are the populations at origin and destination, x3 is distance, and x4, x5, etc. are the other explanatory variables.

2. ppml y lnx1 lnx2 lnx3 lnx4 lnx5 ......, cluster (lnx3)

I get the same results from the two commands. Does this mean that, if I have no zero flows, ppml is equivalent to poisson? In that case it does not make sense to use ppml; is that correct?

with many thanks and regards,
LIN
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3027
#294

01 Aug 2019, 04:12

Dear JIA JIA LIN,

Poisson with robust or clustered standard errors is PPML; the ppml command has some advantages over the standard poisson command, but in most cases they lead to the same results if the same standard errors are used in both commands.

Best wishes,

Joao
Comment
JIA JIA LIN

Join Date: Jun 2019

Posts: 16
#295

01 Aug 2019, 04:24

Good morning Mr. Santos Silva,

So I could also choose ppml, and I could write that ppml is better than poisson because I do not need to specify robust or clustered standard errors; is that correct? My teacher said the standard Poisson model is fine, but I really like the ppml command and want to use it in my dissertation, so these days I have been trying to find the advantages of ppml over the standard poisson command.

with many thanks and regards,
lin
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2207
#296

01 Aug 2019, 06:20

Lin: You seem to be missing the point that there is only one estimator, and that is the Poisson pseudo- (or quasi-) MLE. There are now different Stata commands that produce the same estimator. In addition to -poisson- and -ppml- you can throw in -glm- with the fam(poisson) option. Since it's the same estimator produced by -poisson-, -ppml-, and -glm-, it makes no sense to "prefer" one command to the other.

For a long time -- since 1984 in econometrics at least -- we have known that the Poisson pseudo-MLE is fully robust to distributional misspecification. It consistently estimates the parameters in the correctly specified mean -- in this case, an exponential conditional mean. It works for any kind of nonnegative outcome, even if you see no zeros.

The only issue is how one computes the standard errors of the one estimator. With recent versions of Stata, -poisson- allows both a vce(robust) option, essentially a necessity for cross-sectional data, and a vce(cluster id) option, essentially necessary for panel data and other clustered sampling structures. The precise calculations of the standard errors may differ across commands and packages. But those are implementation-specific details that produce, at most, minor differences. A few versions ago, the Stata -poisson- command insisted that the full Poisson assumptions held, and it did not allow robust and cluster options in computing standard errors. But that has changed.

Remember: There's only one estimator! The command you use to obtain the estimator and valid standard errors is irrelevant as long as you get them right.

JW
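
Editor's note: a minimal sketch of the point that the three commands produce the same estimator, using a simulated clustered dataset. The variable names (id, x, y) and parameter values are hypothetical; ppml is a user-written command that has to be installed first.

```stata
* ppml is user-written; install once with:
*   ssc install ppml

clear
set seed 12345

* Hypothetical panel: 200 units observed 5 times each
set obs 200
generate id = _n
expand 5

generate x = rnormal()
generate y = rpoisson(exp(1 + 0.5*x))

* Same Poisson pseudo-MLE, three commands, cluster-robust standard errors
poisson y x, vce(cluster id)
glm y x, family(poisson) link(log) vce(cluster id)
ppml y x, cluster(id)
```

The point estimates should agree across the three outputs up to convergence tolerance; any differences in the reported standard errors should be minor and reflect implementation details such as finite-sample adjustments.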
Comment
JIA JIA LIN

Join Date: Jun 2019

Posts: 16
#297

01 Aug 2019, 07:37

Dear Jeff Wooldridge,
I am really grateful for your reply; it is now very clear to me.

with many thanks and regards,
LIN
Comment
JIA JIA LIN

Join Date: Jun 2019

Posts: 16
#298

03 Aug 2019, 23:32

Good morning Mr. Santos Silva,

When I read your papers and those of other authors, I find that few studies deal with potential endogeneity or report robustness checks. Maybe the reason is that suitable instruments are often difficult to find, and it is common practice to use 'deep lagging' of right-hand-side variables as a statistically acceptable approach in cross-section and panel models of gross migration flows. Given this, could I simply ignore potential endogeneity and robustness checks in the migration research for my paper?

with many thanks and regards,
LIN
Comment
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#299

09 Aug 2019, 05:00

I have a further question on the use of dummies for fixed effects.
Different papers I have seen use a combination or selection of dummies for years, countries, or country pairs.
Somewhat earlier in this thread there was a reference to Richard Baldwin's writing on the topic, which I believe is this paper: https://www.nber.org/papers/w12516.pdf

That paper suggests the gold standard, for panel data, is to include both time-variant country dummies and time-invariant country pair effects.
Although I won't say I entirely understand the statistical evidence for it, I am a bit baffled by this advice. I'm using global trade data, as is fairly standard, with circa 225 exporters and importers.
With 10 years of data, time-variant dummies for importers and exporters would mean 2*225*10 = 4,500 dummies. That's a bit silly, but with a decent machine, and a fair amount of patience, regressions with such amounts of variables will finish running at some point.
But including country-pair dummies? That would be 225*225=50,625 further variables. Apart from memory issues with even creating such a dataset, I'm quite doubtful regressions would finish running on such a set even if you have some sort of big server to run this on?

My current approach is to leave out those country-pair effects and make a note in my paper that including country-pair effects would be considered gold-standard (Baldwin 2006), but this is well beyond the practical capabilities of most machines, and isn't going to impact causal inference.
Would that be a correct/sensible approach? Or am I misunderstanding the suggestion Baldwin makes about dummies?
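
Editor's note: the dummy counts in this post can be reproduced with Stata's display command. The figures of 225 countries and 10 years are taken from the post; the last line is an added variant that excludes a country paired with itself.

```stata
* Exporter-year plus importer-year dummies: 2 * 225 * 10 = 4,500
display 2 * 225 * 10

* All ordered country pairs, including a country with itself: 50,625
display 225 * 225

* Ordered country pairs excluding own-country pairs: 50,400
display 225 * 224
```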
Comment
Tom Zylkin

Join Date: Nov 2016

Posts: 188
#300

09 Aug 2019, 06:49

Hi Jorrit,
To the practical aspects of your question, please read this paper by myself, Mario Larch, Joschka Wanner, and Yoto Yotov that specifically addresses the type of computational problem you are referencing. These types of problems used to be commonplace in the literature (as well as on this forum), but thankfully are now largely resolved. If you want to estimate any of these specifications quickly and easily, there are now two different Stata commands you can use, ppml_panel_sg (the command we introduced along with the paper) and ppmlhdfe (a newer command that gives you more flexibility).

You also asked why we need all of these dummies. The answer does not have to do with statistical evidence; rather it has to do with omitted variable bias. A famous paper by Anderson and van Wincoop (2003) showed that "traditional" gravity estimates that only used simple covariates like GDP, distance, etc. were biased because they did not account for how each country's trade is a function of how well-positioned it is geographically to trade with other countries - and each country's geographic factors are in turn a function of other countries' geographic advantages. In the paper, these geographic parameters are given the name "multilateral resistances", a term you will see everywhere in the trade literature because failing to account for them leads to very badly biased estimates. In the wake of Anderson and van Wincoop's contribution, it is now standard to use country-time fixed effects to account for these multilateral resistance terms, as in the paper you linked by Richard Baldwin and Daria Taglioni.

There is also a practitioner's guide recently put out by the WTO that provides an excellent intro to these and other estimation issues associated with gravity estimation, as well as an excellent handbook chapter by Keith Head and Thierry Mayer that gives an overview of the history of how gravity estimation has evolved over time.

I hope this is helpful!

Regards,
Tom

Last edited by Tom Zylkin; 09 Aug 2019, 06:56.
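
Editor's note: a minimal sketch of the kind of specification discussed in this post, with exporter-year, importer-year, and country-pair fixed effects absorbed by ppmlhdfe rather than entered as explicit dummies. The panel is simulated and every variable name (exporter, importer, year, fta, trade, pair_id) is hypothetical; ppmlhdfe and its dependencies are user-written and have to be installed first.

```stata
* User-written packages; install once with:
*   ssc install ftools
*   ssc install reghdfe
*   ssc install ppmlhdfe

clear
set seed 2019

* Hypothetical panel: 20 exporters x 20 importers x 5 years
set obs 20
generate exporter = _n
expand 20
bysort exporter: generate importer = _n
expand 5
bysort exporter importer: generate year = 2000 + _n
drop if exporter == importer

* Hypothetical time-varying pair-level covariate and simulated flow
generate fta = runiform() < 0.2
generate trade = rpoisson(exp(1 + 0.3*fta + 0.1*rnormal()))

* Identifier for clustering by country pair
egen pair_id = group(exporter importer)

* Fixed effects are absorbed, so no dummy variables are created
ppmlhdfe trade fta, absorb(exporter#year importer#year exporter#importer) vce(cluster pair_id)
```

Time-invariant pair covariates such as distance are collinear with the pair fixed effects, so only time-varying pair-level regressors (here the hypothetical fta) can be identified in this specification.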
1 like
Comment
