PPML,panel data

Jason hsu

Join Date: May 2015

Posts: 5
#1

09 Jul 2015, 08:36

Dear Statalist users,

I'm preparing my master's thesis, whose objective is to assess the impact of several factors on international patent collaborations, using a panel dataset of 14 countries over 23 years. I would like to estimate the model with the ppml estimator, but I do not know how to combine ppml with panel data.

I have run the following code:

gen code=0
replace code=1 if country=="CA"
replace code=2 if country=="CN"
replace code=3 if country=="DE"
replace code=4 if country=="GB"
replace code=5 if country=="HK"
replace code=6 if country=="IL"
replace code=7 if country=="IN"
replace code=8 if country=="JP"
replace code=9 if country=="KR"
replace code=10 if country=="MY"
replace code=11 if country=="NL"
replace code=12 if country=="SG"
replace code=13 if country=="TH"
replace code=14 if country=="US"
xtset code year

(1) ppml dep.var. indep.var. year dummies. country dummies.
(2) ppml dep.var. indep.var. year dummies.
(3) ppml dep.var. indep.var.

I'm wondering whether (1), (2) and (3) are correct, or whether there is other code I should use instead.

Thank you all in advance.
Best regards,
Jason Hsu
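The manual recoding in #1 can probably be written more compactly. The following is a minimal sketch, assuming country is a string variable holding the two-letter codes; the four-row dataset is made up purely for illustration. -encode- numbers the categories in alphabetical order, which matches the ordering used in the post.

```stata
* Sketch only: toy data standing in for the real panel
clear
input str2 country int year
"CA" 2000
"CA" 2001
"US" 2000
"US" 2001
end

* Create a labelled numeric identifier from the string variable
encode country, gen(code)

* Declare the panel structure
xtset code year
```

An alternative with the same effect on the panel declaration is egen code = group(country), which produces an unlabelled numeric identifier.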
Joao Santos Silva

Join Date: Apr 2014

Posts: 2779
#2

09 Jul 2015, 12:06

Dear Jason,

First of all, thanks for using -ppml-

What you are doing is correct, but in the first case you can speed up the estimation by using

Code:

xtset country
xtpoisson dep.var. indep.var. year dummies, fe

All the best,

Joao
Jason hsu

Join Date: May 2015

Posts: 5
#3

10 Jul 2015, 12:23

Dear João,

Thank you very much for the reply

I have another question.

I have used panel data and run the following code:

xtset country year
(1) ppml dep.var. indep.var
(2) xtpoisson dep.var indep.var

The results of regressions (1) and (2) are not the same.

How do I account for panel data with -ppml-?

Thank you!
All the best,
Jason Hsu.
Joao Santos Silva

Join Date: Apr 2014

Posts: 2779
#4

10 Jul 2015, 12:52

Dear Jason,

The commands that should produce the same results are as follows:

Code:

xi:ppml dep.var. indep.var i.country

and

Code:

xtset country year
xtpoisson dep.var indep.var

All the best,

Joao
Jason hsu

Join Date: May 2015

Posts: 5
#5

11 Jul 2015, 10:16

Dear João,

Thank you for your time and attention.

I have run the following two commands:

Code:
xi:ppml dep.var. indep.var i.country

and

Code:
xtset country year
xtpoisson dep.var indep.var

I encountered a problem: some of my variables were dropped or omitted.

How do I deal with this?

My best regards
Jason Hsu.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 1895
#6

11 Jul 2015, 10:38

Jason: Probably some of your variables don't vary over time for any country, and so Stata decides arbitrarily to drop some variables -- in this case, some of the year dummies. This is why you should use xtpoisson with the fe option. Then you will know for sure which variables have no time variation, because they will be dropped. xtpoisson uses a kind of within transformation rather than estimating coefficients on dummies.

Also, you should use a robust variance matrix:

xtpoisson y x1 ... xK, fe vce(robust)

With small N (N = 14) this estimator is difficult to justify, but it's better than assuming the Poisson distributional assumption holds and that you don't have serial correlation. Why do you insist on using ppml when the xtpoisson command now does what you want?

As a final comment, I wouldn't believe most results unless you include a full set of year dummies:

xtpoisson y x1 ... xK i.year, fe vce(robust)

JW
Joao Santos Silva

Join Date: Apr 2014

Posts: 2779
#7

11 Jul 2015, 11:03

Sorry, Jason, in my second post I forgot the -fe- option in the -xtpoisson- command. If you do this, -ppml- and -xtpoisson- will give you exactly the same estimates (but not the same standard errors, more on this below).

About the dropped variables, if you include the country dummies, or country fixed effects, variables that vary only by country (such as distance) will be dropped.

Finally, on Jeff's question about why it may be interesting to use -ppml- instead of -xtpoisson-, the following example may help to clarify the usefulness of -ppml-:

Code:

use http://privatewww.essex.ac.uk/~jmcss/mock
xi: ppml y x z i.w
xtset w
xtpoisson y x z, fe

This example illustrates three points:

a) The estimates of the coefficient on x are the same; this is as expected;

b) -xtpoisson- does not recognize that the coefficient on z is not identified and tries to estimate it; -ppml- correctly drops that regressor and the observations with z==1;

c) As far as I understand, -xtpoisson- with robust standard errors always clusters by the variable defining the panel, which may not make sense (as is the case here). That is, -xtpoisson- does not allow you to compute the usual robust standard errors, which are the default in -ppml-.

So, although it is computationally more expensive, there are cases where -ppml- is preferable to -xtpoisson-, even in situations where in principle the results should be the same.

All the best,

Joao

Last edited by Joao Santos Silva; 11 Jul 2015, 11:53. Reason: Included example and extended discussion.
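The equivalence of the point estimates described in #7 can be checked on simulated data. The following is a minimal sketch, not the example from the post: the variable names (id, year, x, a, y) and the data-generating process are assumptions made for illustration, and official -poisson- with explicit dummies stands in for -ppml-, which is a user-written command that has to be installed separately.

```stata
* Sketch on simulated data; id, year, x, a and y are hypothetical names
clear
set seed 12345
set obs 14                        // 14 "countries"
gen id = _n
gen a = rnormal()                 // unobserved country effect
expand 23                         // 23 years per country
bysort id: gen year = 1992 + _n
gen x = rnormal() + 0.5*a         // regressor correlated with the effect
gen y = rpoisson(exp(0.5*x + a))
xtset id year

* Poisson with explicit country and year dummies
poisson y x i.id i.year, vce(robust)
scalar b_dummies = _b[x]

* Conditional fixed-effects Poisson with year dummies
xtpoisson y x i.year, fe vce(robust)
scalar b_fe = _b[x]

* The two slope estimates should agree up to numerical tolerance
display b_dummies "  " b_fe
```

With -ppml- installed, the first estimation could instead be run as xi: ppml y x i.id i.year, following the syntax used in this thread. The standard errors are not expected to match, for the reasons given in point (c).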
Jason hsu

Join Date: May 2015

Posts: 5
#8

12 Jul 2015, 15:08

To: Jeff

Thank you very much for the reply.

I will try your suggestions.

As for why I insist on using ppml: I was wondering whether the results of -ppml- and -xtpoisson- are the same with panel data.

To: João

Again, thank you for your time and attention.

The example helped me to understand the usefulness of -ppml-.

Best regards,
Jason Hsu.
Joao Santos Silva

Join Date: Apr 2014

Posts: 2779
#9

14 Oct 2015, 14:13

PS: if someone tries to replicate the example in #7, please note that the data is now available here: http://personal.lse.ac.uk/tenreyro/mock.dta
Dewmi Bandara

Join Date: Oct 2016

Posts: 1
#10

29 Oct 2016, 02:45

Hello everyone,
I used the command xi:ppml dep V inde V i.country i.year for my analysis, but Stata reports "varlist required". Why is that?
Joao Santos Silva

Join Date: Apr 2014

Posts: 2779
#11

29 Oct 2016, 11:08

Dear Dewmi,

Please show us exactly what you typed.

Joao
Ann Ng

Join Date: Apr 2017

Posts: 25
#12

09 Apr 2017, 18:54

Hello Joao,
Thank you very much for your kind responses in this forum. I have learnt a lot from this topic http://www.statalist.org/forums/foru...pml-panel-data. But I still have some questions about my own data. I hope to receive an answer from you as well as from other Stata experts here.
I analyze the determinants of bilateral FDI between 40 countries over 11 years. FDI flows from A to B are different from those from B to A, so I have 40 x 39 = 1560 pairs. My data have a lot of zeros and thus I want to use the PPML estimator. Here is what I did:
* panel data:
xtset pair year
* ppml estimation
xi: ppml FDI indvars i.year i.pair, cluster(dist) // (1)
This command does not work because the calculation exceeds the matsize limit of my Stata IC version. But it works if I drop the i.pair dummies:
xi: ppml FDI indvars i.year, cluster(dist) // (2)
My questions are as follows:
1. Should I include the dummies for pairs of countries in the model? The dummy for AB is different from the one for BA.
2. If I want to include the dummies for pairs and my Stata does not work with (1), is it true that the following command does the same?
xtpoisson realstock $list2 gdpdif year*, fe // (3)
In this case, though, is the cluster variable pair rather than distance?
3. I am confused by the robust option. If I add 'robust' to command (3), all variables of interest become statistically insignificant. Meanwhile, with ppml, where robust standard errors are the default (is this why the robust option is not allowed with ppml?), most of my independent variables are significant, as I want (but of course this is ppml without dummies for pairs). Could you please give me some advice on whether or not to use the robust option here?
4. Among (1), (2) and (3), which one do you recommend for my estimation? I ran the RESET test for each; the p-value for (2) is 0.001, and for (3) with robust it is 0.07. What should I do in this situation?
Thank you very much for your time and I hope to hear from you soon!
Best regards,
Ann

Last edited by Ann Ng; 09 Apr 2017, 19:07.
Helen Makrin

Join Date: Apr 2017

Posts: 1
#13

10 Apr 2017, 07:04

Dear Joao,

How can I run the RESET test in a panel Poisson pseudo-maximum likelihood model? The estat ovtest command does not run.

Best regards
Joao Santos Silva

Join Date: Apr 2014

Posts: 2779
#14

10 Apr 2017, 12:14

Dear Helen Makrin,

I believe we describe how to do it on our webpage.

Best wishes,

Joao
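The thread does not spell out the procedure, so the following is only a sketch of one common way to run a RESET-type test by hand after a Poisson regression: add the squared fitted linear index as an extra regressor and test its significance. The variable names (x1, x2, y, fit, fit2) and the simulated data are assumptions, official -poisson- stands in for -ppml-, and the authors' webpage remains the authoritative description.

```stata
* Sketch on simulated data; x1, x2, y, fit and fit2 are hypothetical names
clear
set seed 2017
set obs 500
gen x1 = rnormal()
gen x2 = rnormal()
gen y = rpoisson(exp(0.2 + 0.5*x1 - 0.3*x2))

* Step 1: estimate the model and save the fitted linear index x'b
poisson y x1 x2, vce(robust)
predict double fit, xb
gen double fit2 = fit^2

* Step 2: re-estimate with the squared index and test its coefficient
poisson y x1 x2 fit2, vce(robust)
test fit2                       // H0: coefficient on fit2 is zero
```

A small p-value would suggest misspecification of the conditional mean. With fixed effects, the dummies (or an equivalent) would need to be part of the fitted index, which this sketch does not address.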
Joao Santos Silva

Join Date: Apr 2014

Posts: 2779
#15

10 Apr 2017, 12:27

Dear Ann Ng,

1 - Whether or not you need to include the pair fixed effects is a modeling question and depends on what you want to do, so only you can answer that question.

2 - Indeed, if you want to include the pair fixed effects you can use -xtpoisson- and cluster by pair.

3 - You always need to cluster by pair (or distance). As you say, by default -ppml- reports robust standard errors but these are not clustered by pair, so you need to explicitly use the clustering option.

4 - (1) and (3) should give you the same results; the choice between these and (2) is really up to you.

Best wishes,

Joao
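Points 2 and 3 in #15 can be illustrated on simulated data. This is a minimal sketch under assumed names (pair, year, x, a, y) and an assumed data-generating process, with official -poisson- standing in for -ppml-; it shows pair fixed effects estimated two ways, each with standard errors clustered by pair.

```stata
* Sketch on simulated data; pair, year, x, a and y are hypothetical names
clear
set seed 4242
set obs 60                        // 60 directed country pairs
gen pair = _n
gen a = rnormal()                 // unobserved pair effect
expand 11                         // 11 years per pair
bysort pair: gen year = 2000 + _n
gen x = rnormal() + 0.5*a
gen y = rpoisson(exp(0.3*x + a))
xtset pair year

* Pair fixed effects; vce(robust) here clusters on the panel variable (pair)
xtpoisson y x i.year, fe vce(robust)

* Same model via explicit pair dummies, with the clustering stated explicitly
poisson y x i.year i.pair, vce(cluster pair)
```

Following the syntax used in this thread, the -ppml- analogue of the second command would be xi: ppml y x i.year i.pair, cluster(pair), subject to the matsize limit mentioned in #12.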