Three dimensional panel data regression

Alberto Rosingana

Join Date: Apr 2014

Posts: 3
#1

22 Apr 2014, 03:46

Good Morning everybody,

I'm pretty new to Stata.

As the title says, I have a dataset with country, industry sector and year dimensions (an i, j, t structure).

The dependent variable is foreign direct investment flows into 5 different countries and 23 sectors, for 5 years each.

Some of my independent variables, for example GDP growth, vary by country and year, while others, like the value added in each sector, vary by country, sector and year.

There must be some way to analyse such data with a fixed effects regression, but I am unaware of it. It seems that you can specify just one id variable, but I have two.

Can anyone help me?

Thank you in advance for your time and availability.

Alberto
Clarice Martins

Join Date: Apr 2014

Posts: 22
#2

22 Apr 2014, 05:18

Hello, Alberto!

I am pretty new to Stata too, so I can't help you with your question.

But I would like to offer a couple of suggestions for structuring your question better, so that the more experienced members can help you. Please consider:
- posting a sample of your data
- running -describe- and posting the results
- reading the forum FAQ - http://www.statalist.org/forums/help - it has lots of tips.

Take care,
Clarice
Alberto Rosingana

Join Date: Apr 2014

Posts: 3
#3

22 Apr 2014, 05:50

Thank you very much, Clarice.
I'll try to create a simple sample of the problem so that the question will be clearer.
Cheers
Alberto
FernandoRios

Join Date: Apr 2014

Posts: 2464
#4

22 Apr 2014, 09:57

Hi Alberto,
There are many options for modeling such data with panel fixed effects.
Unless your analysis places strict restrictions on how the industry and country fixed effects enter, one option is to "combine" the two and set up your panel as follows:

Code:

egen cn_ind = group(country industry)
sort cn_ind year
xtset cn_ind year

and then estimate your model with the standard commands (here, a model with time fixed effects and country/industry fixed effects):

Code:

xtreg y x i.year, fe

If, on the contrary, you don't want to combine the country and industry fixed effects, you can use:

Code:

xtset country
xtreg y x i.year i.industry, fe

or a user-written command such as -gpreg- (use findit gpreg):

Code:

gpreg y x i.year, i(country) j(industry)
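For anyone who wants to try the combined-identifier setup before touching real data, here is a minimal sketch on simulated data. The dimensions (5 countries, 23 sectors, 5 years) follow Alberto's description; the variable names fdi, gdp_growth and value_added, the years, and the coefficients are all made up for illustration.

```stata
* Simulated stand-in for the data described in #1 (all names hypothetical)
clear
set seed 12345
set obs 5
generate country = _n
expand 23
bysort country: generate industry = _n
expand 5
bysort country industry: generate year = 2008 + _n

* A regressor that varies only by country and year (like GDP growth)
bysort country year: generate gdp_growth = rnormal() if _n == 1
by country year: replace gdp_growth = gdp_growth[1]

* A regressor that varies by country, sector and year (like value added)
generate value_added = rnormal()
generate fdi = 0.5*gdp_growth + 0.3*value_added + rnormal()

* One panel per country-industry pair, then a standard FE regression
egen cn_ind = group(country industry)
xtset cn_ind year
xtreg fdi gdp_growth value_added i.year, fe
```

xtset should report a strongly balanced panel, and both regressors remain identified because each varies over time within a country-industry pair.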

HTH
Fernando
Alberto Rosingana

Join Date: Apr 2014

Posts: 3
#5

25 Apr 2014, 14:14

Thank you very much, Fernando! They seem to work properly!

Best Regards,

Alberto
Umar Farooq

Join Date: Apr 2014

Posts: 4
#6

28 Apr 2014, 03:20

This is useful information. However, I would like to know whether there is any data mining tool (such as stepwise regression) for fixed and random effects modeling in Stata.
Regards
Umar Farooq

Join Date: Apr 2014

Posts: 4
#7

28 Apr 2014, 03:22

Help.
When I ran
xtreg Dep Ind i.Time i.Sector, fe
I found that all the industry dummies were omitted due to collinearity. Could you please explain what that means?
Regards
FernandoRios

Join Date: Apr 2014

Posts: 2464
#8

28 Apr 2014, 07:48

Hi Umar,
There is not enough information in your post, so I'll have to guess at the details.
If sector (I'm guessing this is your industry variable) is being dropped, I suspect your panel identifier perfectly nests all the industry categories, so that sector never varies within a panel. For example, the panel ID is the industry code at 4 digits, but the variable sector is the industry code at 1 digit.
Regarding the other question, I'm not aware of general data mining tools for panel data. You would do better to start from a specific idea of what you want to do.
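To see the kind of omission described above, here is a small simulated sketch. The names firm, sector, time, x and y are hypothetical, and the setup is only my guess at Umar's data: each firm belongs to one sector for the whole sample, so sector is constant within the panel identifier.

```stata
* Hypothetical panel: 20 firms, 5 periods, sector fixed within each firm
clear
set seed 2014
set obs 20
generate firm = _n
generate sector = ceil(firm/5)      // 4 sectors, constant within firm
expand 5
bysort firm: generate time = _n
generate x = rnormal()
generate y = 1 + 0.5*x + sector + rnormal()

xtset firm time
* The firm fixed effects already span the sector dummies,
* so Stata reports the i.sector terms as omitted
xtreg y x i.time i.sector, fe
```

The sector effect is not lost; it is simply absorbed into the firm fixed effects and cannot be estimated separately from them.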
Best regards,
Fernando
Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#9

28 Apr 2014, 09:17

Hola Alberto, hola Fernando,

I have a question about how the grouping works. As I understand it, egen with group() effectively creates one panel for each combination of the two variables. The effect being accounted for is therefore the effect of an industry within a given country. I say this because you will then not be able to separate the country effects from the industry effects, and you are assuming that industry effects are country specific, which is not necessarily true. In reality you could have country-specific effects, industry-specific effects, and country-industry-specific effects. If my understanding is correct, your estimation only takes the country-industry effects into account. You can simply use OLS with the interaction of the two categorical variables, i.e.

Code:

regress respvar expvars i.country##i.industry

The only downside of this estimation is the loss of degrees of freedom from including all the binary variables, but with a large dataset that shouldn't be a problem. It also lets you test whether the country-specific and industry-specific effects are jointly insignificant, which is what you seem to be assuming to start with.
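As a rough sketch of what such tests could look like, here is a simulated example using testparm. The data, the names x and y, and the coefficients are hypothetical, and which of the two tests is the relevant one depends on the hypothesis you have in mind.

```stata
* Simulated 5 x 23 x 5 data (all names and values hypothetical)
clear
set seed 42
set obs 5
generate country = _n
expand 23
bysort country: generate industry = _n
expand 5
bysort country industry: generate year = _n
generate x = rnormal()
generate y = 0.3*x + 0.2*country + 0.1*industry + rnormal()

regress y x i.country##i.industry

* Joint test that all country-by-industry interaction terms are zero
testparm i.country#i.industry

* Joint test on the country and industry main-effect terms
testparm i.country i.industry
```

Keep in mind that with the interactions in the model, the main-effect terms are measured relative to the base levels of the other factor, so the second test should be read with care.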

Best,

Alfonso Sanchez-Penalver
FernandoRios

Join Date: Apr 2014

Posts: 2464
#10

28 Apr 2014, 16:13

Hi Alfonso,
You are absolutely right. That is why I suggested an alternative for when one doesn't want to "combine" the country and industry effects, and referred to the -gpreg- command.
In the end, it really depends on what Alberto is trying to infer from his results, and on the capacity of the computer he is using.
I wonder, however, whether one could test the same thing you suggest with a test not on the fixed effects themselves, but on the parameters of the remaining variables. The reason is that there are cases (employer-employee linked data, for example) where estimating the fixed effects directly is infeasible.
In those cases, my suggestion is to "absorb" the fixed effects from all other variables, run OLS on the transformed data, and then correct the sigmas for degrees of freedom.
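A minimal sketch of that idea for a single set of fixed effects, on simulated data: demean every variable within group, run OLS on the demeaned data, and rescale the standard error for the degrees of freedom used up by the group means. The names x, y and cn_ind and the data-generating values are assumptions for illustration; areg is included only as a cross-check.

```stata
* Simulated 5 x 23 x 5 data (all names and values hypothetical)
clear
set seed 7
set obs 5
generate country = _n
expand 23
bysort country: generate industry = _n
expand 5
bysort country industry: generate year = _n
generate x = rnormal()
generate y = 0.5*x + 0.2*country + 0.1*industry + rnormal()
egen cn_ind = group(country industry)

* "Absorb" the group means from every variable
foreach v of varlist y x {
    bysort cn_ind: egen double m_`v' = mean(`v')
    generate double w_`v' = `v' - m_`v'
}
regress w_y w_x, noconstant

* regress used N - 1 degrees of freedom; with G groups it should be N - G - 1
summarize cn_ind, meanonly
local G = r(max)
scalar se_adj = _se[w_x] * sqrt(e(df_r) / (e(N) - `G' - 1))
display "slope = " _b[w_x] "   adjusted se = " se_adj

* Cross-check: areg absorbs the same group effects in one step
areg y x, absorb(cn_ind)
```

The slope and adjusted standard error from the manual route should agree with what areg reports for x. With two high-dimensional sets of effects the demeaning has to be iterated, which is what commands like -gpreg- are designed for.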
Fernando
ershibuyou

Join Date: May 2014

Posts: 4
#11

16 Aug 2014, 11:27

Hi Alfonso,

Why don't you use the following code:
xtset country
xtreg respvar expvars i.year i.country##i.industry, fe r

I look forward to your reply!

Buyou

Last edited by ershibuyou; 16 Aug 2014, 11:32.
ershibuyou

Join Date: May 2014

Posts: 4
#12

16 Aug 2014, 11:29

Hi Alfonso,

Why don't you use the following code:
xtset country
xtreg respvar expvars i.year i.country##i.industry, fe

Many thanks!

Buyou
Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#13

21 Aug 2014, 05:08

The fixed effects estimator is equivalent to an OLS estimation where you include dummy variables for the categories of the group variable you set in xtset.

Thus your code

Code:

xtset country
xtreg respvar expvars i.year i.country##i.industry, fe

will drop the dummy variables for country because of collinearity. Your code and mine will give the same results, except that mine did not include the year fixed effects that yours does. But

Code:

reg respvar expvars i.year i.country##i.industry

will return the same slopes for all the other variables, and will also report the country dummies, which xtreg will not.
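If you want to check this equivalence yourself, here is a short sketch on simulated data. The names x and y and the data-generating values are hypothetical; the two estimation commands are the ones discussed above.

```stata
* Simulated 5 x 23 x 5 data (all names and values hypothetical)
clear
set seed 99
set obs 5
generate country = _n
expand 23
bysort country: generate industry = _n
expand 5
bysort country industry: generate year = _n
generate x = rnormal()
generate y = 0.4*x + 0.2*country + 0.1*industry + rnormal()

* OLS with explicit dummies: the country terms are reported
regress y x i.year i.country##i.industry
scalar b_ols = _b[x]

* Fixed effects on country: the country terms are omitted as collinear
xtset country
xtreg y x i.year i.country##i.industry, fe

* The slope on x should agree up to rounding error
display "difference in slopes = " b_ols - _b[x]
```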

Best,

Alfonso

Alfonso Sanchez-Penalver