Panel data analysis: how to analyse each variable separately

elena mengo

Join Date: Jun 2015

Posts: 30
#1


16 Jun 2015, 10:06

Hello,

I am doing a panel data analysis of 10 countries from 1990 to 2010. I posted a question last week about the troubling results I was getting.
My dissertation supervisor has suggested restricting the analysis to one country at a time, to see whether one or more countries in my sample are affecting my results. Which code should I use to do that? And how can I then obtain the graph with the results for each country?

I have already tried this code: . keep if cnt == 2
but when I then try to run the regression, Stata returns: no observations
r(2000)

Thank you in advance to those who will help

Elena

Attached Files

diss.dta (3.1 KB, 1 view)
Nick Cox

Join Date: Mar 2014

Posts: 35691
#2

16 Jun 2015, 10:26

Sorry, but "the graph with the results for each country" just isn't a clear specification. Please spell out exactly what kind of graph(s) you need.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#3

16 Jun 2015, 23:53

Elena:
as Nick pointed out, neither your explanation nor your research goal is clear to me.
Perhaps you may want to try something like:

Code:

bysort country: xtreg <depvar> <indepvars>, <specification>

Besides, I would use -if- instead of -keep-, because -keep- drops every other country from the data in memory:

Code:

xtreg <depvar> <indepvars> if cnt==2, <specification>
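
The one-country-at-a-time idea can also be run as a loop. This is a minimal sketch, assuming a numeric country identifier cnt and the variables co2, gdp, fdi and trade that appear later in this thread; plain -regress- is used because, with a single country, there is only one panel.

```stata
* Sketch: one regression per country (variable names are assumptions).
levelsof cnt, local(countries)
foreach c of local countries {
    display as text _newline "Country code: `c'"
    regress co2 fdi trade c.gdp##c.gdp if cnt == `c'
}
```

Because the data are restricted with -if- and not -keep-, all countries remain in memory for the next pass of the loop.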

Unfortunately, the attached .dta file seems empty; hence, I cannot reproduce the error you experienced.

Kind regards,
Carlo
(Stata 19.0)
elena mengo

Join Date: Jun 2015

Posts: 30
#4

17 Jun 2015, 05:00

Sorry, I will try to explain again, because I understand it was not clear. I have 10 countries, and for each of them I have data from 1990 to 2010. I would like to analyse each country independently of the others because, when I analyse them together, I do not obtain the results I expect, so I think I should check whether one or more countries in my sample are affecting my study. The aim of my study is to verify the relationship between co2 emissions and gdp, gdp squared, fdi and trade in those countries, and the gdp coefficient should be positive. However, when I run a regression including all the countries, the gdp coefficient is negative, so I have been told to analyse each country separately. I should obtain an inverted-U relationship, but it does not appear with the panel data analysis, whereas with a simple OLS regression I do get the coefficient signs I expect. How is that possible?

Attached Files

diss3.dta (10.0 KB, 1 view)

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#5

17 Jun 2015, 05:25

Elena:
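by imposing the condition -if cnt==2-, the results are (as expected) the same for both the panel data analysis (-xtreg, fe-) and OLS: with a single country there is only one group, so the fixed effect coincides with the constant.
Besides, neither the linear nor the squared term for gdp is statistically significant at the 5% level: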

Code:

. xtset cnt year
       panel variable:  cnt (strongly balanced)
        time variable:  year, 1990 to 2010
                delta:  1 unit

. 
. xtreg co2 fdi trade c.gdp##c.gdp if cnt==2, fe

Fixed-effects (within) regression               Number of obs      =        18
Group variable: cnt                             Number of groups   =         1

R-sq:  within  = 0.6713                         Obs per group: min =        18
       between =      .                                        avg =      18.0
       overall = 0.6713                                        max =        18

                                                F(4,13)            =      6.64
corr(u_i, Xb)  =      .                         Prob > F           =    0.0039

------------------------------------------------------------------------------
         co2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         fdi |  -.0947555   .0392718    -2.41   0.031    -.1795971   -.0099139
       trade |   .0439507   .0167048     2.63   0.021      .007862    .0800393
         gdp |  -.0002915   .0001622    -1.80   0.096     -.000642    .0000589
             |
 c.gdp#c.gdp |   3.41e-09   5.05e-09     0.67   0.512    -7.50e-09    1.43e-08
             |
       _cons |   10.41749   .9162462    11.37   0.000     8.438057    12.39692
-------------+----------------------------------------------------------------
     sigma_u |          .
     sigma_e |  .41770505
         rho |          .   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(0, 13) =        .               Prob > F =      .

. 
. reg co2 fdi trade c.gdp##c.gdp if cnt==2

      Source |       SS       df       MS              Number of obs =      18
-------------+------------------------------           F(  4,    13) =    6.64
       Model |  4.63255633     4  1.15813908           Prob > F      =  0.0039
    Residual |  2.26820764    13  .174477511           R-squared     =  0.6713
-------------+------------------------------           Adj R-squared =  0.5702
       Total |  6.90076397    17  .405927293           Root MSE      =  .41771

------------------------------------------------------------------------------
         co2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         fdi |  -.0947555   .0392718    -2.41   0.031    -.1795971   -.0099139
       trade |   .0439507   .0167048     2.63   0.021      .007862    .0800393
         gdp |  -.0002915   .0001622    -1.80   0.096     -.000642    .0000589
             |
 c.gdp#c.gdp |   3.41e-09   5.05e-09     0.67   0.512    -7.50e-09    1.43e-08
             |
       _cons |   10.41749   .9162462    11.37   0.000     8.438057    12.39692
------------------------------------------------------------------------------
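
Both outputs show a negative coefficient on gdp and a positive one on its square, which describes a U shape, not an inverted U (although the squared term is far from significant). A minimal sketch for estimating the implied turning point, -b1/(2*b2), together with a delta-method confidence interval, assuming it is run immediately after either regression above; the label turning_point is just an illustrative name:

```stata
* Sketch: turning point of the quadratic in gdp after the regression above.
nlcom (turning_point: -_b[gdp] / (2 * _b[c.gdp#c.gdp]))
```

With the rounded coefficients displayed above, the point estimate works out to roughly 42,700 in the units of gdp, but the imprecise squared term means the interval around it is likely to be very wide.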

Kind regards,
Carlo
(Stata 19.0)


elena mengo

Join Date: Jun 2015

Posts: 30
#6

17 Jun 2015, 06:38

Thanks Carlo,

I ran 10 separate regressions and found that only 4 out of 10 countries have the expected signs on the gdp and gdp squared coefficients, and none of them is statistically significant, which I do not know whether to treat as an issue. Also, my supervisor told me to take the log of all variables; how would this affect my results?

Elena
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#7

17 Jun 2015, 06:50

The analysis does not support the hypothesis, therefore the analysis must be wrong? That's not the best advice a supervisor could give.

I would advise you to start with a round of data exploration. You would see that CO2 emissions are pretty much falling for each of the countries over the period analysed, whilst GDP increases. Make a graph for each country showing both GDP and CO2 emissions, and show those to your supervisor. The relationship you're looking for simply is not there. This is not unexpected. Your selection of countries (mostly eastern Europe) has had a big push in energy efficiency in transport and power generation, as well as a move away from lignite as a fuel for power generation. The inverted-U shape you are looking for might show up with a longer period of data and/or a more varied selection of countries, but not here. Here, you are just seeing the downward-sloping part of the curve, i.e., past the peak of the inverted U.
It should be noted that this 'environmental Kuznets curve', in particular in reference to energy and/or CO2, is still very much a subject of debate.
As far as I'm concerned, you have not been able to identify such a pattern, and your conclusion should be that you could not find evidence supporting it.
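
A minimal sketch of such a per-country graph, assuming the variables co2, gdp, cnt and year used in the regressions earlier in the thread. GDP is placed on a second y-axis because the two series are on very different scales:

```stata
* Sketch: CO2 and GDP over time, one panel per country (names are assumptions).
sort cnt year
twoway (line co2 year, yaxis(1)) (line gdp year, yaxis(2)), by(cnt, yrescale)
```

The yrescale suboption lets each country's panel use its own y-axis range, which helps when levels differ widely across countries.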

-edit-
In case you would care to expand your dataset, there's a wealth of information you could use in the World Bank's 'World Development Indicators' data set: here

Last edited by Jorrit Gosens; 17 Jun 2015, 06:52.
elena mengo

Join Date: Jun 2015

Posts: 30
#8

17 Jun 2015, 06:58

Hi Jorrit,

Indeed, as you said, I believe the relationship I am looking for is simply not there, and all the reasons you gave above sound reasonable. I am also trying to make a graph for each country. I used the commands
xtset cnt_2 year
xtline co2 gdp, tlabel(#3)

but I am not sure this is the right code, because I obtain two different lines which do not really tell me much. What do you think is best?

Elena

Attached Files

Graph.gph (62.8 KB, 1 view)
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#9

17 Jun 2015, 06:59

Elena:
I suspect that the reported mismatch between theory and results is partially due to the limited sample size you are dealing with. The usual recommendation in this case is to try to collect more data (advice that is often difficult to put into practice) or to explain in the discussion of your results that the mismatch may be due to a limited sample size. This is also an indirect reply to one of your questions: I do not see statistical insignificance as an issue, as it may simply be due to your sample being too limited. I (and probably some thousands of people around the world) find confidence intervals more meaningful.

As far as log-log regression is concerned, quoting Stock JH, Watson MW. Introduction to Econometrics, 2nd edition. Pearson International Edition, 2007: page 273:

A 1% change in X is associated with a beta₁% change in Y, so beta₁ is the elasticity of Y with respect to X
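
A minimal sketch of the log-log version of the model, assuming the variable names used earlier in the thread; the ln_* names are hypothetical. Note that ln() returns missing for zero or negative values, which can matter for a variable such as fdi if it records net inflows, so it is worth counting how many observations are lost:

```stata
* Sketch only: the ln_* variable names are illustrative, not from the dataset.
foreach v of varlist co2 gdp fdi trade {
    generate ln_`v' = ln(`v')
}
* How many fdi values were lost because they were zero or negative?
count if missing(ln_fdi) & !missing(fdi)
xtset cnt year
xtreg ln_co2 ln_fdi ln_trade c.ln_gdp##c.ln_gdp, fe
```

With the squared log term included, the elasticity of co2 with respect to gdp is no longer a single coefficient: it is b1 + 2*b2*ln(gdp), so it varies with the level of gdp.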


Kind regards,
Carlo
(Stata 19.0)
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#10

17 Jun 2015, 07:38

You could make an index of your gdp and co2 variables (i.e., setting the 1990 value to 100).
Unfortunately, the 1990 values of gdp and/or co2 emissions are missing for a number of countries. You could update or replace them with data from the link I provided above; that has all the data you need.
Create the indexes as follows:

Code:

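* declare the panel structure
xtset cnt_2 year
* index each series to its first year within country (first year = 100);
* (year) guarantees that [1] refers to the earliest year
bysort cnt_2 (year): gen index_gdp = (gdp/gdp[1])*100
bysort cnt_2 (year): gen index_co2 = (co2/co2[1])*100
* plot both indexes, one panel per country
xtline index_co2 index_gdp, tlabel(#3)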
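
If the 1990 values cannot be filled in, one alternative is to index each country to its first year with both series observed. This is a sketch of that swapped-in approach, not the method given above; base_year, base_gdp, base_co2 and the index2_* names are hypothetical:

```stata
* Sketch: use the first year with non-missing gdp AND co2 as each country's base.
xtset cnt_2 year
egen base_year = min(cond(!missing(gdp, co2), year, .)), by(cnt_2)
egen base_gdp  = max(cond(year == base_year, gdp, .)), by(cnt_2)
egen base_co2  = max(cond(year == base_year, co2, .)), by(cnt_2)
generate index2_gdp = 100 * gdp / base_gdp
generate index2_co2 = 100 * co2 / base_co2
xtline index2_co2 index2_gdp, tlabel(#3)
```

The trade-off is that the base year then differs across countries, so the panels show each country's own trend but are not directly comparable in level.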

Last edited by Jorrit Gosens; 17 Jun 2015, 07:50.
elena mengo

Join Date: Jun 2015

Posts: 30
#11

17 Jun 2015, 10:49

Thanks for the code, Jorrit. The data I used are from the World Bank data set, and I made sure they correspond. I think that, as has been said above, my sample size is too small, so either I cover more than 20 years or I simply accept that, with the data I have, I cannot confirm my hypotheses. As a third solution, I could introduce a new variable, energy consumption per capita, and see whether there is a difference before and after those countries became part of the EU. This might be a solution.

Thanks everybody for your help
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#12

17 Jun 2015, 12:33

Elena:
introducing a new variable would only make things worse.
Please consider that the (probably too severe) rule of thumb in multiple (i.e., multivariable) regression is that there should be 20 observations for each predictor included on the right-hand side of the equation. Let's say that, in many cases, 10 observations per predictor may sound wise: if you have 18 observations per country and decide to go OLS with each country separately, your regression should consider no more than two predictors. Otherwise, you're asking too much of your data.
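
As a quick check of this rule of thumb, the ratio can be read off the stored results after any regression. A minimal sketch, assuming the single-country model shown earlier in the thread:

```stata
* Sketch: observations per predictor for the model estimated above.
regress co2 fdi trade c.gdp##c.gdp if cnt == 2
display "Observations per predictor: " e(N) / e(df_m)
```

For that model, with 18 observations and 4 predictors, the ratio is 18/4 = 4.5, well below either threshold.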

References:
Mitchell H. Katz. Multivariable analysis. 2nd edition. Cambridge University Press 2015: 80-81.

Kind regards,
Carlo
(Stata 19.0)