  • Panel data analysis: how to analyse each variable separately

    Hello,

    I am doing a panel data analysis of 10 different countries from 1990 to 2010. I posted a question last week about the trouble I was having with the results I got.
    My dissertation supervisor suggested limiting the analysis to each country in my sample in turn, to see whether one or more countries may be affecting my results. Which code should I use to do that? And how can I then obtain the graph with the results for each country?

    I already tried this code: . keep if cnt == 2
    but when I try to run the regression in Stata, this is what I get: no observations
    r(2000)

    Thank you in advance to those who will help

    Elena

    Attached Files

  • #2
    Sorry, but "the graph with the results for each country" just isn't a clear specification. Please spell out exactly what kind of graph(s) you need.


    • #3
      Elena:
      as Nick pointed out, your explanation is not clear to me, nor is your research goal.
      Perhaps you may want to try something like:
      Code:
      bysort country: xtreg <depvar> <indepvars>, <specification>

      Besides, I would use -if- instead of -keep-:

      Code:
      xtreg <depvar> <indepvars> if cnt==2, <specification>
      Unfortunately, the attached .dta file seems empty; hence, I cannot reproduce the error you experienced.
      Kind regards,
      Carlo
      (Stata 19.0)
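
The -if- approach above can be sketched as a loop over countries, which leaves the full dataset in memory between regressions. This is only a minimal sketch: the toy data are simulated and purely hypothetical, and the variable names (cnt, co2, gdp, fdi, trade) are assumed from the rest of the thread. Tabulating the identifier first shows whether the code being selected (here 2) occurs at all; a code that does not occur, or a subset with no complete rows, would leave the regression with no observations, although the thread never establishes the cause in this case.

```stata
* Toy panel standing in for the real data: 3 hypothetical countries, 1990-2010
clear
set seed 12345
set obs 63
generate cnt = ceil(_n/21)
bysort cnt: generate year = 1989 + _n
generate gdp   = 5000 + 300*(year - 1990) + rnormal(0, 200)
generate fdi   = runiform(0, 10)
generate trade = runiform(40, 120)
generate co2   = 10 - 0.0003*gdp + 0.04*trade - 0.09*fdi + rnormal(0, 0.4)

* Check which country codes exist before subsetting on one of them
tabulate cnt, missing
count if cnt == 2

* One OLS regression per country, without dropping any data
levelsof cnt, local(countries)
foreach c of local countries {
    display as text _newline "Country code: `c'"
    regress co2 fdi trade c.gdp##c.gdp if cnt == `c'
}
```

Because nothing is dropped, the pooled regression can still be run afterwards on the same data.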


      • #4
        Sorry, I will try to explain again because I understand it was not clear. I have 10 different countries, and for each of them I have data from 1990 to 2010. I would like to analyse each country independently of the others: when I analyse them together I do not obtain the results I expect, so I think I should check whether one or more countries in my sample are affecting my study. The aim of my study is to verify the relationship between CO2 emissions and GDP, GDP squared, FDI and trade in those countries, and the GDP coefficient should be positive. However, when I run a regression including all the countries the GDP coefficient is negative, so I have been told to analyse each country separately. I should obtain an inverted-U relationship, but with the panel data analysis this is not working, whereas with a simple OLS regression I do get the coefficient signs I expect. How is that possible?

        Attached Files


        • #5
          Elena:
          by imposing the condition -if cnt==2-, the results are (as expected) the same for both panel data analysis (-xtreg, fe-) and OLS.
          Besides, neither the linear nor the squared term for gdp is statistically significant:
          Code:
          . xtset cnt year
                 panel variable:  cnt (strongly balanced)
                  time variable:  year, 1990 to 2010
                          delta:  1 unit
          
          . 
          . xtreg co2 fdi trade c.gdp##c.gdp if cnt==2, fe
          
          Fixed-effects (within) regression               Number of obs      =        18
          Group variable: cnt                             Number of groups   =         1
          
          R-sq:  within  = 0.6713                         Obs per group: min =        18
                 between =      .                                        avg =      18.0
                 overall = 0.6713                                        max =        18
          
                                                          F(4,13)            =      6.64
          corr(u_i, Xb)  =      .                         Prob > F           =    0.0039
          
          ------------------------------------------------------------------------------
                   co2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   fdi |  -.0947555   .0392718    -2.41   0.031    -.1795971   -.0099139
                 trade |   .0439507   .0167048     2.63   0.021      .007862    .0800393
                   gdp |  -.0002915   .0001622    -1.80   0.096     -.000642    .0000589
                       |
           c.gdp#c.gdp |   3.41e-09   5.05e-09     0.67   0.512    -7.50e-09    1.43e-08
                       |
                 _cons |   10.41749   .9162462    11.37   0.000     8.438057    12.39692
          -------------+----------------------------------------------------------------
               sigma_u |          .
               sigma_e |  .41770505
                   rho |          .   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0:     F(0, 13) =        .               Prob > F =      .
          
          . 
          . reg co2 fdi trade c.gdp##c.gdp if cnt==2
          
                Source |       SS       df       MS              Number of obs =      18
          -------------+------------------------------           F(  4,    13) =    6.64
                 Model |  4.63255633     4  1.15813908           Prob > F      =  0.0039
              Residual |  2.26820764    13  .174477511           R-squared     =  0.6713
          -------------+------------------------------           Adj R-squared =  0.5702
                 Total |  6.90076397    17  .405927293           Root MSE      =  .41771
          
          ------------------------------------------------------------------------------
                   co2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   fdi |  -.0947555   .0392718    -2.41   0.031    -.1795971   -.0099139
                 trade |   .0439507   .0167048     2.63   0.021      .007862    .0800393
                   gdp |  -.0002915   .0001622    -1.80   0.096     -.000642    .0000589
                       |
           c.gdp#c.gdp |   3.41e-09   5.05e-09     0.67   0.512    -7.50e-09    1.43e-08
                       |
                 _cons |   10.41749   .9162462    11.37   0.000     8.438057    12.39692
          ------------------------------------------------------------------------------
          Kind regards,
          Carlo
          (Stata 19.0)
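
The equality of the two outputs above follows from having a single panel: with one group, the within transformation subtracts just one set of means, which the OLS constant already absorbs. A minimal sketch of the same check on simulated (hypothetical) data, with variable names assumed from the thread and a single regressor for brevity:

```stata
* Toy panel: 2 hypothetical countries, 1990-2010
clear
set seed 2015
set obs 42
generate cnt = ceil(_n/21)
bysort cnt: generate year = 1989 + _n
generate gdp = 5000 + 300*(year - 1990) + rnormal(0, 200)
generate co2 = 10 - 0.0003*gdp + rnormal(0, 0.4)
xtset cnt year

* Fixed effects on one country only, then plain OLS on the same rows
xtreg co2 gdp if cnt == 2, fe
scalar b_fe = _b[gdp]
regress co2 gdp if cnt == 2
scalar b_ols = _b[gdp]
display "FE slope: " b_fe "   OLS slope: " b_ols
```

The two slopes should agree up to numerical precision; they would differ only once more than one country enters the estimation sample.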


          • #6
            Thanks Carlo,

            I ran 10 different regressions and found that only 4 out of 10 countries have the expected signs on the gdp and gdp squared coefficients, and none of them is statistically significant; I do not know whether that might be an issue. Also, my supervisor told me to take the log of all variables; how would this affect my results?

            Elena


            • #7
              The analysis does not support the hypothesis, therefore the analysis must be wrong? That's not the best advice a supervisor could give.

              I would advise you to start with a round of data exploration. You would see that CO2 emissions are pretty much falling for each of the countries over the period analyzed, whilst GDP increases. Make a graph for each country with both GDP and CO2 emissions, and show those to your supervisor. The relationship you're looking for simply is not there. This is not unexpected. The countries you selected (mostly eastern Europe) have had a big push in energy efficiency in transport and power generation, as well as a move away from lignite as a fuel for power generation. The inverted U shape you are looking for might show up with a longer period of data and/or a more varied selection of countries, but not here. Here, you are just seeing the downward-sloping part of the curve, i.e., past the peak of the inverted U.
              It should be noted that this 'environmental Kuznets curve', in particular in reference to energy and/or CO2, is still very much a subject of debate.
              As far as I'm concerned, you've not been able to identify such a pattern, and your conclusion should be you could not find evidence supporting it.


              -edit-
              In case you would care to expand your dataset, there's a wealth of info you could use in the World Bank's 'World Development Indicators' data set: here

              Last edited by Jorrit Gosens; 17 Jun 2015, 06:52.


              • #8
                Hi Jorrit,

                Indeed, as you said, I believe the relationship I am looking for is simply not there, and all the reasons you gave above sound reasonable. I am also trying to make a graph for each country. I used the commands
                xtset cnt_2 year
                xtline co2 gdp, tlabel(#3)

                but I am not sure this is the right code, because I obtain two different lines which do not really tell me much. What do you think is best?

                Elena

                Attached Files


                • #9
                  Elena:
                  I suspect that the reported mismatch between theory and results is partially due to the limited sample size you are dealing with. The usual recommendation in this case is to try to collect more data (advice that is often difficult to put into practice) or to explain in the discussion of your results that the mismatch may be due to a limited sample size. This is also an indirect reply to one of your questions: I do not see statistical insignificance as an issue, as it may simply be due to your sample being too limited. I (and probably some thousands of people around the world) find confidence intervals more meaningful.

                  As far as log-log regression is concerned, quoting Stock JH, Watson MW. Introduction to Econometrics, 2nd edition. Pearson International Edition, 2007: page 273:
                  "A 1% change in X is associated with a beta1% change in Y, so beta1 is the elasticity of Y with respect to X."

                  Kind regards,
                  Carlo
                  (Stata 19.0)
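
The elasticity reading of a log-log regression can be illustrated on simulated data. This is a sketch under stated assumptions, not the thread's data: x and y are hypothetical variables generated with a true elasticity of 0.5, so the estimated slope on ln_x should come out close to that value. The last line is a reminder that ln() yields a missing value for zero or negative inputs, so any such observations would silently drop out of a logged regression.

```stata
* Simulated data with a true elasticity of 0.5 (an assumption for illustration)
clear
set seed 273
set obs 200
generate x = exp(rnormal(8, 0.5))
generate y = exp(1 + 0.5*ln(x) + rnormal(0, 0.1))

* Log-log regression: the slope is the elasticity of y with respect to x
generate ln_x = ln(x)
generate ln_y = ln(y)
regress ln_y ln_x
display "Estimated elasticity of y with respect to x: " _b[ln_x]

* ln() of a non-positive number is missing in Stata
display ln(-1)
```

Before logging every variable it is therefore worth checking each one for zero or negative values, for instance with -summarize-.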


                  • #10
                    You could make an index of your gdp and co2 variables (i.e., setting the 1990 value to 100).
                    Unfortunately, you have missing values for a number of countries for the 1990 values of gdp and/or co2 emissions. You could update or replace them with data from the link I provided above; that has all the data you need.
                    Create the indexes as follows:
                    Code:
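                    xtset cnt_2 year
                    * sort by year within each country so that [1] is the first year (1990)
                    bysort cnt_2 (year): gen index_gdp = (gdp/gdp[1])*100
                    bysort cnt_2 (year): gen index_co2 = (co2/co2[1])*100
                    * plot both indexes over time, one panel per country
                    xtline index_co2 index_gdp, tlabel(#3)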
                    Last edited by Jorrit Gosens; 17 Jun 2015, 07:50.
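
Where the 1990 value is missing and cannot be filled in, one possible workaround (not proposed in the thread, so treat it as an assumption) is to index each country on its first year with a non-missing value. A minimal sketch on simulated data, with a hypothetical gap in 1990 for country 2; base_year and base_gdp are illustrative names:

```stata
* Toy panel: 2 hypothetical countries, 1990-2010
clear
set seed 42
set obs 42
generate cnt_2 = ceil(_n/21)
bysort cnt_2: generate year = 1989 + _n
generate gdp = 5000 + 300*(year - 1990) + rnormal(0, 200)
* hypothetical gap: country 2 has no GDP figure for 1990
replace gdp = . if cnt_2 == 2 & year == 1990

* first year with a non-missing GDP value, per country
bysort cnt_2: egen base_year = min(cond(!missing(gdp), year, .))
* GDP in that base year, copied to every row of the country
bysort cnt_2: egen base_gdp = max(cond(year == base_year, gdp, .))
generate index_gdp = 100 * gdp / base_gdp
list cnt_2 year gdp index_gdp if year <= 1991, sepby(cnt_2)
```

Countries indexed on different base years are not strictly comparable in level, so the base year used for each country would need to be reported alongside the graph.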


                    • #11
                      Thanks for the code, Jorrit. The data I used are from the World Bank data set and I made sure they correspond. I think that, as has been said above, my sample size is too small, so either I take into account more than 20 years or I simply accept that with the data I have I cannot confirm my hypotheses. A third option: I could introduce a new variable, energy consumption per capita, and see whether there is a difference before and after those countries became part of the EU... this might be a solution.

                      Thanks everybody for your help


                      • #12
                        Elena:
                        introducing a new variable would only make things worse.
                        Please consider that the (probably too severe) rule of thumb in multiple (i.e., multivariable) regression is that there should be 20 observations for each predictor included in the right hand side of the equation. Let's say that, in many cases, 10 observations per predictor may sound wise: if you have 18 observations per country and decide to go OLS with each country separately, your regression should consider no more than two predictors. Otherwise, you're asking too much out of your data.

                        References:
                        Mitchell H. Katz. Multivariable analysis. 2nd edition. Cambridge University Press 2015: 80-81.
                        Kind regards,
                        Carlo
                        (Stata 19.0)
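
The rule-of-thumb arithmetic can be read off the stored results after any regression. A minimal sketch on simulated (hypothetical) data with 18 observations and the four predictors used earlier in the thread (fdi, trade, gdp and gdp squared): it reports 4.5 observations per predictor, and 18/10 = 1.8 predictors at ten observations each, which the post above rounds to two.

```stata
* Toy single-country sample: 18 hypothetical yearly observations
clear
set seed 18
set obs 18
generate year  = 1992 + _n
generate gdp   = 5000 + 300*(year - 1990) + rnormal(0, 200)
generate fdi   = runiform(0, 10)
generate trade = runiform(40, 120)
generate co2   = 10 - 0.0003*gdp + 0.04*trade - 0.09*fdi + rnormal(0, 0.4)

regress co2 fdi trade c.gdp##c.gdp
* e(N) is the sample size, e(df_m) the number of estimated slopes
display "Observations per predictor: " e(N)/e(df_m)
display "Predictors supported at 10 observations each: " e(N)/10
```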
