  • Pooled OLS and Panel Data Analysis

    Hello everyone,

    My question is rather simple, but since I am very new to econometrics I am struggling a bit. I have a panel data set of 605 public companies across 20 industries over a period of 10 years (2008-2017). First, I am trying to run a pooled OLS regression, but I am not entirely sure how exactly I should write the command.
    My dependent variable is return on assets (ROA), my independent variable is customer-base concentration (BCR), and my moderating variable is customer type (CT), a binary indicator equal to 0 for a company and 1 for the government. I also have a few control variables. I am interested in the effect of BCR on ROA and the moderating effect of CT. I am trying to figure out which regress command is most appropriate given my data.
    This is what my data look like.
    Code:
    ID  Year  BCR        CT  ROA         I   NC  CA  FS         RG          MS         GDPG    GEG         HHI
    1   2008  .0686031   0   .09834008   34  3   36  17.631551  .08070548   .00072498  -.292   .08514918   .03974101
    1   2009  .04000135  0   .06717247   34  1   37  17.56051   -.16179479  .00070308  -2.776  .15207504   .03890317
    1   2010  .04409807  0   .05189488   34  1   38  17.719118  .06334225   .00076577  2.532   -.01793981  .04010741
    1   2011  .03999782  0   .05090363   34  1   39  17.826872  .13850918   .0007792   1.601   .04079933   .03907887
    1   2012  .04329927  0   .05232352   34  2   40  18.032486  .13118407   .00094437  2.224   -.01865988  .04638447
    1   2013  .04810036  0   .05879934   34  2   41  18.036179  .05812876   .00087609  .01677  -.02521739  .03570392
    1   2014  .0340006   0   .06038483   34  2   42  18.18885   .16456511   .00104287  .02569  .01597262   .03560585
    1   2015  .02879808  0   .05887916   34  2   43  18.215144  .02358576   .00109453  .02862  .04934924   .03175518
    1   2016  .03169894  0   .06355223   34  2   44  18.338016  .11849985   .00126561  .01485  .04282377   .03257272
    1   2017  .03770084  0   .0353177    34  2   45  18.558092  .04577556   .00126758  .02273  .03239578   .03572151
    2   2008  .3700219   1   -.05309908  36  2   30  17.237229  -.40683181  .00001877  -.292   .08514918   .00400398
    2   2009  .47557653  1   .07520448   36  2   31  17.262987  .31490943   .00002982  -2.776  .15207504   .00416665
    2   2010  .53841282  1   -.01896988  36  2   32  17.364898  -.07840795  .00002342  2.532   -.01793981  .00411381
    2   2011  .4385088   1   -.01548027  36  2   33  17.276454  -.07675075  .00002109  1.601   .04079933   .00457852
    2   2012  .44370862  1   .06080683   36  2   34  17.340694  .12590659   .00003511  2.224   -.01865988  .0107515
    2   2013  .47561341  1   .03356553   36  2   35  17.342547  -.02046405  .00002362  .01677  -.02521739  .005485
    2   2014  .46248789  1   .04388147   36  2   36  17.42605   .12747409   .00002721  .02569  .01597262   .00572547
    2   2015  .4212063   1   .0263885    36  2   37  17.490519  -.04202274  .00002602  .02862  .04934924   .0057234
    2   2016  .48851316  1   .06320515   36  2   38  17.566049  .41364004   .0000456   .01485  .04282377   .00633841
    2   2017  .49249657  1   -.08471765  36  2   39  17.572072  -.28668613  .00003354  .02273  .03239578   .00676919
    I think this is the proper Stata command, but given my lack of experience I am doubting myself.

    Code:
    regress ROA BCR Controls i.CT##c.BCR
    However, I have seen examples that also include year and industry dummies:

    Code:
    regress ROA BCR Controls i.I i.Year i.CT##c.BCR
    Which is the better command for a pooled OLS regression given my data? Also, is there a way to investigate the effect of BCR on ROA across industries (I), as moderated by CT? I assume that a subsample analysis is needed, and that the command might look something like this:

    Code:
    regress ROA BCR Controls i.I##c.BCR if CT==0 OR 1
    Furthermore, I would like to do a panel data analysis using fixed or random effects, but I am again unsure how to do it. I have run several Hausman tests, including different variables each time, and they indicate that I should use fixed effects. I have read numerous threads in the forum, but none matches my case. Given my limited knowledge and what I have found, I think the command for a panel regression with fixed effects is

    Code:
    xtreg ROA BCR Controls, fe
    However, I have seen many people include year dummies:

    Code:
    xtreg ROA BCR Controls i.Year, fe
    What is the difference between including the year dummies and leaving them out? And why do some people use lagged variables when performing panel data analysis?

    Thanks everyone in advance!
    Regards,
    Kristian
    Last edited by Kristian Tonev; 09 Dec 2018, 12:10.

  • #2
    Kristian:
    welcome to this forum.
    Some comments about your query:
    1) it's rare (but possible) that pooled OLS outperforms -xtreg- when it comes to panel data analysis. It happens when the F-test appearing as a footnote of the -xtreg, fe- outcome table fails to reach statistical significance. However, -xtreg- should be your first choice;
    2) -hausman- test can help you out in deciding which specification (-fe- or -re-) fits your data better. As an aside, trying many different models is not the way to go; try to give a fair and true view of the data generating process, instead (the literature in your research field can support you in this respect);
    3) including -i.timevar- is perfectly legal, in that it gives you an idea of the role played by time as a predictor. Under the -fe- specification, -i.timevar- will measure the contribution of time, when adjusted for the remaining predictors, to variation in the regressand within the same panel;
    4) if the lagged variable is the regressand, you're entering the really tricky world of dynamic panel models, which is not a field for beginners in econometrics;
    5) finally, just to get yourself familiar with the building blocks of this fascinating (but demanding) quantitative field, I would recommend that you study any decent panel data econometrics textbook.
    Usually, Stata users dealing with this stuff have https://www.stata.com/bookstore/micr...metrics-stata/ ready on their bookshelves.
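
    A minimal sketch of how points 1) to 3) might translate into commands, using the variable names from the data excerpt in #1. The choice of FS, RG, MS and HHI as controls is only an assumption made for illustration (the original post writes them as Controls); substitute your own list.

    Code:
    * assumed control list, for illustration only
    global controls FS RG MS HHI

    * declare the panel structure: firm identifier, then time variable
    xtset ID Year

    * fixed- and random-effects fits with the same regressors, stored for comparison
    xtreg ROA BCR $controls, fe
    estimates store fe
    xtreg ROA BCR $controls, re
    estimates store re

    * Hausman test: consistent estimator (fe) first, efficient one (re) second
    hausman fe re

    * fixed effects with year dummies and firm-clustered standard errors
    xtreg ROA BCR $controls i.Year, fe vce(cluster ID)

    * joint test of the year dummies
    testparm i.Year

    -hausman- is run here on fits with the default standard errors because it does not accept clustered ones. In the excerpt, GDPG and GEG take the same value for both firms within a year; if that holds for the whole sample, adding them alongside -i.Year- would lead Stata to omit terms because of collinearity.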
    Kind regards,
    Carlo
    (Stata 18.0 SE)
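
    On the moderation syntax asked about in #1, a hedged sketch follows. It assumes the same illustrative control list (FS, RG, MS, HHI), which is not taken from the original post. In factor-variable notation, -i.CT##c.BCR- already expands to both main effects plus the interaction, so BCR need not be listed a second time, and an -if- qualifier takes one logical condition per run (-if CT==0 OR 1- is not valid Stata).

    Code:
    * assumed control list, for illustration only
    global controls FS RG MS HHI

    * pooled OLS: main effects of CT and BCR, their interaction,
    * industry and year dummies, firm-clustered standard errors
    regress ROA i.CT##c.BCR $controls i.I i.Year, vce(cluster ID)

    * slope of BCR on ROA at each level of CT
    margins CT, dydx(BCR)

    * subsample analysis: industry-specific BCR slopes, one run per customer type
    regress ROA c.BCR##i.I $controls i.Year if CT == 0, vce(cluster ID)
    regress ROA c.BCR##i.I $controls i.Year if CT == 1, vce(cluster ID)

    Clustering on ID is a design choice rather than a requirement: with ten yearly observations per firm, the errors within a firm are unlikely to be independent, and pooled OLS otherwise ignores that.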
