  • Correcting for heteroskedasticity and Auto Correlation in an unbalanced panel

    Hello,
    I need STATA help. I am replicating an economics paper in order to graduate from an MS econometrics program. The paper examines how a baseball team's win percentage (depvar) is influenced by intra-team wage disparity. The independent variables are x1, total team salary; x2, SalaryHHI (Hershendal Index), which is the sum of each player's squared share of total team salary; and x3, time (year).
    My results are nowhere near the author's, and it is obviously not a simple scaling issue. The panel is unbalanced because MLB has expanded the number of teams over the years.

    The panel variable is teamcode which identifies the team.
    My regression is:
    xtreg winpct totalsal salhhi year, fe
    Can anyone shed light on "correcting for heteroskedasticity and auto-correlation" with unbalanced panel data in STATA?
    Thank you
    Last edited by Carter Forbes; 17 Oct 2017, 18:23.

  • #2
    Carter:
    welcome to the list.
    - Stata (not STATA, please) can handle both balanced and unbalanced panel datasets with no problem;
    - if you have a large N, small T panel dataset, and you suspect heteroskedasticity and/or autocorrelation (usually the latter does not bite in that type of panel dataset), you can simply robustify/cluster your standard errors on -panelid-;
    - as an aside, please read the FAQ on how to post more effectively: the details you gave are too sparse to allow more helpful replies;
    - finally, please read the FAQ about posts related to statistical assignments and decide whether it applies to your case.
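
    A minimal sketch of the clustering suggestion above, assuming the variables named in the original post (winpct, totalsal, salhhi, year, teamcode) are in memory and that teamcode is numeric (if it is a string, -encode- it first). Here -panelid- is teamcode:

    ```stata
    * Assumes winpct totalsal salhhi year and a numeric teamcode are in memory
    xtset teamcode year

    * Inspect the participation pattern of the unbalanced panel
    xtdescribe

    * Fixed effects with standard errors clustered on the panel identifier
    xtreg winpct totalsal salhhi year, fe vce(cluster teamcode)
    ```

    With -xtreg, fe-, vce(robust) gives the same standard errors as vce(cluster teamcode). Cluster-robust standard errors rely on a reasonably large number of clusters, so with only a few dozen teams they are best read with some caution.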
    Kind regards,
    Carlo
    (Stata 18.0 SE)


    • #3
      Here is a tiny contribution to your project.

      I'd say that "autocorrelation" is usually so written with no hyphen and certainly not as two words.

      Hersendahl is the wrong spelling. The name Herfindahl (sometimes Herfindahl-Hirschman) for an index based on sum of squared probabilities probably arose because some economists were unaware that the measure was one of several thought up earlier by Corrado Gini (and because Turing's yet further invention was classified (i.e. secret) until long after the Second World War). There are yet other earlier uses than Herfindahl. Still, whoever is reading your work will expect to see the name Herfindahl.
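
      On the data side, here is a minimal sketch of computing such an index as a sum of squared salary shares within each team and season. The player-level layout, the variable name salary, and the toy values are assumptions for illustration only; teamcode, year, totalsal, and salhhi follow the names used in the original post.

      ```stata
      * Toy player-level data: one row per player-season (values are made up)
      clear
      input str3 teamcode year salary
      "AAA" 2000 6
      "AAA" 2000 3
      "AAA" 2000 1
      "BBB" 2000 5
      "BBB" 2000 5
      end

      * Each player's share of total team salary in that season
      bysort teamcode year: egen double totalsal = total(salary)
      generate double share = salary / totalsal

      * Herfindahl index: sum of squared shares within each team-season
      bysort teamcode year: egen double salhhi = total(share^2)

      * Reduce to one row per team-season, ready for the panel regression
      collapse (first) totalsal salhhi, by(teamcode year)
      list
      ```

      In the toy data, team AAA has shares 0.6, 0.3, and 0.1, so its index is 0.36 + 0.09 + 0.01 = 0.46; team BBB has two equal shares, so its index is 0.5.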


      • #4
        As far as I can remember, the Herfindahl index is also used in marketing/competitor analysis (sum of squared market shares).
        Kind regards,
        Carlo
        (Stata 18.0 SE)


        • #5
          As I recall, Hirschman got mighty worked up about Herfindahl being cited, as his loosely similar idea was published first. But both were just reinventing ideas repeatedly invented (or, if you prefer, discovered) earlier.

          Clarity is in practice a big deal which is why people do, and arguably should, use names familiar within their own field. But it's axiomatic that whatever history of ideas you think you're echoing is usually wrong.


          • #6
            Thank you all for the help. I strongly suspect that the database I pulled the data from has been "corrected" over the years and this is likely why I do not get the same results. I've thrown everything I have at this seemingly simple project short of the unspeakable. Nick, referencing the spelling of autocorrelation, thanks for pointing that out. The word processors on these pages tempt the hyphen. I did ignorantly misspell Herfindahl. Sorry about that. Carlo, I will certainly read the guidelines and do not wish to be a problem.

            I noticed that the fixed effects results in the author's table had no constant. I then created team dummies and ran LSDV without a constant, and the results worsened. The author hasn't been able to figure out the differences either. The original paper was done in a statistical package called TSP. I highly doubt there are any significant deviations in panel methods from program to program. I get the same results in Stata, R, and Gauss. The methodology implied by the paper should give the same results, provided the data are being read into the software correctly. I am at a loss for solutions. Replication in the purest sense of the word is looking improbable. Thank you.
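
            For reference, a minimal sketch of the comparison described above, assuming the variables from the original post are in memory and that teamcode is numeric. Factor-variable notation stands in for hand-made team dummies here. The slope coefficients should agree across the within estimator and both LSDV variants; the R-squared of the no-constant version is computed differently (uncentered), so it is not comparable with the other two.

            ```stata
            * Assumes winpct totalsal salhhi year and a numeric teamcode are in memory
            xtset teamcode year

            * Within (fixed-effects) estimator
            xtreg winpct totalsal salhhi year, fe
            estimates store fe

            * LSDV with a constant: one team is the omitted base category
            regress winpct totalsal salhhi year i.teamcode
            estimates store lsdv

            * LSDV without a constant: a dummy for every team (no base category)
            regress winpct totalsal salhhi year ibn.teamcode, noconstant
            estimates store lsdv_nocons

            * Compare the slopes on totalsal, salhhi, and year side by side
            estimates table fe lsdv lsdv_nocons, keep(totalsal salhhi year) b(%9.5f) se(%9.5f)
            ```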


            • #7
              Carter:
              perhaps the authors did not report a constant because, in -xtreg-, it is less informative than in -regress-.
              That said, I would email the corresponding author, asking for clarification (that's what a corresponding author should be for, after all).
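
              A minimal sketch of why that constant carries little information, assuming the variables from the original post are in memory and that teamcode is numeric. The name u_hat is hypothetical:

              ```stata
              * Assumes winpct totalsal salhhi year and a numeric teamcode are in memory
              xtset teamcode year
              xtreg winpct totalsal salhhi year, fe

              * Estimated team effects u_i, one value per team
              predict double u_hat if e(sample), u

              * -xtreg, fe- constrains these to average zero over the estimation
              * sample, so _cons is just the average intercept across observations
              summarize u_hat
              display _b[_cons]
              ```

              The mean of u_hat should be zero up to rounding, which is the normalization that lets -xtreg, fe- report a constant at all.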
              Kind regards,
              Carlo
              (Stata 18.0 SE)
