  • Logit Fixed Effects and Random Effects Regression with multiple observations per time period (season) xtset

    Code:
    clear
    input int MatchID byte Result str8(Team Opponent) byte Home double(WINPCT OWINPCT) int season byte Matchday
       8 0 "HIFK"     "Ässät"  1 0 1 2005 2
      11 0 "JYP"      "Blues"    1 1 0 2005 2
       8 1 "Ässät"  "HIFK"     0 1 0 2005 2
      14 0 "Ilves"    "HPK"      1 0 1 2005 2
      14 1 "HPK"      "Ilves"    0 1 0 2005 2
      12 1 "Pelicans" "Jokerit"  1 0 1 2005 2
      11 1 "Blues"    "JYP"      0 0 1 2005 2
      12 0 "Jokerit"  "Pelicans" 0 1 0 2005 2
    end



    Hello all,

    I am currently working with seasonal ice hockey data and want to estimate fixed-effects and random-effects logit models. My dataset contains two observations per match (one for each team), and each team plays multiple times per season (the number of matchdays differs across seasons). Does anybody have a suggestion on how to use the xtset command correctly so as to account for the fact that game j in season k can be correlated with game j+1 in season k? Furthermore, I am unsure whether Team is the right panel variable, given the two observations per game.

    I would appreciate any help with this kind of problem. Thank you!

    With kind regards,

    Malte

    P.S.: In the dataset, Result = 1 if the team won the game and Home = 1 if the team played at home. I want to estimate, for example, Result ~ Home + WINPCT + OWINPCT, and therefore need two observations per match in order to account for the opponent's strength, etc.
    Last edited by Malte Bischof; 25 Feb 2022, 08:09.

  • #2
    Malte:
    an idea might be:
    Code:
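    * Convert the string identifiers to labeled numeric variables
    encode Team, generate(num_Team)
    encode Opponent, generate(num_Opponent)
    * Declare the match as the panel unit (no time variable)
    xtset MatchID
    * Conditional fixed-effects logit within matches
    xtlogit Result i.num_Team i.Home, fe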
    In addition, please note:
    1) no matter your -timevar-, Stata throws the error message -repeated time values within panel-. Hence, you should -xtset- your panel dataset with -panelid- only. However, this approach comes at the cost of making time-series operators unavailable;
    2) -xtlogit, fe- reports conditional fixed-effects estimates (used because of the incidental parameters problem; see
    http://www.econ.brown.edu/Faculty/Tony_Lancaster/papers/IncidentalParameters1948.pdf);
    3) I would take a look at a textbook or article on panel data econometrics applied to sports data, just to grasp what is done in that research field with this kind of data (obviously, this may well be unsolicited advice because, unlike me, you're perfectly aware of that).

    As an aside, please follow the FAQ and do not bump the same question (
    https://www.statalist.org/forums/forum/general-stata-discussion/general/1652139-what-is-the-correct-panelid-in-fixed-effects-logit-regression-when-having-two-observations-per-unit-game-data-in-sports-f-e).
    There are many reasons for a (perceived) belated reply and, as you know, nobody on this list is obliged to answer. Thanks.
    Last edited by Carlo Lazzaro; 27 Feb 2022, 02:41.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hey Carlo,

      thank you very much for your answer and sorry for posting a similar question twice.

      I followed your suggestions, but I think this will not solve my problem: I want to account for the possibility that game j in season k is correlated with game j+1.
      My next idea was to create a team-season identifier with egen team_seasonID = group(num_Team season). But then I run into problems when I also include i.opponent_seasonID, because Stata does not produce any results.
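
      A minimal sketch of the team-season setup described above, assuming num_Team comes from -encode- as in #2. Capturing opponent strength through OWINPCT instead of i.opponent_seasonID is only an illustrative choice here, not something settled in this thread:
      Code:
      * Assumes num_Team already exists (e.g. encode Team, generate(num_Team))
      egen team_seasonID = group(num_Team season)
      * Panel unit = one team in one season; no time variable is declared
      xtset team_seasonID
      * Conditional fixed effects at the team-season level
      xtlogit Result i.Home WINPCT OWINPCT, fe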

      So may I ask whether you have another suggestion on how to account for the clustered data structure on the one hand and for team-specific season fixed effects on the other? Sadly, I cannot find any econometrics books or papers that describe how to deal with such match data, where there is one observation for the team and another for the opponent.

      Any suggestions are appreciated and thank you very much in advance.

      Kind Regards,
      Malte



      • #4
        Malte:
        1) in regression, concerns about correlation among observations belonging to the same panel are addressed via -vce(cluster clusterid)-. However, for sound methodological reasons, -xtlogit, fe- does not support -vce(cluster clusterid)- standard errors; an alternative is bootstrapped standard errors;
        2) if -opponent- is perfectly collinear with -Team-, Stata omits that variable;
        3) maybe you can skip -Opponent- as a predictor and use -Home- instead.
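        A minimal sketch of the bootstrap alternative in 1), assuming the data are -xtset- by MatchID as in #2. The number of replications and the seed are arbitrary illustrative values, and OWINPCT is left out on the assumption that, within a match, it simply mirrors the other team's WINPCT (the collinearity issue in 2):
        Code:
        * Panel unit = match; bootstrap replications resample whole matches
        xtset MatchID
        * reps() and seed() values are illustrative, not recommendations
        xtlogit Result i.Home WINPCT, fe vce(bootstrap, reps(200) seed(12345))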
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          I am very grateful to you for the explanation.
