
  • #16
    Luca:
    could you post an example (let's say 20 items) of ATP tournaments (variable_1) and related ATP points (variable_2)? Thanks.
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #17
      Originally posted by Carlo Lazzaro
      Luca:
      could you post an example (let's say 20 items) of ATP tournaments (variable_1) and related ATP points (variable_2)? Thanks.
      Sure Carlo.
      Code:
      Tournament    Series
      Qatar Exxon Mobil Open    ATP250
      Australian Open    Grand Slam
      ABN AMRO World Tennis Tournament    ATP500
      Monte Carlo Masters    Masters 1000
      Mutua Madrid Open    Masters 1000
      French Open    Grand Slam
      AEGON Championships    ATP250
      German Open Tennis Championships    ATP500
      Rogers Masters    Masters 1000
      Wimbledon    Grand Slam
      Swiss Indoors    ATP500
      Rio Open    ATP500
      Brisbane International    ATP250
      Open Sud de France    ATP250
      Internazionali BNL d'Italia    Masters 1000
      Shanghai Masters    Masters 1000
      US Open    Grand Slam
      St. Petersburg Open    ATP250
      Abierto Mexicano    ATP500
      BNP Paribas Masters    Masters 1000
      Croatia Open    ATP250



      • #18
        Luca:
        this is not the way a data example/excerpt should be reported.
        It needs some surgery to be in line with Stata's requirements.
        Using -dataex- would have avoided this issue.
        Please do not take others' availability for granted. Thanks.
        That said:
        Code:
        input str32 Tournament str12 Series
        Qatar_Exxon_Mobil_Open    ATP250
        Australian_Open    Grand_Slam
        ABN_AMRO_World_Tennis_Tournament    ATP500
        Monte_Carlo_Masters    Masters_1000
        Mutua_Madrid_Open    Masters_1000
        French_Open    Grand_Slam
        AEGON_Championships    ATP250
        German_Open_Tennis_Championships    ATP500
        Rogers_Masters    Masters_1000
        Wimbledon    Grand_Slam
        Swiss_Indoors    ATP500
        Rio_Open    ATP500
        Brisbane_International    ATP250
        Open_Sud_de_France    ATP250
        Internazionali_BNL_Italia    Masters_1000
        Shanghai_Masters    Masters_1000
        US_Open    Grand_Slam
        St_Petersburg_Open    ATP250
        Abierto_Mexicano    ATP500
        BNP_Paribas_Masters    Masters_1000
        Croatia_Open    ATP250
        end
        encode Series, gen(num_Series)
        Last edited by Carlo Lazzaro; 05 Jul 2022, 10:32.
        Kind regards,
        Carlo
        (Stata 18.0 SE)
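
        For readers unfamiliar with -dataex-, here is a minimal sketch of the workflow mentioned above. The four rows are a toy excerpt of the tournament list in this thread, and quoting the strings is one assumed way to keep the embedded spaces instead of replacing them with underscores. -dataex- ships with recent Stata releases (on older ones it can be installed from SSC).

        ```stata
        * Toy data: quoted strings keep their embedded spaces
        clear
        input str32 Tournament str12 Series
        "Qatar Exxon Mobil Open"           "ATP250"
        "Australian Open"                  "Grand Slam"
        "ABN AMRO World Tennis Tournament" "ATP500"
        "Monte Carlo Masters"              "Masters 1000"
        end

        * -dataex- prints an -input- block that readers can paste straight into Stata
        dataex Tournament Series
        ```

        With a real dataset in memory, something like -dataex Tournament Series in 1/20- would post the first 20 observations.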



        • #19
          Originally posted by Carlo Lazzaro
          Luca:
          this is not the way a data example/excerpt should be reported.
          It needs some surgery to be in line with Stata's requirements.
          Using -dataex- would have avoided this issue.
          Please do not take others' availability for granted. Thanks.
          That said:
          Code:
          input str32 Tournament str12 Series
          Qatar_Exxon_Mobil_Open ATP250
          Australian_Open Grand_Slam
          ABN_AMRO_World_Tennis_Tournament ATP500
          Monte_Carlo_Masters Masters_1000
          Mutua_Madrid_Open Masters_1000
          French_Open Grand_Slam
          AEGON_Championships ATP250
          German_Open_Tennis_Championships ATP500
          Rogers_Masters Masters_1000
          Wimbledon Grand_Slam
          Swiss_Indoors ATP500
          Rio_Open ATP500
          Brisbane_International ATP250
          Open_Sud_de_France ATP250
          Internazionali_BNL_Italia Masters_1000
          Shanghai_Masters Masters_1000
          US_Open Grand_Slam
          St_Petersburg_Open ATP250
          Abierto_Mexicano ATP500
          BNP_Paribas_Masters Masters_1000
          Croatia_Open ATP250
          end
          encode Series, gen(num_Series)
          Okay, but how do we continue from here? I already converted it to a numeric variable.



          • #20
            Luca:
            just plug -i.num_Series- (Stata variable names are case sensitive) as a predictor instead of -i.tournament- in the right-hand side of your -xtreg, fe- equation.
            Kind regards,
            Carlo
            (Stata 18.0 SE)
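
            A minimal sketch of what entering the series as a factor variable looks like. Everything here is simulated: the variable names (player, series, games), the sample size, and the 6-game Grand Slam effect are assumptions for illustration, not Luca's data.

            ```stata
            * Simulated panel: 20 hypothetical players, 20 matches each
            clear
            set seed 12345
            set obs 400
            gen player = ceil(_n/20)
            gen byte series = runiformint(1, 4)     // series drawn at random per match
            label define serieslbl 1 "ATP250" 2 "ATP500" 3 "Grand Slam" 4 "Masters 1000"
            label values series serieslbl
            gen games = 12 + 6*(series == 3) + rnormal(0, 4)   // simulated outcome

            * Player fixed effects, with series entered as a categorical predictor
            xtset player
            xtreg games i.series, fe vce(cluster player)
            ```

            Each reported series coefficient is a contrast against the omitted base level (ATP250 here), which is what keeps all four categories in a single model.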



            • #21
              Originally posted by Carlo Lazzaro
              Luca:
              just plug -i.num_Series- (Stata variable names are case sensitive) as a predictor instead of -i.tournament- in the right-hand side of your -xtreg, fe- equation.
              That didn't combine the tournaments with the ATP points. Why don't we instead split the sample into groups (ATP 250, ATP 500, Masters 1000, Grand Slam) and run the regression on each group?



              • #22
                Luca:
                with 4 different regressions you lose the big picture.
                I would rather go with:
                Code:
                xtset individualplayer
                xtreg GameswonbyIndividualPlayer IndividualHeterogeneity IndividualPrizeSpread1000 NumOfTourneys SpreadNumOfTourneys i.ATP_points i.Year, fe vce(cluster individualplayer)
                Kind regards,
                Carlo
                (Stata 18.0 SE)



                • #23
                  Originally posted by Carlo Lazzaro
                  Luca:
                  with 4 different regressions you lose the big picture.
                  I would rather go with:
                  Code:
                  xtset individualplayer
                  xtreg GameswonbyIndividualPlayer IndividualHeterogeneity IndividualPrizeSpread1000 NumOfTourneys SpreadNumOfTourneys i.ATP_points i.Year, fe vce(cluster individualplayer)
                  So basically this is what you are suggesting?
                  Code:
                  . encode Tournament, g(tournament)
                  
                  . encode Nationality, g(nationality)
                  
                  . encode IndividualPlayer, g(individualplayer)
                  
                  . encode Series, g(series)
                  
                  . xtset individualplayer
                  
                  Panel variable: individualplayer (unbalanced)
                  
                  . eststo: xtreg GameswonbyIndividualPlayer IndividualHeterogeneity IndividualPrizeSpread1000 NumOfTourneys SpreadNumOfTourneys i.series i.Year, fe vce(cluster individualplayer)
                  
                  Fixed-effects (within) regression               Number of obs     =     37,060
                  Group variable: individual~r                    Number of groups  =        654
                  
                  R-squared:                                      Obs per group:
                       Within  = 0.2600                                         min =          1
                       Between = 0.3750                                         avg =       56.7
                       Overall = 0.2665                                         max =        509
                  
                                                                  F(15,653)         =     482.02
                  corr(u_i, Xb) = 0.0457                          Prob > F          =     0.0000
                  
                                                    (Std. err. adjusted for 654 clusters in individualplayer)
                  -------------------------------------------------------------------------------------------
                                            |               Robust
                  GameswonbyIndividualPla~r | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                  --------------------------+----------------------------------------------------------------
                    IndividualHeterogeneity |  -.0050127   .0003349   -14.97   0.000    -.0056703    -.004355
                  IndividualPrizeSpread1000 |   .0036579   .0015446     2.37   0.018     .0006248     .006691
                              NumOfTourneys |   .0316546    .006998     4.52   0.000     .0179134    .0453958
                        SpreadNumOfTourneys |  -.0003286   .0001106    -2.97   0.003    -.0005459   -.0001114
                                            |
                                     series |
                                    ATP500  |  -.4207453   .0729571    -5.77   0.000    -.5640041   -.2774866
                                Grand Slam  |   6.082482   .0918675    66.21   0.000      5.90209    6.262873
                              Masters 1000  |  -.5345235   .0627142    -8.52   0.000    -.6576693   -.4113778
                                            |
                                       Year |
                                      2014  |  -.0860815   .1077025    -0.80   0.424    -.2975666    .1254035
                                      2015  |   .0195076   .1085205     0.18   0.857    -.1935836    .2325988
                                      2016  |  -.0850902   .1201285    -0.71   0.479     -.320975    .1507946
                                      2017  |  -.0801476    .119842    -0.67   0.504    -.3154699    .1551746
                                      2018  |   .0699905   .1249721     0.56   0.576    -.1754052    .3153861
                                      2019  |  -.0760769   .1261894    -0.60   0.547     -.323863    .1717091
                                      2020  |  -.0611773   .1394157    -0.44   0.661    -.3349345    .2125798
                                      2021  |   -.262855   .1327267    -1.98   0.048    -.5234777   -.0022323
                                            |
                                      _cons |   11.33306   .1430278    79.24   0.000     11.05221    11.61391
                  --------------------------+----------------------------------------------------------------
                                    sigma_u |  2.5736778
                                    sigma_e |  4.6032884
                                        rho |  .23814618   (fraction of variance due to u_i)
                  -------------------------------------------------------------------------------------------
                  (est1 stored)
                  
                  . //second regression
                  . xtset series
                  
                  Panel variable: series (unbalanced)
                  
                  . eststo: xtreg GameswonbyIndividualPlayer IndividualHeterogeneity IndividualPrizeSpread1000 i.Year, fe vce(cluster series) 
                  
                  Fixed-effects (within) regression               Number of obs     =     37,060
                  Group variable: series                          Number of groups  =          4
                  
                  R-squared:                                      Obs per group:
                       Within  = 0.0290                                         min =      4,922
                       Between = 0.8877                                         avg =    9,265.0
                       Overall = 0.0245                                         max =     14,284
                  
                                                                  F(3,3)            =          .
                  corr(u_i, Xb) = 0.0173                          Prob > F          =          .
                  
                                                                (Std. err. adjusted for 4 clusters in series)
                  -------------------------------------------------------------------------------------------
                                            |               Robust
                  GameswonbyIndividualPla~r | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                  --------------------------+----------------------------------------------------------------
                    IndividualHeterogeneity |  -.0064057   .0020956    -3.06   0.055    -.0130748    .0002634
                  IndividualPrizeSpread1000 |   .0008423   .0004683     1.80   0.170    -.0006482    .0023327
                                            |
                                       Year |
                                      2014  |   .0164121   .1327797     0.12   0.909    -.4061521    .4389764
                                      2015  |   .1969553   .1161253     1.70   0.188    -.1726073     .566518
                                      2016  |   .1140144     .10225     1.12   0.346    -.2113908    .4394197
                                      2017  |    .182168   .1280901     1.42   0.250    -.2254719    .5898079
                                      2018  |   .3684069   .0649014     5.68   0.011     .1618615    .5749522
                                      2019  |   .3215281   .1159868     2.77   0.069    -.0475936    .6906497
                                      2020  |   .2083712   .0691649     3.01   0.057    -.0117422    .4284847
                                      2021  |   .2084086   .1057499     1.97   0.143    -.1281348    .5449519
                                            |
                                      _cons |   12.78534   .0404254   316.27   0.000     12.65669      12.914
                  --------------------------+----------------------------------------------------------------
                                    sigma_u |  3.1630326
                                    sigma_e |  4.6670399
                                        rho |  .31475379   (fraction of variance due to u_i)
                  -------------------------------------------------------------------------------------------
                  (est2 stored)
                  
                  . //third regression
                  . eststo: xtreg Totalsumofgameswonpermatch Heterogeneity PrizeSpread1000 i.Year, fe vce(cluster series)
                  
                  Fixed-effects (within) regression               Number of obs     =     18,530
                  Group variable: series                          Number of groups  =          4
                  
                  R-squared:                                      Obs per group:
                       Within  = 0.0029                                         min =      2,461
                       Between = 0.0849                                         avg =    4,632.5
                       Overall = 0.0046                                         max =      7,142
                  
                                                                  F(3,3)            =          .
                  corr(u_i, Xb) = 0.0413                          Prob > F          =          .
                  
                                                      (Std. err. adjusted for 4 clusters in series)
                  ---------------------------------------------------------------------------------
                                  |               Robust
                  Totalsumofgam~h | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                  ----------------+----------------------------------------------------------------
                    Heterogeneity |  -.0029569   .0013907    -2.13   0.123    -.0073827    .0014689
                  PrizeSpread1000 |   .0014016   .0009422     1.49   0.234    -.0015969       .0044
                                  |
                             Year |
                            2014  |    .030901   .2696784     0.11   0.916     -.827336    .8891379
                            2015  |   .3956033   .2415625     1.64   0.200    -.3731562    1.164363
                            2016  |   .2637342    .217123     1.21   0.311    -.4272479    .9547164
                            2017  |   .4004196   .2314982     1.73   0.182    -.3363109     1.13715
                            2018  |    .743855   .1397255     5.32   0.013     .2991861    1.188524
                            2019  |   .6214896   .2267497     2.74   0.071    -.1001292    1.343108
                            2020  |   .4046571   .1363044     2.97   0.059    -.0291243    .8384385
                            2021  |   .4309702   .2170917     1.99   0.141    -.2599126    1.121853
                                  |
                            _cons |   25.76912   .0699527   368.38   0.000      25.5465    25.99174
                  ----------------+----------------------------------------------------------------
                          sigma_u |  6.3429781
                          sigma_e |  7.5497757
                              rho |  .41378529   (fraction of variance due to u_i)
                  ---------------------------------------------------------------------------------
                  (est3 stored)
                  
                  . esttab
                  
                  ------------------------------------------------------------
                                        (1)             (2)             (3)   
                               Gameswon~yer    Gameswon~yer    Totalsumof~h   
                  ------------------------------------------------------------
                  Individual~y     -0.00501***     -0.00641                   
                                   (-14.97)         (-3.06)                   
                  
                  Individ~1000      0.00366*       0.000842                   
                                     (2.37)          (1.80)                   
                  
                  NumOfTourn~s       0.0317***                                
                                     (4.52)                                   
                  
                  SpreadNumO~s    -0.000329**                                 
                                    (-2.97)                                   
                  
                  1.series                0                                   
                                        (.)                                   
                  
                  2.series           -0.421***                                
                                    (-5.77)                                   
                  
                  3.series            6.082***                                
                                    (66.21)                                   
                  
                  4.series           -0.535***                                
                                    (-8.52)                                   
                  
                  2013.Year               0               0               0   
                                        (.)             (.)             (.)   
                  
                  2014.Year         -0.0861          0.0164          0.0309   
                                    (-0.80)          (0.12)          (0.11)   
                  
                  2015.Year          0.0195           0.197           0.396   
                                     (0.18)          (1.70)          (1.64)   
                  
                  2016.Year         -0.0851           0.114           0.264   
                                    (-0.71)          (1.12)          (1.21)   
                  
                  2017.Year         -0.0801           0.182           0.400   
                                    (-0.67)          (1.42)          (1.73)   
                  
                  2018.Year          0.0700           0.368*          0.744*  
                                     (0.56)          (5.68)          (5.32)   
                  
                  2019.Year         -0.0761           0.322           0.621   
                                    (-0.60)          (2.77)          (2.74)   
                  
                  2020.Year         -0.0612           0.208           0.405   
                                    (-0.44)          (3.01)          (2.97)   
                  
                  2021.Year          -0.263*          0.208           0.431   
                                    (-1.98)          (1.97)          (1.99)   
                  
                  Heterogene~y                                     -0.00296   
                                                                    (-2.13)   
                  
                  PrizeSp~1000                                      0.00140   
                                                                     (1.49)   
                  
                  _cons               11.33***        12.79***        25.77***
                                    (79.24)        (316.27)        (368.38)   
                  ------------------------------------------------------------
                  N                   37060           37060           18530   
                  ------------------------------------------------------------
                  t statistics in parentheses
                  * p<0.05, ** p<0.01, *** p<0.001



                  • #24
                    Luca:
                    yes.
                    However, I would check whether the functional form of your regressand is correct via a procedure that replicates the one described in -linktest- (which does not work as a built-in command after -xtreg-).
                    Kind regards,
                    Carlo
                    (Stata 18.0 SE)
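
                    One way such a manual -linktest- replication could look, as a sketch on simulated data. The variable names (player, x, y, fitted, sq_fitted) are hypothetical, and the steps are an assumed reading of the procedure rather than a confirmed recipe: refit the model on the linear prediction and its square, then test the squared term.

                    ```stata
                    * Simulated panel in which y is truly linear in x
                    clear
                    set seed 2022
                    set obs 400
                    gen player = ceil(_n/20)
                    gen x = rnormal()
                    gen y = 1 + 0.5*x + 0.1*player + rnormal()

                    xtset player
                    xtreg y x, fe vce(cluster player)

                    * Linear prediction and its square, as -linktest- would build them
                    predict double fitted, xb
                    gen double sq_fitted = fitted^2

                    * Refit on the prediction and its square, then test the squared term
                    xtreg y fitted sq_fitted, fe vce(cluster player)
                    test sq_fitted
                    ```

                    A non-significant squared term gives no evidence of misspecification; a significant one suggests the functional form deserves another look.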



                    • #25
                      Originally posted by Carlo Lazzaro
                      Luca:
                      yes.
                      However, I would check whether the functional form of your regressand is correct via a procedure that replicates the one described in -linktest- (which does not work as a built-in command after -xtreg-).
                      Good afternoon Carlo,
                      When I add two variables to the model, Favourite (=1 if the player is the favourite in the match, 0 otherwise) and Bestof (=3 for a best-of-3 match, 5 for a best-of-5 match, since I believe the format affects how much effort a player is willing to exert), the models look much better specified (the R-sq improved massively). When I changed the panel variable to Year instead of individualplayer, the results improved further. When I used series as the panel variable, the results were poor (very low R-sq). Can I simply run the three models with tournament fixed effects and nationality fixed effects (the latter to check whether there are cultural differences)? I think the results in this case are pretty good, but do you think it is okay to run it like that?
                      Code:
                      encode Tournament, g(tournament)
                      encode Nationality, g(nationality)
                      encode IndividualPlayer, g(individualplayer)
                      encode Series, g(series)
                      xtset Year
                      xtreg GameswonbyIndividualPlayer IndividualHeterogeneity IndividualPrizeSpread1000 NumOfTourneys SpreadNumOfTourneys Favourite i.series i.nationality, fe vce(cluster Year)
                      //second regression
                      xtset Year
                      xtreg GameswonbyIndividualPlayer IndividualHeterogeneity IndividualPrizeSpread1000 Favourite Bestof i.nationality i.series, fe vce(cluster Year)
                      //third regression
                      xtreg Totalsumofgameswonpermatch Heterogeneity PrizeSpread1000 Bestof i.series, fe vce(cluster Year)
                      table () ( command ) (), ///
                          command(xtreg Totalsumofgameswonpermatch Heterogeneity PrizeSpread1000 Bestof i.series, fe vce(cluster Year)) ///
                          command(xtreg GameswonbyIndividualPlayer IndividualHeterogeneity IndividualPrizeSpread1000 Favourite Bestof i.nationality i.series, fe vce(cluster Year)) ///
                          command(xtreg GameswonbyIndividualPlayer IndividualHeterogeneity IndividualPrizeSpread1000 NumOfTourneys SpreadNumOfTourneys Favourite i.series i.nationality, fe vce(cluster Year))
                      collect label levels command 1 "Model 1" 2 "Model 2" 3 "Model 3", modify
                      If it's okay, I'll add another model: a logistic regression for the likelihood that the favourite player wins (reported as odds ratios). But I first want to make sure that the models in the code above are well specified.
                      Thanks.



                      • #26
                        Luca:
                        the main issue with your approach is that you cannot use -year- as a -panelid-.
                        And when you go -fe- the R-sq to look at is the within one.
                        Kind regards,
                        Carlo
                        (Stata 18.0 SE)
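
                        A small sketch of where the within R-sq can be read programmatically after -xtreg, fe-. The data are simulated and the variable names (player, x, games) are hypothetical; the stored results e(r2_w), e(r2_b) and e(r2_o) are the same three figures shown in the output header.

                        ```stata
                        * Simulated panel: 20 hypothetical players, 20 matches each
                        clear
                        set seed 2022
                        set obs 400
                        gen player = ceil(_n/20)
                        gen x = rnormal()
                        gen games = 12 + 0.5*x + 0.1*player + rnormal(0, 4)

                        xtset player
                        xtreg games x, fe

                        * The within R-sq is the relevant one for a fixed-effects model
                        display "within R-sq  = " %6.4f e(r2_w)
                        display "between R-sq = " %6.4f e(r2_b)
                        display "overall R-sq = " %6.4f e(r2_o)
                        ```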



                        • #27
                          Originally posted by Carlo Lazzaro
                          Luca:
                          the main issue with your approach is that you cannot use -year- as a -panelid-.
                          And when you go -fe- the R-sq to look at is the within one.
                          I don't know what to do. If I run it with series as -panelid-, I get a very low r-sq. If I run it with individualplayer as -panelid-, I get collinearity with i.nationality and I don't want to lose the country fixed effect (i.nationality).



                          • #28
                            Luca:
                            as nationality is time-invariant, the -fe- wipes it out as expected.
                            That said, I would stick with your second model, as it makes more sense than using -series- as a -panelid-.
                            Kind regards,
                            Carlo
                            (Stata 18.0 SE)



                            • #29
                              Originally posted by Carlo Lazzaro
                              Luca:
                              as nationality is time-invariant, the -fe- wipes it out as expected.
                              That said, I would stick with your second model, as it makes more sense than using -series- as a -panelid-.
                              1) I don't get it. When I run the model with i.nationality, the fe doesn't wipe it out and actually returns coefficients for each country.
                              2) You said that I can't use Year as -panelid-, so I can't really stick with it.



                              • #30
                                Luca:
                                1) if you get a coefficient from -i.nationality- under -fe-, some player changed her/his nationality;
                                2) using -timevar- as -panelid- is simply wrong, no matter the results.
                                Kind regards,
                                Carlo
                                (Stata 18.0 SE)
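
                                Both points can be checked on a toy example. This is a sketch on simulated data with hypothetical names (player, nat, x, y): the first step flags players whose nationality code varies across their observations, and the second shows that a regressor that is constant within player is dropped under -fe-.

                                ```stata
                                * Simulated panel: 30 hypothetical players, 10 matches each
                                clear
                                set seed 2022
                                set obs 300
                                gen player = ceil(_n/10)
                                gen byte nat = mod(player, 3) + 1   // nationality code, fixed within player
                                gen x = rnormal()
                                gen y = 1 + 0.5*x + nat + rnormal()

                                * Flag players whose nationality differs across their own observations
                                bysort player (nat): gen byte nat_changes = nat[1] != nat[_N]
                                count if nat_changes                // 0 here by construction

                                * With nat constant within player, -fe- omits the i.nat terms as collinear
                                xtset player
                                xtreg y x i.nat, fe
                                ```

                                On real data, a non-zero count from the first step would identify the players responsible for any -i.nationality- coefficients that survive the fixed effects.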
