  • Running regression including fixed effects and clustered standard errors

    Hello everyone,

    I have monthly panel data covering 14 years for 26 cities. Simply put, each monthly observation records the city, the date of observation (e.g. "2014m1"), the temperature (ab_temp), and the Google Search Volume (DSVI_city) for that city in that month.

    I want to regress DSVI_city on ab_temp. I am told to:

    1) Include year-month fixed effects in the regression
    2) Cluster standard errors by city and year-month

    I tried the following code:


    Code:
    . xtset city_id date
    
    Panel variable: city_id (unbalanced)
     Time variable: date, 2004m1 to 2017m12, but with gaps
             Delta: 1 month
    
    . xtreg DSVI_city ab_temp, fe vce(cluster city_id)
    
    Fixed-effects (within) regression               Number of obs     =      3,588
    Group variable: city_id                         Number of groups  =         26
    
    R-squared:                                      Obs per group:
         Within  = 0.0004                                         min =         42
         Between = 0.0061                                         avg =      138.0
         Overall = 0.0004                                         max =        168
    
                                                    F(1,25)           =       3.54
    corr(u_i, Xb) = 0.0010                          Prob > F          =     0.0716
    
                                   (Std. err. adjusted for 26 clusters in city_id)
    ------------------------------------------------------------------------------
                 |               Robust
       DSVI_city | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
         ab_temp |   .0038789   .0020617     1.88   0.072    -.0003673    .0081252
           _cons |  -.0020442   .0005277    -3.87   0.001    -.0031311   -.0009573
    -------------+----------------------------------------------------------------
         sigma_u |   .0168631
         sigma_e |  .59288721
             rho |  .00080831   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    So far, I think that I have only clustered the standard errors by city, not by city and year-month.
    My reasoning was that clustering by city and year-month would not make sense, as it would give me as many clusters as observations (each city has only one observation per year-month)...


    Given the information provided, could anyone be so kind as to give me some feedback on whether I implemented the code correctly?
    (I am still quite new to Stata and have already gone through many different posts and explanations, but I am not sure if I understood everything properly...)

    Thanks a lot in advance for your time and support; it is highly appreciated!

    Samuel







  • #2
    Samuel:
    some comments about your post:
    1) your code is correct, but your regression model needs more predictors (the within R-sq is alarmingly low; the F test tells you that your model is no more informative than the mean of the regressand).
    More substantively, you have a T>N panel dataset: hence -xtreg- is outperformed by -xtregar- or -xtgls-;
    2) the advice you received about also clustering your standard errors (SEs) on -i.month_year- sounds obscure and I would skip it altogether.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi Sam,

      To include year-month fixed effects you would need to include "i.date":

      Code:
       
       xtreg DSVI_city ab_temp i.date, fe vce(cluster city_id)
      I agree with you that it probably doesn't make sense to cluster by city and year-month, but I could be wrong on this.
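
      As a rough sketch: the -xtreg, fe- call above keeps the city fixed effects and adds the year-month dummies on top of them. If only year-month effects are wanted, one option with official commands (this is an assumption about what the advice intends, not something confirmed in this thread) would be:

      ```stata
      * year-month fixed effects only (no city effects), SEs clustered by city
      regress DSVI_city ab_temp i.date, vce(cluster city_id)
      ```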

      Best,
      Rhys



      • #4
        You have not included year/month fixed effects. You have included city fixed effects.

        There is also two-way clustering; see this thread:
        https://www.statalist.org/forums/for...ols-regression



        • #5
          I think the easiest thing for you to do is what posters #11 and #12 suggest in https://www.statalist.org/forums/for...ols-regression

          Something like

          Code:
           
           reghdfe DSVI_city ab_temp, absorb(city_id date)  cluster(city_id date)
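
          A minimal sketch of the setup this assumes: -reghdfe- is a community-contributed command, so it has to be installed once from SSC, together with -ftools-, which it depends on. The variable names are those from post #1; vce(cluster ...) is the long form of the cluster() option used above.

          ```stata
          * one-time installation of reghdfe and its dependency from SSC
          ssc install ftools, replace
          ssc install reghdfe, replace

          * city and year-month fixed effects, SEs clustered two-way by city and year-month
          reghdfe DSVI_city ab_temp, absorb(city_id date) vce(cluster city_id date)
          ```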



          • #6
            Dear Joro,

            Thanks a lot for the very helpful post, and sorry for my late response!

            Please allow me one short follow-up question:

            As I only have to include year-month fixed effects, I would use:

            Code:
            reghdfe DSVI_city ab_temp, absorb(date) cluster(city_id date)
            Am I correct?

            Best regards,

            Samuel



            • #7
              Yes, Samuel, this sounds correct.

              This includes date fixed effects and does two-way clustering by date and city.
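
              For reference, two-way clustering is not the same as clustering on city-by-month pairs. A sketch of the usual two-way cluster-robust variance formula (assuming -reghdfe- follows this standard construction) is:

              ```latex
              \hat{V}_{\text{two-way}}(\hat{\beta})
                = \hat{V}_{\text{city}}(\hat{\beta})
                + \hat{V}_{\text{date}}(\hat{\beta})
                - \hat{V}_{\text{city} \cap \text{date}}(\hat{\beta})
              ```

              Here the first two terms are the one-way cluster-robust variances by city and by year-month. With only one observation per city and month, every intersection cluster is a single observation, so the subtracted term is essentially the ordinary heteroskedasticity-robust variance.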



              • #8
                Perfect, thank you Joro!
