  • Robust or Clustered Standard Errors

    Dear all,

    I am currently examining the impact of annual average sunset time on sleep duration of children in 4 developing countries.
    I have a panel data set over 3 years (2009, 2013 and 2016).
    The variable "sleep" denotes the hours per day allocated to sleep by child i in country c in studysite s at time t.
    The variable "annual average sunset" only varies at studysite level, so it denotes the average annual sunset time in studysite s in country c.

    I ran the following regressions:

    *OLS
    eststo m1: regress sleep annual_avg_sunset if in_model_3==1
    estadd local fe No
    estadd local fe_ No

    *OLS with control variables
    eststo m2: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year, vce(robust)
    estadd local fe Yes
    estadd local fe_ No

    *FE(country&year with robust SE)
    eststo m3: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)
    estadd local fe Yes
    estadd local fe_ Yes

    *FE(country&year SE clustered at the studysite year level)
    eststo m4: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(cluster studysite_year)
    estadd local fe Yes
    estadd local fe_ Yes

    [Attached image: stata outupt.png]

    Model (3) uses robust standard errors, Model (4) uses clustered standard errors at the studysite_year level. In Model (4) my coefficient on annual average sunset time becomes insignificant and I get a large standard error.

    Now I am wondering whether I should cluster the standard errors and, if so, at what level. Does it make sense to cluster at the studysite_year level in my example?

    Thank you,

    Barbara
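
One way to see why the standard error can grow so much once clustering is switched on is a small simulation. The sketch below is illustrative only: the data are simulated, the variable name studysite and all numbers (40 sites, 50 children per site, the size of the site-level shock) are assumptions, not taken from the original data. The regressor varies only across sites, as annual_avg_sunset does, and the outcome carries a shock shared by all children in a site.

```stata
* Simulated data: regressor varies only at the (hypothetical) studysite level
clear
set seed 12345
set obs 40                                      // 40 hypothetical study sites
generate studysite = _n
generate annual_avg_sunset = 18 + rnormal(0, 0.5)
generate site_shock = rnormal(0, 0.5)           // shock common to a site
expand 50                                       // 50 children per site
generate sleep = 9 - 0.2*annual_avg_sunset + site_shock + rnormal()

* Heteroskedasticity-robust SEs treat all 2,000 children as independent
regress sleep annual_avg_sunset, vce(robust)
scalar se_robust = _se[annual_avg_sunset]

* Cluster-robust SEs allow arbitrary correlation within a site
regress sleep annual_avg_sunset, vce(cluster studysite)
scalar se_cluster = _se[annual_avg_sunset]
scalar n_clusters = e(N_clust)

display "robust SE:  " se_robust
display "cluster SE: " se_cluster
display "clusters:   " n_clusters
```

In this setup the clustered standard error is typically several times the robust one, because the effective number of independent observations for a site-level regressor is closer to the number of sites than to the number of children. Whether study site (rather than studysite_year) is the right level in the real data, and whether there are enough clusters for the cluster-robust estimator to behave well, are judgment calls this sketch does not settle.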

  • #2
    Barbara:
    taking one step back: if you have panel data with a continuous regressand, why use -regress- as your first choice when -xtreg- is available?
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi Carlo,

      I did use xtreg when I wanted to include child fixed effects; for that I declared the data with xtset child_ID year. (I have panel data over 3 years with N = 19134 observations.)
      When I wanted to include only age or country fixed effects, I used the dummy-variable method, since I cannot declare the data with xtset country year. I tried, but Stata gave me the following error message: repeated time values within panel

      I am very new to Stata, so I thought pooled OLS with the dummy variable method might be the right thing to do if I want to include those fixed effects in my regressions.

      Best,
      Barbara



      • #4
        Barbara:
        provided that you do not have genuine duplicates (ie, mistaken data entry) and you do not plan to use time-series related commands, such as lags and leads, you can -xtset- your data with -panelid- only.
        Kind regards,
        Carlo
        (Stata 19.0)
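
A minimal sketch of the point in #4, on simulated data (the variables x and y, the placeholder years, and all sample sizes are hypothetical). It reproduces the "repeated time values within panel" error, then declares the panel identifier only and checks that the within estimator gives the same slope as the dummy-variable regression.

```stata
* Toy data: 4 hypothetical countries, many rows per country and year
clear
set seed 2024
set obs 4
generate country = _n
expand 300
generate year = 2009 + mod(_n, 3)       // placeholder years, repeated within country
generate x = rnormal()
generate y = 0.5*country + 0.3*x + rnormal()

* Panel id plus time variable fails: each country has many rows per year
capture noisily xtset country year
scalar rc_xtset = _rc

* Declaring the panel identifier only is accepted
xtset country

* Within (fixed-effects) estimator on country ...
xtreg y x, fe
scalar b_fe = _b[x]

* ... gives the same slope as the dummy-variable regression
regress y x i.country
scalar b_lsdv = _b[x]

display "xtset return code: " rc_xtset
display "FE slope: " b_fe "   LSDV slope: " b_lsdv
```

So for country fixed effects, -xtset country- followed by -xtreg, fe- and -regress- with i.country are two routes to the same coefficient; lags, leads and other time-series operators are unavailable without a time variable.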



        • #5
          Thank you for your reply, Carlo.
          I will do that.

          Can you tell me if the regressions (1)-(4) are OK and whether I should use robust standard errors or cluster them?

          Best,
          Barbara



          • #6
            Barbara:
            I'm not clear whether you measured the same sample of children during a three year timespan or not.
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Yes, I did.
              I have three observations for every child, one for 2009, one for 2012 and one for 2016.



              • #8
                Barbara:
                thanks for providing further details.
                Some comments about your query:
                - as you have an N>T panel dataset, -xtreg- should be your first choice;
                - if you detect heteroskedasticity and/or autocorrelation in your data, I would run the following code:
                Code:
                xtset children year
                xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)
                Under -xtreg-, -robust- imposes cluster-robust standard errors, clustered on -panelid- (which is usually the way to go).
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Thank you so much for your help, Carlo!

                  Actually, I do not really understand the difference between these two pieces of code:

                  (1) regress sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)

                  (2) xtset childid year
                  xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)


                  Could it be that with code (1), Stata doesn't treat the dataset as panel data? But both regressions include year- and country-fixed effects, right?

                  Best,
                  Barbara




                  • #10
                    Barbara:
                    your first code actually runs a pooled OLS, which is not the first choice when you have a panel dataset.
                    The code I suggested runs a linear panel data regression with random effects (the -xtreg- default) and cluster-robust standard errors.
                    I would recommend taking a look at the -xtreg- entry in the Stata .pdf manual and at this valuable textbook for Stata users dealing with econometrics: https://www.stata.com/bookstore/micr...metrics-stata/
                    Kind regards,
                    Carlo
                    (Stata 19.0)
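
To make the distinction in #10 concrete, here is a small sketch on simulated data (childid, wave, x, y and all numbers are hypothetical stand-ins, not the original variables). It fits pooled OLS, then -xtreg- with no estimator option, then -xtreg, re-, and confirms that the last two coincide.

```stata
* Simulated child panel: 300 children observed in 3 waves
clear
set seed 7
set obs 300
generate childid = _n
generate u = rnormal()                  // child-level unobserved effect
expand 3
bysort childid: generate wave = _n
generate x = rnormal()
generate y = 1 + 0.5*x + u + rnormal()
xtset childid wave

* Pooled OLS ignores the panel structure
regress y x
scalar b_pooled = _b[x]

* -xtreg- without an estimator option fits the random-effects model
xtreg y x
scalar b_default = _b[x]

* Same model, requested explicitly
xtreg y x, re
scalar b_re = _b[x]
scalar rho_re = e(rho)                  // share of variance due to the child effect

display "pooled: " b_pooled "  default: " b_default "  re: " b_re
display "rho: " rho_re
```

The random-effects fit also reports sigma_u and rho, which pooled OLS does not estimate; that extra structure is what "treating the data as a panel" amounts to here.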



                    • #11
                      May I ask why you chose a regression with random effects?

                      I just performed both, a fixed effects and a random effects regression.

                      xtset childid year
                      xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, re
                      xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, fe

                      According to the Hausman test, I should go with fixed effects.



                      • #12
                        Barbara:
                        just to give you an example.
                        Go -fe- if -hausman- points you to it.
                        I see that you used default standard errors in your code.
                        Hence, I assume that you did not detect heteroskedasticity and/or autocorrelation in your panel dataset.
                        However, if you want to compare -fe- vs -re- with non-default standard errors (you cannot run -hausman- with default standard errors and then invoke non-default standard errors thereafter), you can rely on the community-contributed command -xtoverid-. Being a bit old-fashioned, it does not allow -fvvarlist- notation; a feasible trick is to prefix your code with -xi:-.
                        Kind regards,
                        Carlo
                        (Stata 19.0)
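
The workflow described in #12 might look roughly like this. It is a sketch on simulated data (childid, wave, x, y are hypothetical), built so that the regressor is correlated with the child effect and -hausman- should favour -fe-. The -xtoverid- step runs only if that community-contributed command is already installed (for example via -ssc install xtoverid-); it may need further community-contributed packages, and its behaviour is not verified here.

```stata
* Simulated child panel where x is correlated with the child effect
clear
set seed 99
set obs 200
generate childid = _n
generate u = rnormal()
expand 3
bysort childid: generate wave = _n
generate x = u + rnormal()              // correlation with u makes -re- inconsistent
generate y = 1 + 0.5*x + u + rnormal()
xtset childid wave

* Classic Hausman test: both models with default standard errors
xtreg y x, fe
estimates store fe
xtreg y x, re
estimates store re
hausman fe re
scalar p_hausman = r(p)
display "Hausman p-value: " p_hausman

* With cluster-robust SEs, -hausman- is not applicable; -xtoverid- is one option.
* It is run after the -re- fit; -xi:- expands i.wave into plain dummies.
capture which xtoverid
if _rc == 0 {
    xi: xtreg y x i.wave, re vce(cluster childid)
    capture noisily xtoverid
}
```

One caveat when carrying this over to the regressions in #11: under -fe-, any regressor that is constant within a child across waves is omitted, so if annual_avg_sunset or country never changes for a given child, its coefficient cannot be estimated by the within estimator.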



                        • #13
                          Dear Carlo
                          Sincere apologies for reopening this thread and addressing the question to you directly, but I guess you may be able to clear up my doubts, which are based on some of your comments. In #8 of this thread you mentioned that,
                          if you detect heteroskedasticity and/or autocorrelation in your data, I would run the following code:
                          And the commands you gave are
                          Code:
                          xtset children year
                          xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)
                          Based on what I have learned from this forum, this code is equivalent to
                          Code:
                          xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(cluster children)
                          where children is the panelvar.
                          Now my doubt is: since the vce option gives robust standard errors, does it account for heteroskedasticity too?
                          In the link, https://www.statalist.org/forums/for...th-vce-cluster
                          @Carlo #4, you have mentioned
                          under -regress-, -vce(robust)- accounts for heteroskedasticity in residual distribution, whereas -vce(cluster)- accounts for residual autocorrelation.
                          Of course one is pooled OLS and the other is a panel regression, but still, doesn't the vce option account for both heteroskedasticity and autocorrelation?



                          • #14
                            Ial:
                            1) -regress-: if you detect both heteroskedasticity and autocorrelation, you should go -vce(cluster)-;
                            2) -xtreg-: if you detect heteroskedasticity and/or autocorrelation, you can go -robust- or -vce(cluster panelid)-.
                            Kind regards,
                            Carlo
                            (Stata 19.0)
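
The two cases in #14 can be checked directly. The sketch below uses simulated data (childid, wave, x, y are hypothetical names) and compares the standard error of the slope under -vce(robust)- and -vce(cluster childid)-, first for -xtreg- and then for -regress-.

```stata
* Simulated child panel: 200 children observed in 3 waves
clear
set seed 42
set obs 200
generate childid = _n
generate u = rnormal()                  // child-level unobserved effect
expand 3
bysort childid: generate wave = _n
generate x = rnormal()
generate y = 1 + 0.5*x + u + rnormal()
xtset childid wave

* -xtreg-: vce(robust) is treated as clustering on the panel variable
xtreg y x, re vce(robust)
scalar se_xt_robust = _se[x]
xtreg y x, re vce(cluster childid)
scalar se_xt_cluster = _se[x]

* -regress-: vce(robust) is heteroskedasticity-robust only
regress y x, vce(robust)
scalar se_ols_robust = _se[x]
regress y x, vce(cluster childid)
scalar se_ols_cluster = _se[x]

display "xtreg:   robust " se_xt_robust "   cluster " se_xt_cluster
display "regress: robust " se_ols_robust "   cluster " se_ols_cluster
```

Under -xtreg- the two standard errors coincide, while under -regress- they differ, since only the clustered version allows for correlation among a child's repeated observations.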



                            • #15
                              Ok Carlo, thanks for your answer
