  • Cluster-robust standard errors are smaller than unclustered ones in FGLS with cluster fixed effects

    I'm currently working on some experimental data. The experimental design consists of two treatments. In each treatment, 20 subjects are randomly matched in pairs and participate in a simple game. The game is repeated for 20 periods. In each period, the pairs are randomly re-matched and a single decision is made.
    I estimate the effect of the treatment with a model that includes individual random effects, session dummies and some lagged variables (to control for dynamic session effects). When I estimate the cluster-robust covariance matrix, using -xtreg, re- with the option vce(cluster Session), the standard errors are smaller than the unclustered ones; when I exclude the session dummies, the cluster-robust standard errors become larger than the unclustered ones.
    I read the article on the comparison of the standard errors for robust, cluster, and standard estimators. I understand that there must be a cancellation of variation when the residuals are summed over clusters, but it's not clear to me why this happens when I include fixed effects for the clusters.
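
    For reference, the specifications being compared can be sketched as follows. This is only an illustration: the variable names (decision, treat, session, subject, period, feedback) are hypothetical and do not come from the original post.

    ```stata
    * Hypothetical variable names; adapt to the actual dataset
    xtset subject period

    * Random-effects GLS with session dummies, conventional standard errors
    xtreg decision treat i.session L.feedback, re

    * Same model, standard errors clustered by session
    xtreg decision treat i.session L.feedback, re vce(cluster session)

    * Session dummies excluded, standard errors clustered by session
    xtreg decision treat L.feedback, re vce(cluster session)
    ```

    Comparing the standard error on treat across the three runs reproduces the pattern described: smaller with clustering when the session dummies are in, larger when they are out.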

  • #2
    Perhaps I am misunderstanding your design, but I don't see how it can be analyzed using just -xtreg-, even with a cluster-robust vce. If you are rematching the subjects into new pairs in each of the 20 periods, then you need crossed random effects to account for that in your analysis and, as far as I know, -xtreg- can't handle that. I know this doesn't answer your question, but if my point is correct, an analysis with crossed random effects (presumably using -mixed-) might produce less surprising results.


    • #3
      Thank you for your reply.
      Unfortunately, I'm not familiar with crossed random effects and I'm not sure which levels you are referring to. In the experimental economics literature, I've never seen a crossed random effects model used to deal with the "stranger protocol". It would be very helpful if you could expand on your answer a little.


      • #4
        Not being an economist, I don't know what's in the economics literature, but what I'm thinking about is this. You are taking subjects and pairing them up in each round of the experiment to play a game. So the model will be something like this:

        outcome_{jkt} = constant + experimental variable(s) + covariate(s) + u_j + u_k + v_t + eps_{jkt}

        where outcome_{jkt} denotes the outcome of the game when subject j plays with subject k in round t. You have variables describing the experimental conditions and (perhaps) covariates that are attributes of the participants or environmental conditions, etc. But the error term has 4 components. There are effects specific to both players (u_j and u_k). It is these effects that are crossed, because each player will, over the 20 rounds, play the game with many different partners. You also have effects due to time (which you say are lagged, but no matter, I'm just describing a time effect v_t) and finally there is residual variance (eps_{jkt}).

        My point is that xtreg, as far as I know, cannot include two different player effects, both of which are necessary for a good specification of the model. -mixed- can.

        Putting it in very concrete Stata terms, from your design description, I'm inferring that each observation includes a variable designating who is player 1 and another variable designating who is player 2. If you -xtset- your data with either of those variables, you are incorrectly omitting the effect of the other. That is why I don't think you can use -xtreg- for this.

        Of course, it is possible I have misunderstood your design altogether and am off base here.
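
        A minimal sketch of how such a crossed specification might be written with -mixed-, assuming hypothetical variable names (outcome, treat, player1, player2, period). As a simplification, the time effect is entered here as fixed period indicators rather than as a random effect.

        ```stata
        * Hypothetical variable names; one observation per pair per period
        * R.player1 inside _all: gives a random intercept for each player 1,
        * crossed with a random intercept for each player 2
        mixed outcome treat i.period || _all: R.player1 || player2:
        ```

        By contrast, -xtset player1- followed by -xtreg- would carry only one of the two player effects, which is the limitation described above.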
        Last edited by Clyde Schechter; 17 Oct 2014, 15:03.


        • #5
          Cross-posted at

          Please note our policy on cross-posting, which is that you should tell us about it. Stack Exchange have a policy too.


          • #6
            First, I want to apologize to Statalist users for cross-posting without any notification. I now realize why it is bad practice.

            Clyde, thanks for your very detailed answer. Your suggestion is very appealing, but I think that in my case it is not necessary for a good specification of the model. Indeed, in each period, once they have been paired, subjects don't know anything about their partners and make their decisions simultaneously. They receive some feedback only after the decision is made. Of course this feedback can have some effect on the following period, but that's why I have lagged variables to control for this kind of session effect.
            Tell me if you see any flaws in my reasoning.


            • #7

              No, I think the (standard in my field) terminology "crossed random effects" is confusing you. What you describe in your response clearly demonstrates that there is no interaction between the different player effects. But it is still true, as I understand it, that random effects from both players contribute to the outcome of the game. It is the need to incorporate two different random effects in each observation that limits the applicability of -xtreg- to your design.

              Think about how you would program your analysis. Would you start with -xtset player1- or -xtset player2-? Either one would be an incomplete specification, but you cannot specify both.

              [The term "crossed random effects" does not refer to interaction: it refers to the co-occurrence of 2 (or more) random effects in the model that do not exhibit a nesting relationship. Well, it's not quite that, but for present purposes good enough.]


              • #8
                I think that the misunderstanding comes from the dependent variable. I haven't been very specific on the matter. My dependent variable is the individual decision of each subject.
                Basically, when paired, the subjects are asked to choose a number (each number has an associated cost: the higher the number, the higher the cost). After that, an individual random number (which can be positive or negative) is added to each subject's chosen number to form a score. The comparison of the two scores determines the rewards of the two subjects. The two treatments differ in the reward stage.
                I'm interested in how the treatment affects the choice of the number. Given the conditions under which this decision is taken, I don't think that the other player contributes to this choice.


                • #9

                  You are right. I had assumed that the outcome variable was some game result that depended on both players' actions. But given what you are doing, you have only a single player effect and can proceed. This, of course, does not answer your original questions--for which I don't have an answer.


                  • #10

                    Even though it didn't answer the original question, thanks for your feedback... I wasn't familiar with crossed random effects models and your comments led me to look into them.

                    Concerning my original question, I think I tracked down the source of the problem. What I observe with the standard errors is specific neither to my data nor to FGLS: I could replicate the problem with a fake panel and with standard OLS. I think that the source of the problem is my main independent variable, a dummy that takes the value 1 if the observation is in the main treatment and 0 if it is in the control group. The session dummies that I want to include in my model, to control for possible static session effects, are very highly correlated with the treatment dummy: each session belongs either to the main treatment or to the control treatment, so the treatment dummy is constant within each session.
                    Nevertheless, I am still not sure how exactly the inclusion of the session dummies reduces my standard errors from the cluster-robust covariance matrix, or why I don't observe anything odd in the parameter estimates.
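
                    The OLS replication described above can be sketched on simulated data. Everything here (number of sessions, variable names, coefficient values, seed) is a made-up assumption for illustration, not the poster's data. The last lines check one mechanical property of OLS that may be relevant: once a full set of session dummies is included, the residuals sum to zero within every session, so the per-cluster score of any regressor that is constant within a session (the treatment dummy, the session dummies themselves) is zero in the cluster-robust formula.

                    ```stata
                    * Simulated example; all names and values are hypothetical
                    clear
                    set seed 12345
                    set obs 8                          // 8 sessions
                    gen session = _n
                    gen treat   = session > 4          // sessions 5-8 are treated
                    gen u_s     = rnormal()            // session-level shock
                    expand 50                          // 50 observations per session
                    gen x = rnormal()
                    gen y = 1 + 0.5*treat + 0.3*x + u_s + rnormal()

                    * Without session dummies: clustered vs. conventional SEs
                    regress y treat x
                    regress y treat x, vce(cluster session)

                    * With session dummies (one is dropped for collinearity with treat)
                    regress y treat x i.session
                    regress y treat x i.session, vce(cluster session)

                    * Within-session sums of the OLS residuals from the last model
                    predict double e, residuals
                    bysort session: egen double esum = total(e)
                    summarize esum                     // zero up to rounding error
                    ```

                    If this is the mechanism at work, the clustered standard errors on the treatment and session coefficients are pushed down by construction rather than by the data, which would be consistent with the point estimates looking unremarkable.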


                    • #11

                      I am not sure whether the above discussion already answers my question.
                      I would like to run a panel regression on a list of cities (80). However, for various reasons, an FGLS procedure is recommended in the literature.
                      I have now spent some time trying to figure out how to introduce fixed effects in such a model. With xtreg I would just add fe at the end.
                      Is the only way to do this via dummy variables, or is there some kind of code I can use to tell Stata to do this?
                      Not sure if this question is clear enough.

                      Thanks in advance for the help,
                      Last edited by Steffen Heinig; 08 Feb 2016, 06:00.
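
                      One possible way to write the dummy-variable approach asked about above is with factor-variable notation, sketched here under several assumptions: the variable names (y, x, city, year) are hypothetical, the FGLS estimator is taken to be -xtgls-, the error structure chosen is only an example, and the Stata version supports factor variables (11 or later).

                      ```stata
                      * Hypothetical variable names; 80 cities observed over several years
                      xtset city year

                      * Within (fixed-effects) estimator, for comparison
                      xtreg y x, fe

                      * FGLS with city fixed effects entered as indicator variables
                      xtgls y x i.city, panels(heteroskedastic)
                      ```

                      The i.city prefix generates the city indicators on the fly, so no dummy variables need to be created by hand.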