I am trying to illustrate the famous Moulton problem of variance inflation that arises when clustering is ignored. I created a toy data set with 4 states of 20 observations each, where each cluster contains 2 types of observations:
I calculated the within-cluster error correlation using:
From the formula on pp. 5-6 of http://cameron.econ.ucdavis.edu/rese...5_February.pdf, this implies a variance inflation factor of 1 + 1*0.415518*(20-1) = 8.894842 for the slope, so the default standard errors should be too small by a factor of (8.894842)^.5 = 2.9824222; equivalently, the cluster-robust standard errors should be about 2.98 times the default ones.
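The arithmetic behind that inflation factor can be checked quickly. A minimal sketch in Python, taking rho_e = 0.415518 from the `xtreg, fe` output, m = 20 as the common cluster size, and rho_x = 1 because x is constant within state:

```python
import math

# Moulton inflation factor for the OLS slope variance:
#   tau = 1 + rho_x * rho_e * (m - 1)
# where m is the common cluster size, rho_e the intraclass error
# correlation, and rho_x the within-cluster correlation of x
# (here 1, since x is constant within each state).
rho_x = 1.0
rho_e = 0.415518   # rho reported by xtreg ... fe
m = 20             # observations per state

tau = 1 + rho_x * rho_e * (m - 1)
print(tau)              # variance inflation factor, 8.894842
print(math.sqrt(tau))   # implied ratio of standard errors, ~2.98
```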
However, that does not seem to be the case when I compare the output:
For the curious, my code is:
What am I doing wrong here?
Code:
+-------------------------------+
|    y   x      e   state    n  |
|-------------------------------|
| 5.35   0   -.65       0   10  |
| 7.35   0   1.35       0   10  |
|-------------------------------|
|    3   1     -2       1   10  |
|    5   1      0       1   10  |
|-------------------------------|
|    4   2      0       2   10  |
|    6   2      2       2   10  |
|-------------------------------|
| 1.65   3  -1.35       3   10  |
| 3.65   3    .65       3   10  |
+-------------------------------+
Code:
xtreg e [fw=n], i(state) fe
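The rho that `xtreg, fe` reports can be reproduced by hand. A sketch in Python, under the assumption that rho = sigma_u^2 / (sigma_u^2 + sigma_e^2), with sigma_u^2 the sample variance of the estimated state effects and sigma_e^2 the within sum of squares over N - G:

```python
# e values per state, as listed in the data; each row carries
# frequency weight 10, so N = 80 observations in G = 4 states.
e = {0: [-0.65, 1.35], 1: [-2.0, 0.0], 2: [0.0, 2.0], 3: [-1.35, 0.65]}
w = 10
N = sum(w * len(v) for v in e.values())   # 80
G = len(e)                                # 4

means = {s: sum(v) / len(v) for s, v in e.items()}
grand = sum(w * x for v in e.values() for x in v) / N

# sigma_u^2: sample variance of the state means (divisor G - 1)
sigma_u2 = sum((m - grand) ** 2 for m in means.values()) / (G - 1)
# sigma_e^2: within sum of squares over N - G
rss_within = sum(w * (x - means[s]) ** 2 for s, v in e.items() for x in v)
sigma_e2 = rss_within / (N - G)

rho = sigma_u2 / (sigma_u2 + sigma_e2)
print(round(rho, 6))   # 0.415518, matching the xtreg output
```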
Code:
reg y x [fw=n]
reg y x [fw=n], vce(cluster state)
Code:
set more off
/* fake data */
clear
set obs 8
gen state = mod(_n,4)
sort state
gen x = state
gen y = 6 - 1*x
gen e = cond(mod(state,2)==1,-1,1) + cond(mod(_n,2),-1,1) + cond(state==0,-0.65,cond(state==3,0.65,0))
replace y = y + e
gen n = 10
corr x e [fw=n]
xtreg e [fw=n], i(state) fe
list y x e state n, noobs sepby(state)
reg y x [fw=n]
reg y x [fw=n], vce(cluster state)
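To see what ratio the two `reg` commands actually deliver, the comparison can be replicated in plain Python on the listed data. This is a sketch, assuming Stata's cluster-robust slope variance is [G/(G-1)] * [(N-1)/(N-k)] * (sum over clusters of h_g^2) / Sxx^2, where h_g is the cluster sum of (x - xbar) times the residual:

```python
# Replicate reg y x [fw=n] with and without vce(cluster state).
rows = [  # (y, x, state), each row with frequency weight 10
    (5.35, 0, 0), (7.35, 0, 0),
    (3.00, 1, 1), (5.00, 1, 1),
    (4.00, 2, 2), (6.00, 2, 2),
    (1.65, 3, 3), (3.65, 3, 3),
]
w, k = 10, 2                              # weight; params (slope + constant)
N = w * len(rows)                         # 80 observations
xbar = sum(w * x for _, x, _ in rows) / N
ybar = sum(w * y for y, _, _ in rows) / N
Sxx = sum(w * (x - xbar) ** 2 for _, x, _ in rows)
b = sum(w * (x - xbar) * (y - ybar) for y, x, _ in rows) / Sxx
a = ybar - b * xbar
resid = [(y - a - b * x, x, s) for y, x, s in rows]

# conventional OLS standard error of the slope
rss = sum(w * u ** 2 for u, _, _ in resid)
se_ols = (rss / (N - k) / Sxx) ** 0.5

# cluster-robust standard error (clusters = states), with the
# assumed G/(G-1) * (N-1)/(N-k) small-sample correction
states = {s for _, _, s in resid}
G = len(states)
h = {g: sum(w * (x - xbar) * u for u, x, s in resid if s == g) for g in states}
V = (G / (G - 1)) * ((N - 1) / (N - k)) * sum(v ** 2 for v in h.values()) / Sxx ** 2
se_cl = V ** 0.5

print(b)                # OLS slope, -1.01
print(se_cl / se_ols)   # observed SE inflation, well below 2.98
```

The printed ratio is the quantity to compare against the 2.98 implied by the Moulton formula.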