Unusual number of clusters

Sara Martins

Join Date: May 2018

Posts: 2
#1

04 May 2018, 09:35

Dear Statalist,

I have a cross-sectional dataset with 1,809 observations and 375 variables. I am using Stata 14.1.

The data were collected at the school level, and there are 42 schools in total. I want to cluster the standard errors at the school level; since there are only 42 schools, the number of clusters should equal 42, correct?

However, when I run:

Code:

ivregress 2sls y (x1 = z), first vce(cluster School_name)

The number of clusters reported in the first stage is unusually high (anywhere from 350 to 850, depending on how many variables I add to the model). School_name is a numeric variable coded between 1 and 42, tabulated as follows:

PHP Code:

    Nome da |
    escola: |      Freq.     Percent        Cum.
------------+-----------------------------------
       1.00 |         63        3.48        3.48
       2.00 |         72        3.98        7.46
       4.00 |         51        2.82       10.28
       5.00 |         45        2.49       12.77
       7.00 |         38        2.10       14.87
       8.00 |         33        1.82       16.69
       9.00 |         14        0.77       17.47
      10.00 |         73        4.04       21.50
      11.00 |         10        0.55       22.06
      12.00 |         44        2.43       24.49
      13.00 |         68        3.76       28.25
      14.00 |         30        1.66       29.91
      15.00 |         42        2.32       32.23
      17.00 |         32        1.77       34.00
      18.00 |         51        2.82       36.82
      19.00 |         75        4.15       40.96
      20.00 |         45        2.49       43.45
      21.00 |         82        4.53       47.98
      22.00 |         45        2.49       50.47
      24.00 |         26        1.44       51.91
      25.00 |         61        3.37       55.28
      26.00 |         74        4.09       59.37
      27.00 |         40        2.21       61.58
      28.00 |         77        4.26       65.84
      29.00 |         71        3.92       69.76
      30.00 |         90        4.98       74.74
      32.00 |         58        3.21       77.94
      33.00 |         83        4.59       82.53
      34.00 |          8        0.44       82.97
      35.00 |         25        1.38       84.36
      36.00 |         54        2.99       87.34
      38.00 |         65        3.59       90.93
      39.00 |         66        3.65       94.58
      40.00 |         25        1.38       95.96
      42.00 |         73        4.04      100.00
------------+-----------------------------------
      Total |      1,809      100.00

Does anyone know what could be causing this jump in the number of clusters?

Thank you in advance.

Best regards,

Sara Martins
Roman Mostazir

Join Date: Apr 2014

Posts: 873
#2

04 May 2018, 12:37

Your data are not at the school level. They are at the individual level (the lowest level), nested within schools (the upper level). We do not know who the individuals nested within each school are; they could be pupils, teachers, or something else. Your output shows that school 1 has 63 observations belonging to those individuals, school 2 has 72 observations, and so on, which gives you 1,809 observations from 42 clusters (schools). Note that your number of clusters did not increase; it remained 42, as you mentioned. The frequencies are the numbers of observations within each school that are used for the estimation.
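One way to check this directly is to compare the number of observations with the number of clusters that Stata stores after estimation. The lines below are only a sketch: they reuse the variable names from the question (y, x1, z, School_name), and one_per_school is a hypothetical helper variable introduced here for illustration. Keep in mind that the number of clusters actually used can be lower than the number of schools in the data if observations are dropped because of missing values.

```stata
* Sketch only: y, x1, z and School_name are the names used in the question.

* 1. Number of observations contributed by each school
tabulate School_name

* 2. Fit the model with standard errors clustered by school
ivregress 2sls y (x1 = z), first vce(cluster School_name)

* 3. Observations and clusters actually used in the estimation
display "Observations used: " e(N)
display "Clusters used:     " e(N_clust)

* 4. Cross-check: tag one observation per school within the estimation sample
*    (one_per_school is a hypothetical variable name)
egen byte one_per_school = tag(School_name) if e(sample)
count if one_per_school == 1
```

If the two counts in steps 3 and 4 agree, the larger number that looked like a cluster count is most likely a different statistic in the output, such as the number of observations.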

Roman
Sara Martins

Join Date: May 2018

Posts: 2
#3

07 May 2018, 01:33

Roman,

Thank you for clarifying.

Best regards,

Sara