Cluster bootstrap and singleton cluster error

Chiara Cavaglia

Join Date: Nov 2020

Posts: 5
#1


11 Nov 2020, 06:03

Hello, I am trying to run a cluster bootstrap of a user-written program:
bootstrap, reps(99) cluster(idn) strata(idn) seed(12345): command, ...

where I have a panel of individuals over time and idn is a geographical local-area indicator.

I get the "singleton cluster detected" error.

I am now trying to write a program using bsample and simulate, to see whether I can make it drop the singleton that way.

As I am not an expert programmer, however, I wonder whether there is a more straightforward way (maybe using bootstrap?). Could someone help me and/or point me to the right resources?

Thanks in advance!!
Chiara
Felix Bittmann

Join Date: Aug 2018

Posts: 702
#2

11 Nov 2020, 06:18

Can you show the entire command or do-file you are using? These errors usually happen in panel settings when the model is not correctly specified in the command options. If you are using a regular model, say, xtreg, you should use the vce(bootstrap) option instead of the prefix. If the command is user-written, you may need to adapt it to handle newly created IDs; see the cluster() and idcluster() options of bootstrap.
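For a built-in panel estimator, the vce(bootstrap) route could look like the sketch below. It is only an illustration: the nlswork example dataset loaded by webuse and the variables ln_wage and age are assumptions for the demo, not the data from this thread, and reps(50) is an arbitrary small number.

```stata
* Illustrative only: nlswork is a Stata example dataset, not the poster's data
webuse nlswork, clear
xtset idcode year

* With vce(bootstrap), xtreg resamples whole panels and handles the
* panel identifier internally, so no bootstrap prefix is needed
xtreg ln_wage age, fe vce(bootstrap, reps(50) seed(12345))
```

A user-written program does not get this handling automatically, which is why it has to work with the identifier created by idcluster() itself.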

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Chiara Cavaglia

Join Date: Nov 2020

Posts: 5
#3

11 Nov 2020, 09:37

Hi Felix,
Thanks for your reply.
I attach part of the code. It is a two stage user written command. I only attach the first stage, as even this does not work (so no point in adding the second part).

cap program drop firststage
program define firststage, eclass
syntax, [x1(varlist)] [y1(varlist)] [condition1(string)] [clustid1(varlist)]
cap drop yhat*
reg `y1' idn#c.tax_year if `condition1', cluster(`clustid1')
predict yhat, xb
end

xtset, clear
gen newid=idn
gen strata=idn*(year<2008)
bootstrap, reps(2) cluster(idn) idcluster(newid) strata(strata) seed(141120): firststage, y1(y) condition1(year<2008) clustid1(idn)

I have tried several variations:
1) for example by trying to see whether something changes if I drop the time condition, but nothing changes:
bootstrap, reps(2) cluster(idn) idcluster(newid) strata(idn) seed(141120): firststage, y1(y) clustid1(idn)

2) or another attempt:
cap drop strata
egen strata=group(idn tax_year)
bootstrap, reps(2) cluster(idn) idcluster(newid) strata(strata) seed(141120): firststage, y1(y) clustid1(idn)

I always get the "singleton cluster detected" error.

I should specify that I have a panel of individuals over tax_year, but the cluster variable is at the local-area level.

Thanks!
Best,
Chiara
Felix Bittmann

Join Date: Aug 2018

Posts: 702
#4

11 Nov 2020, 10:15

The code is quite dense, and without the data and background it is difficult to understand what is going on in detail. Since you do not return any specific values, I assume that you work with the returns of the reg command. I think the problem might be with newid. You specify it as the new ID, but the variable is then never used in your actual code. That cannot work, since the new variable must be used in place of the original ID. What happens if you run

Code:

cap program drop firststage
program define firststage, eclass
syntax, [x1(varlist)] [y1(varlist)] [condition1(string)] [clustid1(varlist)]
cap drop yhat*
reg `y1' idn#c.tax_year if `condition1', cluster(`clustid1')
predict yhat, xb
end

xtset, clear //Why is this even here? there are no xt commands used
gen strata=idn*(year<2008)
bootstrap, reps(2) cluster(idn) idcluster(newid) strata(strata) seed(141120): firststage, y1(y) condition1(year<2008) clustid1(newid)

I am not sure what you want to achieve in the end so my idea here might be wrong. Probably you also might want to set
reg `y1' idn#c.tax_year
to
reg `y1' newid#c.tax_year

Last edited by Felix Bittmann; 11 Nov 2020, 10:18.

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Chiara Cavaglia

Join Date: Nov 2020

Posts: 5
#5

11 Nov 2020, 10:33

Thanks Felix! I have tried what you suggest, but unfortunately it does not work. I still get the same error.
Would there be a way to tell bootstrap to ignore/drop singletons? Or do you think it is just a problem of misspecification of some sort in my code?
For a given idn (local area) and tax_year I have a different number of individual observations, if that helps.

With firststage I am trying to estimate the linear trend in y in each local area (idn) in the pre-treatment period (before 2008).
I use this to predict the trend in each local area over the whole period (that is, also after 2008).
I will then use yhat as a covariate in the second-stage regression, where I will do an event study on the effect of the "treatment" on y.

Sorry about the code. I cannot post the data, but if helpful I could create a dataset where I find the same issue and then post that?

Best,
Chiara

Last edited by Chiara Cavaglia; 11 Nov 2020, 10:40.
Felix Bittmann

Join Date: Aug 2018

Posts: 702
#6

12 Nov 2020, 00:30

Of course you can drop clusters with a single observation before the analyses like the following:

Code:

bysort clustervar: gen counter = _n
bysort clustervar: egen cluster_n = max(counter)
drop if cluster_n == 1

However, I am not sure if this solves all your problems. For me it is at this point not really clear what the program does and whether it is specified correctly.
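The same filter can be written more compactly with _N, which inside a by-group holds the number of observations in that group. The sketch below runs on a small made-up dataset (clustervar and y are placeholder names, and the values are invented for illustration). Note that it removes clusters with a single observation, which is not necessarily the same thing as a stratum that contains a single cluster.

```stata
* Toy data, invented for illustration: cluster 2 has only one observation
clear
input clustervar y
1 10
1 12
2 7
3 5
3 6
3 9
end

* Within each by-group, _N is the number of observations in the group
bysort clustervar: gen cluster_n = _N

* Drop clusters that consist of a single observation
drop if cluster_n == 1
tabulate clustervar
```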

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Chiara Cavaglia

Join Date: Nov 2020

Posts: 5
#7

12 Nov 2020, 09:50

Thanks Felix. No, unfortunately that does not solve it (I had tried it like this, too - with _N). Thanks for your help! If I manage to understand what is wrong I will post it here.
Best wishes,
Chiara
Chiara Cavaglia

Join Date: Nov 2020

Posts: 5
#8

12 Nov 2020, 12:15

In the end, the variable in strata() was wrongly specified (in fact, the way I had specified it did not make much sense)! On top of that, I had not used the newid variable as the cluster variable, as you mentioned. Thanks for stressing that most of the time the issue is due to misspecification of the cluster variables!
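For anyone hitting the same message, one way to check a strata variable is to count how many distinct clusters fall inside each stratum. The sketch below uses made-up data and rebuilds the strata variable from my second attempt (one stratum per idn and tax_year cell); first_in_pair and n_clusters are hypothetical helper names. That a stratum with only one cluster is what triggers the error is my assumption here, not something I have confirmed in the documentation.

```stata
* Toy data, invented for illustration: three local areas, two years each
clear
input idn tax_year
1 2006
1 2007
2 2006
2 2007
3 2006
3 2007
end

* Strata as in the second attempt above: one stratum per idn-year cell
egen strata = group(idn tax_year)

* Tag one observation per (strata, idn) pair, then count pairs per stratum
egen first_in_pair = tag(strata idn)
bysort strata: egen n_clusters = total(first_in_pair)

* Strata with n_clusters == 1 contain a single cluster
tabulate n_clusters
```

With this definition every stratum contains exactly one cluster, and the same holds by construction for strata(idn) combined with cluster(idn).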
Best,
Chiara