Hi there!
I am working with a panel random-effects model for unbalanced panel data, and I am having difficulty understanding how the cross-sectional bootstrap works. In particular, I would like to know how the following line of code works:
bootstrap Skewness_e=r(nuhat3) Kurtosis_e=r(nuhat4) Skewness_u=r(muhat3) Kurtosis_u=r(muhat4) , reps(`b') cluster(`panelvar') idcluster(`id') group(`time') : _xtsktest_calculations `varlist'
From the Stata manual entry for bootstrap (https://www.stata.com/manuals/rbootstrap.pdf), I read that:
- bootstrap _b, cluster(cvar) idcluster(newcvar): myprog2 y x1 x2 x3 -> Resample clusters defined by cvar and create newcvar identifying resampled clusters
- group(varname) re-creates varname containing a unique identifier for each group across the resampled clusters. This option requires that idcluster() also be specified. This option is useful for maintaining unique group identifiers when sampling clusters with replacement. Suppose that cluster 1 contains 3 groups. If the idcluster(newclid) option is specified and cluster 1 is sampled multiple times, newclid uniquely identifies each copy of cluster 1. If group(newgroupid) is also specified, newgroupid uniquely identifies each copy of each group
- if size is not specified, the default is _N.
For instance, let us say I have this simple dataset (of size 11):
panelvar | time |
1 | 1 |
1 | 2 |
1 | 3 |
1 | 4 |
1 | 5 |
2 | 1 |
2 | 2 |
2 | 3 |
3 | 1 |
3 | 2 |
3 | 3 |
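To make the question concrete, here is a minimal sketch of how I would try to reproduce one bootstrap draw on this toy dataset. It assumes that bsample with cluster() and idcluster() resamples in the same way as the corresponding bootstrap options. The egen step and the variable name time_new are only my guess at what group() might be doing, not something I found confirmed in the manual.

```stata
* Enter the toy unbalanced panel (11 observations, 3 panels)
clear
input panelvar time
1 1
1 2
1 3
1 4
1 5
2 1
2 2
2 3
3 1
3 2
3 3
end

set seed 12345

* Draw whole panels with replacement; id labels each drawn copy of a panel
bsample, cluster(panelvar) idcluster(id)

* Hypothetical stand-in for group(): one identifier per (id, time) pair
egen long time_new = group(id time)

sort id time
list panelvar time id time_new, sepby(id)

* Number of observations in this draw
display _N
```

With this sketch, whole panels are drawn and time is never resampled within a panel. I would also expect the value shown by display _N to change from draw to draw when the panels have different lengths, which is part of what I am asking about.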
How is one bootstrapped sample of size 11 created? How is `id' created, and how is `time' overridden?
My doubt is the following: do I have to draw panelvar, time, or both? If I draw panelvar, I select all the times inside it and then define `id' to be unique, so the group() option seems useless. If I draw time, override it to be unique, and select all panelvar values associated with it, then `id' seems useless.
Alternatively, suppose that I draw panelvar and, within each drawn panelvar, I draw time (see the first two columns of the example below). As I understand it, I would then create id and override time as follows:
panelvar | time | id | time_overridden1 | time_overridden2 |
1 | 1 | 1 | 1 | 1 |
1 | 2 | 1 | 2 | 2 |
1 | 2 | 1 | 3 | 3 |
1 | 5 | 1 | 4 | 4 |
1 | 5 | 1 | 5 | 5 |
2 | 1 | 2 | 1 | 1 |
2 | 2 | 2 | 2 | 2 |
2 | 3 | 2 | 3 | 6 |
2 | 1 | 3 | 1 | 1 |
2 | 2 | 3 | 2 | 2 |
2 | 3 | 3 | 3 | 6 |
In this case, time_overridden1 = 3 for panelvar 1 does not refer to the same period as time = 3 for the other panelvar values. Is this OK? Or do I have to use time_overridden2?
Moreover, how can I ensure that each bootstrap sample has the same size as the original dataset, given that my panel is unbalanced?
(P.S. In my real data, panelvar is also a time variable)
Thank you in advance,
Edoardo