Hi everyone:
I would like to ask you the following.
I have a dataset that is the outcome of a field experiment. It consists of cross-sectional data: the same individuals were interviewed twice, in January 2017 and in August 2017. The January 2017 survey also included socio-demographic variables; apart from that, both surveys asked more or less the same questions about adoption habits.
An intervention took place between the two interviews. The outcome variable (Y) is adoption: it equals 1 if the individual had adopted by August 2017 and 0 otherwise.
The intervention covered five regions (R) comprising 32 districts (D). [The treatment was randomized at the district level.] The region and district identifiers are already stored as byte variables.
Since the data are cross-sectional, I would like to use region fixed effects. I also need to cluster the error terms at the district level, since individuals are likely to be more similar within a district than across districts.
The regression is as follows: Y_idr = α + β0*T_idr + γ1*X_idr + γ2*W_dr + R_r + e_idr
Here i indexes individuals, d districts and r regions. Y_idr is the dependent variable (1 or 0), T_idr is the treatment variable (0, 1 or 2), X_idr is a vector of individual-level variables, W_dr controls for commune-level variation and R_r is the region (strata) fixed effect.
Below is an example of my dataset. The data are confidential, so I cannot provide more details; I hope these variables are enough to explain the problem.
Region | District | Adoption (Y) | Treatment (T) | Sex | Education | Children | Risk aversion (Jan 17) | Risk aversion (Aug 17) | Income (Jan 17) | Income (Aug 17) |
1 | 1 | 0 | 0 | 1 | 4 | 3 | 1 | 1 | 1500 | 1500 |
1 | 3 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1500 | 1750 |
1 | 4 | 1 | 1 | 0 | 2 | 1 | 0 | 0 | 2000 | 1500 |
2 | 5 | 0 | 1 | 0 | 3 | 2 | 1 | 0 | 1000 | 1200 |
2 | 6 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 700 | 750 |
2 | 8 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1000 | 1000 |
3 | 10 | 1 | 2 | 1 | 6 | 2 | 1 | 1 | 1500 | 3000 |
3 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1000 | 0 |
3 | 12 | 1 | 2 | 1 | 2 | 2 | 1 | 1 | 1500 | 1500 |
4 | 14 | 0 | 1 | 1 | 3 | 3 | 0 | 0 | 0 | 1500 |
4 | 15 | 0 | 1 | 0 | 4 | 4 | 1 | 1 | 4000 | 4000 |
4 | 16 | 1 | 0 | 1 | 5 | 1 | 1 | 1 | 500 | 500 |
5 | 20 | 0 | 2 | 0 | 6 | 1 | 0 | 1 | 1000 | 1200 |
5 | 22 | 1 | 2 | 0 | 2 | 2 | 1 | 0 | 1000 | 1000 |
My questions are:
(1) Which command should I use in Stata 16 to run the regression above, accounting for Y being binary (probit) and including region fixed effects?
(2) How can I create and include the vector of individual-level variables (X) (e.g. sex, education, children), and how can I add the commune-level controls (W)?
I know this is a very long post, but I would be very grateful for any help. I have been struggling with this for a while and I am stuck.
Thanks in advance,
Maria
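One possible way to set up the model described above in Stata, offered as a sketch rather than a definitive answer. All variable names (adoption, treatment, sex, education, children, riskav_jan, income_jan, region, district) are assumptions based on the example table, not the real names in the dataset, and the district-level control mean_income_jan is a purely hypothetical stand-in for W.

```stata
* Sketch only: every variable name below is assumed, not taken from the real data.
* adoption = Y (0/1), treatment = T (0/1/2), region and district = numeric identifiers.

* Individual-level controls (X), collected in a global macro.
* The i. prefix tells Stata to treat a variable as categorical.
global X i.sex i.education children riskav_jan income_jan

* District-level control (W): hypothetical example, district mean of baseline income.
bysort district: egen mean_income_jan = mean(income_jan)
global W mean_income_jan

* Probit with region indicators as fixed effects and
* standard errors clustered at the district level.
probit adoption i.treatment $X $W i.region, vce(cluster district)

* Average marginal effects of each treatment arm relative to the base category (0).
margins, dydx(treatment)
```

With only five regions, entering the fixed effects as i.region indicators is straightforward. The probit coefficients themselves are not probabilities, which is why the sketch ends with margins to express the treatment effects on the probability scale.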