Treatment Group

Arto Arman

Join Date: Sep 2018
Posts: 40

13 Nov 2018, 05:26

Dear Statalist Users!

I'm working on my bachelor thesis and doing a panel data analysis on children's wellbeing in connection with microcredit systems.
My data set is four rounds long and has 12079 observations total (the data examples are at the end of the post).

Now I am looking for a way to distinguish the treatment and control groups, but I'm having some issues. The variable IKP tells me whether a family has had access to the microcredit programme IKP since the last round.
I'd like to construct my treatment so that it can be distinguished between rounds. That way I can look at one treatment group that was treated only between rounds 2 and 3 (so IKP=1 in round 3) and another treatment group that was treated only between rounds 3 and 4 (so IKP=1 in round 4).

I did this with some help from other Statalist members, and it goes as follows:
(Replacing the 4 with a 3 in the first two lines should switch the treatment group for the differentiation mentioned above.)

Code:

by childid, sort: egen credit_period_4 = max(cond(round == 4, ikp, .))
by childid: egen credit_other_periods = min(cond(round != 4, ikp, .))
gen byte treatment = .
replace treatment = 1 if credit_period_4 & !credit_other_periods
replace treatment = 0 if !credit_period_4 & !credit_other_periods

Now my problem is this: if I count how many people are in the microcredit system (at any point) and observed in round 4, I get 1,683 observations. If I then look at how many of these received the treatment only in period 4, using my constructed variable, I get 4,552, which logically cannot be true.

Code:

tabulate ikp if round==4

         hh |
  benefited |
   from ikp |
  programme |
 since last |
      round |      Freq.     Percent        Cum.
------------+-----------------------------------
         no |      1,142       40.42       40.42
        yes |      1,683       59.58      100.00
------------+-----------------------------------
      Total |      2,825      100.00


tabulate treatment

  treatment |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      3,844       45.78       45.78
          1 |      4,552       54.22      100.00
------------+-----------------------------------
      Total |      8,396      100.00

Where is my mistake? I'm pretty sure the code above makes sense, but my results cannot be correct.
Thanks so much for your help!

Best
Arto Arman

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte round str8 childid byte ikp
1 "IN010001" .
2 "IN010001" 0
3 "IN010001" 0
4 "IN010001" 0
1 "IN010002" .
2 "IN010002" 0
3 "IN010002" 0
4 "IN010002" 0
1 "IN010003" .
2 "IN010003" 0
3 "IN010003" 0
4 "IN010003" 1
1 "IN010004" .
2 "IN010004" 0
3 "IN010004" 0
4 "IN010004" 0
1 "IN010005" .
2 "IN010005" 0
3 "IN010005" 0
4 "IN010005" 0
1 "IN010006" .
2 "IN010006" 0
3 "IN010006" 0
4 "IN010006" .
1 "IN010007" .
2 "IN010007" 0
3 "IN010007" 0
4 "IN010007" 0
1 "IN010008" .
2 "IN010008" 1
3 "IN010008" 1
4 "IN010008" 1
1 "IN010009" .
2 "IN010009" 1
3 "IN010009" 0
4 "IN010009" 1
1 "IN010010" .
2 "IN010010" 0
3 "IN010010" 1
4 "IN010010" 0
1 "IN010011" .
2 "IN010011" 1
3 "IN010011" 0
4 "IN010011" 0
1 "IN010012" .
2 "IN010012" 0
3 "IN010012" 0
4 "IN010012" 0
1 "IN010013" .
2 "IN010013" 0
3 "IN010013" 0
4 "IN010013" 1
1 "IN010014" .
2 "IN010014" 1
3 "IN010014" 1
4 "IN010014" 1
1 "IN010015" .
2 "IN010015" 0
3 "IN010015" 0
4 "IN010015" 0
1 "IN010016" .
2 "IN010016" 0
3 "IN010016" 0
4 "IN010016" 1
1 "IN010017" .
2 "IN010017" 0
3 "IN010017" 0
4 "IN010017" 0
1 "IN010018" .
2 "IN010018" 0
3 "IN010018" 0
4 "IN010018" 0
1 "IN010019" .
2 "IN010019" 0
3 "IN010019" 0
4 "IN010019" 0
1 "IN010020" .
2 "IN010020" 0
3 "IN010020" 0
4 "IN010020" 0
1 "IN010021" .
2 "IN010021" 0
3 "IN010021" 0
4 "IN010021" 1
1 "IN010022" .
2 "IN010022" .
3 "IN010022" .
4 "IN010022" .
1 "IN010023" .
2 "IN010023" 0
3 "IN010023" 0
4 "IN010023" 1
1 "IN010024" .
2 "IN010024" 0
3 "IN010024" 0
4 "IN010024" 1
1 "IN010025" .
2 "IN010025" 0
3 "IN010025" 0
4 "IN010025" 0
end

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte ikp
.
0
0
0
.
0
0
0
.
0
0
1
.
0
0
0
.
0
0
0
.
0
0
.
.
0
0
0
.
1
1
1
.
1
0
1
.
0
1
0
.
1
0
0
.
0
0
0
.
0
0
1
.
1
1
1
.
0
0
0
.
0
0
1
.
0
0
0
.
0
0
0
.
0
0
0
.
0
0
0
.
0
0
1
.
.
.
.
.
0
0
1
.
0
0
1
.
0
0
0
end
label values ikp ikp
label def ikp 0 "no", modify
label def ikp 1 "yes", modify

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte round
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
end


Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#2

13 Nov 2018, 08:14

Neither of those variables is appropriate for calculating the number of people who have had IKP access at any time. So both of those results are (probably) incorrect.

Code:

by childid, sort: egen ever_ikp = max(ikp)
by childid: egen observed_round_4 = max(round == 4)
gen ever_ikp_and_obs_round_4 = ever_ikp & observed_round_4
by childid: gen flag = (_n == 1)
count if ever_ikp_and_obs_round_4 & flag

By the way, in the future, please do not create a separate data example for each variable. Do them all in one. When you post each variable separately, you force those who want to help you to put them back together again. And you had to run -dataex- three separate times. Just do -dataex- once with all of the relevant variables: less work for you, and less work for others.
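For example, a single call along these lines would produce one combined data example. This is only a sketch: the variable names are those used in #1, and the row range is an arbitrary choice.

Code:

```stata
* One data example containing all relevant variables at once
dataex childid round ikp in 1/100
```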

Last edited by Clyde Schechter; 13 Nov 2018, 08:18.
Arto Arman

Join Date: Sep 2018

Posts: 40
#3

13 Nov 2018, 08:41

I'm sorry about the dataex!

Originally posted by Clyde Schechter

Neither of those variables is appropriate for calculating the number of people who have had IKP access at any time. So both of those results are (probably) incorrect.

But I don't want them to have had IKP at any time, only in the time specified.
So for period 4:

- The treatment group can only be people who had access to IKP between periods 3 and 4 (--> IKP==1 in round 4) but no access to IKP in any other period (--> IKP==0 for periods 1, 2 and 3), because earlier access could contaminate my treatment in period 4 (e.g. lagged effects from period 1 into period 4). This way I measure only the effect of treatment in period 4.
- The control group does not have access to IKP in any period (so IKP==0 in all periods)

For period 3:

- The treatment group can only be people who had access to IKP between periods 2 and 3 (--> IKP==1 in round 3) but no access to IKP in any other period (--> IKP==0 for periods 1, 2 and 4).
- Same control group.
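
The group definitions above could be coded roughly as follows. This is a sketch, not tested on the full data: it assumes one row per childid and round, and ikp coded 0/1 with missing for unobserved rounds. The names ikp3, ikp4, other3, other4, treat3_strict and treat4_strict are illustrative only. max() is used for the other rounds so that a single "yes" in any of them excludes the child, and every condition compares explicitly with 0 or 1 so that missing values are never read as "true".

Code:

```stata
* Child-level summaries of ikp in the round of interest and in all other rounds
bysort childid: egen byte ikp4   = max(cond(round == 4, ikp, .))
bysort childid: egen byte other4 = max(cond(round != 4, ikp, .))
bysort childid: egen byte ikp3   = max(cond(round == 3, ikp, .))
bysort childid: egen byte other3 = max(cond(round != 3, ikp, .))

* Treated only in round 4; controls have no observed access in any round
gen byte treat4_strict = 1 if ikp4 == 1 & other4 == 0
replace treat4_strict = 0 if ikp4 == 0 & other4 == 0

* Treated only in round 3; controls defined the same way
gen byte treat3_strict = 1 if ikp3 == 1 & other3 == 0
replace treat3_strict = 0 if ikp3 == 0 & other3 == 0
```

Because max() ignores missing values, a round with missing ikp (such as round 1 in the data example) counts as "no observed access" rather than as IKP==0. Children whose ikp is missing in the round of interest are left missing in the treatment variable.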
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#4

13 Nov 2018, 09:12

In #1, you said

Now my problem is that if I'm calculating how many people are in the microcredit system (at any point) and observed in round 4 [italics added]

and that is what the code in #2 will give you.

I do not understand what you want.
Arto Arman

Join Date: Sep 2018

Posts: 40
#5

14 Nov 2018, 02:36

Originally posted by Clyde Schechter

I do not understand what you want.

What you are quoting is my explanation of the code that follows, where I tried to show why I suspect my code is wrong; it is not what I wanted to code.

What I would like is code that splits my observations into the groups described in #3.
This matters especially for period 3, as I'm not sure my code will get me there.
Arto Arman

Join Date: Sep 2018

Posts: 40
#6

14 Nov 2018, 08:24

I found the answer: my code works for treatment in period 4.
For period 3 I have to alter it like this:

Code:

by childid, sort: egen credit_period_3 = max(cond(round == 3, ikp, .))
by childid: egen credit_other_periods3 = max(cond(round != 3 & round !=4, ikp, .))
gen treatment3 = .
replace treatment3 = 1 if credit_period_3 & !credit_other_periods3
replace treatment3 = 0 if !credit_period_3 & !credit_other_periods3

Here I am including people who got access to IKP in the 4th period, as there is no possibility of a lagged effect in period 3 if they were "treated" in period 4.
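
A possible explanation for the 4,552 in #1: egen with by childid writes the same value to every row of a child, so tabulate treatment counts child-round observations rather than children. In addition, Stata treats a numeric missing value as nonzero, so a condition such as "if credit_period_4" is also true when credit_period_4 is missing, and children with no ikp value in round 4 can end up with treatment == 1. A minimal sketch for checking both, assuming the variables from #1 already exist (one_per_child is a hypothetical name):

Code:

```stata
* Tag exactly one row per child, then tabulate children instead of rows
egen byte one_per_child = tag(childid)
tabulate treatment if one_per_child, missing

* Children flagged as treated although ikp is missing in round 4
count if treatment == 1 & missing(credit_period_4) & one_per_child
```

If the last count is not zero, comparing explicitly (credit_period_4 == 1, credit_other_periods == 0) instead of relying on true/false evaluation should keep those children out of the treatment group.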