  • Treatment Group

    Dear Statalist Users!

    I'm working on my bachelor thesis and doing a panel data analysis of children's wellbeing in connection with microcredit systems.
    My data set covers four rounds and has 12,079 observations in total (the data examples are at the end of the post).

    I am looking for a way to distinguish the treatment and control groups, but I'm having some issues. The variable IKP tells me whether a family has had access to the microcredit programme IKP since the last round.
    I'd like to construct my treatment so that it can be distinguished by round: one treatment group that was treated only between rounds 2 and 3 (so IKP=1 in round 3), and another treatment group that was treated only between rounds 3 and 4 (so IKP=1 in round 4).

    I did this with some help from other Statalist members, and it goes as follows:
    (Replacing the 4 with 3 in the first two lines should switch the treatment group to the round 3 definition mentioned above.)

    Code:
    by childid, sort: egen credit_period_4 = max(cond(round == 4, ikp, .))
    by childid: egen credit_other_periods = min(cond(round != 4, ikp, .))
    gen byte treatment = .
    replace treatment = 1 if credit_period_4 & !credit_other_periods
    replace treatment = 0 if !credit_period_4 & !credit_other_periods
    Now my problem is that if I'm calculating how many people are in the microcredit system (at any point) and observed in round 4, I get 1,683 observations. Looking at how many of these receive the treatment only in period 4, using my constructed variable, I get 4,552, which logically cannot be true.

    Code:
    tabulate ikp if round==4
    
             hh |
      benefited |
       from ikp |
      programme |
     since last |
          round |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             no |      1,142       40.42       40.42
            yes |      1,683       59.58      100.00
    ------------+-----------------------------------
          Total |      2,825      100.00
    
    
    tabulate treatment
    
      treatment |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |      3,844       45.78       45.78
              1 |      4,552       54.22      100.00
    ------------+-----------------------------------
          Total |      8,396      100.00
    Where is my mistake? I'm pretty sure the code above makes sense, but my results cannot be correct.
    Thanks so much for your help!


    Best
    Arto Arman


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte round str8 childid byte ikp
    1 "IN010001" .
    2 "IN010001" 0
    3 "IN010001" 0
    4 "IN010001" 0
    1 "IN010002" .
    2 "IN010002" 0
    3 "IN010002" 0
    4 "IN010002" 0
    1 "IN010003" .
    2 "IN010003" 0
    3 "IN010003" 0
    4 "IN010003" 1
    1 "IN010004" .
    2 "IN010004" 0
    3 "IN010004" 0
    4 "IN010004" 0
    1 "IN010005" .
    2 "IN010005" 0
    3 "IN010005" 0
    4 "IN010005" 0
    1 "IN010006" .
    2 "IN010006" 0
    3 "IN010006" 0
    4 "IN010006" .
    1 "IN010007" .
    2 "IN010007" 0
    3 "IN010007" 0
    4 "IN010007" 0
    1 "IN010008" .
    2 "IN010008" 1
    3 "IN010008" 1
    4 "IN010008" 1
    1 "IN010009" .
    2 "IN010009" 1
    3 "IN010009" 0
    4 "IN010009" 1
    1 "IN010010" .
    2 "IN010010" 0
    3 "IN010010" 1
    4 "IN010010" 0
    1 "IN010011" .
    2 "IN010011" 1
    3 "IN010011" 0
    4 "IN010011" 0
    1 "IN010012" .
    2 "IN010012" 0
    3 "IN010012" 0
    4 "IN010012" 0
    1 "IN010013" .
    2 "IN010013" 0
    3 "IN010013" 0
    4 "IN010013" 1
    1 "IN010014" .
    2 "IN010014" 1
    3 "IN010014" 1
    4 "IN010014" 1
    1 "IN010015" .
    2 "IN010015" 0
    3 "IN010015" 0
    4 "IN010015" 0
    1 "IN010016" .
    2 "IN010016" 0
    3 "IN010016" 0
    4 "IN010016" 1
    1 "IN010017" .
    2 "IN010017" 0
    3 "IN010017" 0
    4 "IN010017" 0
    1 "IN010018" .
    2 "IN010018" 0
    3 "IN010018" 0
    4 "IN010018" 0
    1 "IN010019" .
    2 "IN010019" 0
    3 "IN010019" 0
    4 "IN010019" 0
    1 "IN010020" .
    2 "IN010020" 0
    3 "IN010020" 0
    4 "IN010020" 0
    1 "IN010021" .
    2 "IN010021" 0
    3 "IN010021" 0
    4 "IN010021" 1
    1 "IN010022" .
    2 "IN010022" .
    3 "IN010022" .
    4 "IN010022" .
    1 "IN010023" .
    2 "IN010023" 0
    3 "IN010023" 0
    4 "IN010023" 1
    1 "IN010024" .
    2 "IN010024" 0
    3 "IN010024" 0
    4 "IN010024" 1
    1 "IN010025" .
    2 "IN010025" 0
    3 "IN010025" 0
    4 "IN010025" 0
    end
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte ikp
    .
    0
    0
    0
    .
    0
    0
    0
    .
    0
    0
    1
    .
    0
    0
    0
    .
    0
    0
    0
    .
    0
    0
    .
    .
    0
    0
    0
    .
    1
    1
    1
    .
    1
    0
    1
    .
    0
    1
    0
    .
    1
    0
    0
    .
    0
    0
    0
    .
    0
    0
    1
    .
    1
    1
    1
    .
    0
    0
    0
    .
    0
    0
    1
    .
    0
    0
    0
    .
    0
    0
    0
    .
    0
    0
    0
    .
    0
    0
    0
    .
    0
    0
    1
    .
    .
    .
    .
    .
    0
    0
    1
    .
    0
    0
    1
    .
    0
    0
    0
    end
    label values ikp ikp
    label def ikp 0 "no", modify
    label def ikp 1 "yes", modify
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte round
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    1
    2
    3
    4
    end

  • #2
    Neither of those variables is appropriate for calculating the number of people who have had IKP access at any time. So both of those results are (probably) incorrect.

    Code:
    by childid, sort: egen ever_ikp = max(ikp)
    by childid: egen observed_round_4 = max(round == 4)
    gen ever_ikp_and_obs_round_4 = ever_ikp & observed_round_4
    by childid: gen flag = (_n == 1)
    count if ever_ikp_and_obs_round_4 & flag
    By the way, in the future, please do not create a separate data example for each variable. Do them all in one. When you post each variable separately, you force those who want to help you to put them back together again. And you had to run -dataex- three separate times. Just do -dataex- once with all of the relevant variables: less work for you, and less work for others.
    Last edited by Clyde Schechter; 13 Nov 2018, 08:18.
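
    A possible reading of the mismatch in #1, offered as a sketch rather than a confirmed diagnosis: -tabulate ikp if round==4- counts one row per child (assuming one observation per child per round), while -tabulate treatment- counts every child-round row. This is because -by childid: egen- copies its result to all of a child's observations, and the -replace- statements in #1 are not restricted to one round. Assuming treatment was built as in #1, tagging one observation per child puts both tables on the same unit. The variable name one_per_child is made up for this illustration.

    ```stata
    * Sketch only: assumes treatment exists and is constant within childid, as in #1
    * one_per_child is a hypothetical helper: 1 for exactly one row per child, else 0
    egen byte one_per_child = tag(childid)

    * counts children rather than child-round observations
    tabulate treatment if one_per_child

    * for comparison: children observed in round 4, by IKP status
    tabulate ikp if round == 4
    ```

    With this restriction, the number of children with treatment == 1 can be compared directly with the 1,683 "yes" cases in round 4.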



    • #3
      I'm sorry about the dataex!

      Originally posted by Clyde Schechter
      Neither of those variables is appropriate for calculating the number of people who have had IKP access at any time. So both of those results are (probably) incorrect.
      But I don't want them to have had IKP at any time, only in the time specified.
      So for period 4:

      - The treatment group can only be people who had access to IKP between periods 3 and 4 (--> IKP==1 in round 4) and had no access to IKP in any other period (--> IKP==0 for periods 1, 2 and 3), because earlier access could disturb my treatment in period 4 (e.g. lagged effects from period 1 into period 4). This way I measure only the effect of treatment in period 4.
      - The control group has no access to IKP in any period (so IKP==0 in all periods).


      For period 3:

      - The treatment group can only be people who had access to IKP between periods 2 and 3 (--> IKP==1 in round 3) and had no access to IKP in any other period (--> IKP==0 for periods 1, 2 and 4).
      - Same control group.
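
      The group definitions above could be sketched as follows. This is one reading of the rules, not code from the thread: it assumes the variables childid, round and ikp from #1, and the names target, ikp_target, ikp_other_max and treat_strict are hypothetical. It uses max() for the other rounds so that access in any other round excludes a child, and explicit == comparisons so that children with a missing ikp value in the target round stay unclassified. Because ikp is missing in round 1 in the posted sample, "IKP==0 in the other rounds" is read here as "no observed 1 and at least one observed 0".

      ```stata
      * Sketch only: set the round that defines the treatment (4 or 3)
      local target 4

      * ikp in the target round, copied to all rows of the child
      by childid, sort: egen byte ikp_target = max(cond(round == `target', ikp, .))

      * highest ikp value in all other rounds (missing values are ignored)
      by childid: egen byte ikp_other_max = max(cond(round != `target', ikp, .))

      * treated: access in the target round only; control: no access in any round
      gen byte treat_strict = .
      replace treat_strict = 1 if ikp_target == 1 & ikp_other_max == 0
      replace treat_strict = 0 if ikp_target == 0 & ikp_other_max == 0
      ```

      In the posted sample with target 4, IN010003 (no access in rounds 2 and 3, access in round 4) would be coded 1, IN010001 would be coded 0, and IN010009 (access in round 2 as well) would stay missing.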



      • #4
        In #1, you said
        Now my problem is that if I'm calculating how many people are in the microcredit system (at any point) and observed in round 4 [italics added]
        and that is what the code in #2 will give you.

        I do not understand what you want.



        • #5
          Originally posted by Clyde Schechter
          I do not understand what you want.
          What you are quoting is my explanation of the code that follows it. I was trying to show why I suspect my code is wrong, not describing what I wanted to code.

          What I would like to have is code that splits my observations into the groups described in #3.
          This matters especially for period 3, as I'm not sure my code will get me there.



          • #6
            I found the answer: my code works for treatment in period 4.
            For period 3, I have to alter it like this:

            Code:
            by childid, sort: egen credit_period_3 = max(cond(round == 3, ikp, .))
            by childid: egen credit_other_periods3 = max(cond(round != 3 & round !=4, ikp, .))
            gen treatment3 = .
            replace treatment3 = 1 if credit_period_3 & !credit_other_periods3
            replace treatment3 = 0 if !credit_period_3 & !credit_other_periods3
            This way I include people who got access to IKP in the 4th period, as there is no possibility of a lagged effect in period 3 if they were only "treated" in period 4.
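
            One thing that may be worth checking, as a hedged note: Stata treats a missing value in a logical expression as true, so -if credit_period_3 & !credit_other_periods3- is also satisfied by children whose ikp is missing in round 3 and who had no access in the earlier rounds. Whether such children exist in the full data is not known from the thread. The sketch below assumes credit_period_3 and treatment3 were created as above. The name first_row is hypothetical.

            ```stata
            * Sketch only: assumes credit_period_3 and treatment3 exist as created in #6
            * first_row is a hypothetical helper that marks one row per child
            egen byte first_row = tag(childid)

            * the missing option gives children with credit_period_3 == . their own row
            tabulate credit_period_3 treatment3 if first_row, missing

            * optional: leave children with no round 3 information unclassified
            replace treatment3 = . if missing(credit_period_3)
            ```

            If the cross-tabulation shows no cases in the missing row with treatment3 == 1, the last line changes nothing.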
