  • Seeking Help Creating Indicator Variables in Long-Format Data in Stata

    Hello, folks,
    The dataset below is for illustration only. I posted this question before at
    https://www.statalist.org/forums/for...rades-in-stata, but the answers there did not work for me and my problem is still unsolved.
    This is a pressing task, and I would really like to move forward with my project.


    clear
    input str10 id byte (state year grade)
    001 0 1 0
    001 0 2 1
    001 0 3 2
    001 0 4 3
    001 0 5 4
    001 0 6 5
    001 0 7 6
    001 1 8 7
    001 0 9 8
    001 0 10 9
    001 0 11 10
    001 0 12 11
    001 0 13 12
    002 0 1 0
    002 0 2 1
    002 . 3 .
    002 0 4 3
    002 1 5 2
    002 1 6 3
    002 0 7 4
    002 0 8 5
    002 . 9 .
    002 0 10 7
    002 0 11 8
    002 0 12 9
    002 0 13 10
    003 0 1 0
    003 0 2 1
    003 1 3 1
    003 . 4 .
    003 0 5 3
    003 0 6 4
    003 . 7 .
    003 0 8 6
    003 0 9 7
    003 1 10 7
    003 0 11 8
    003 0 12 9
    003 0 13 10
    004 0 1 0
    004 0 2 1
    004 0 3 2
    004 0 4 3
    004 0 5 4
    004 0 6 5
    004 0 7 6
    004 0 8 7
    004 0 9 8
    004 0 10 9
    004 0 11 .
    004 1 12 8
    004 1 13 9
    005 0 1 0
    005 0 2 1
    005 1 3 1
    005 0 4 3
    005 0 5 4
    005 . 6 .
    005 0 7 6
    005 . 8 .
    005 1 9 6
    005 0 10 8
    005 0 11 10
    005 0 12 11
    005 1 13 11
    end

    The dataset has 5 students in total. The variable "state" indicates whether a student was retained or demoted in a specific grade. (Demoted students are treated here as students who were retained. For example, the student with id==004 was demoted from 9th grade to 8th grade, and he/she also counts as retained because he/she repeated 9th grade once.)

    What I want is to create 3 binary variables that indicate whether a student was retained in elementary school, middle school, and high school.
    The rules for creating these variables are listed below:
    1) Within each id, if state==1 and grade<=5 in any observation, then "elementary"==1 in all observations of that id.
    2) Within each id, if state==1 and grade is between 6 and 8 in any observation, then "middle"==1 in all observations of that id.
    3) Within each id, if state==1 and grade is between 9 and 12 in any observation, then "high"==1 in all observations of that id.
    4) Within each id, if state contains only missing values and zeros, then the corresponding stage indicator should be missing.

    I really appreciate your continued support, and thank you for your Stata code!
    Last edited by smith Jason; 30 Jul 2022, 11:24.

  • #2
    Sorry, I forgot rule 5): within each id, if the grade variable is missing at a stage boundary (e.g., grade is missing between 4th and 6th grade, or between 8th and 10th grade, or in a similar pattern), then elementary/middle/high should be missing.

    • #3
      What code have you tried so far?

      • #4
        Originally posted by Jared Greathouse:
        What code have you tried so far?
        bys id: gen elementary=1 if grade<=5 & !missing(id, state, year)
        bys id: replace elementary=. if grade<=5 & missing(id, state, year)
        replace elementary=0 if elementary==.

        bys id: gen middle=1 if grade>=6 & grade<=8 & !missing(id, state, year)
        bys id: replace middle=. if grade>=6 & grade<=8 & missing(id, state, year)
        replace middle=0 if middle==.

        bys id: gen high=1 if grade>=9 & grade<=12 & !missing(id, state, year)
        bys id: replace high=. if grade>=9 & grade<=12 & missing(id, state, year)
        replace high=0 if high==.

        bys id: replace elementary=0 if state==0
        bys id: replace middle=0 if state==0
        bys id: replace high=0 if state==0

        The corrected dataset is listed below, because there was a typo for id 001 in #1 (the line "001 1 8 7" is changed to "001 0 8 7"):
        clear
        input str10 id byte (state year grade)
        001 0 1 0
        001 0 2 1
        001 0 3 2
        001 0 4 3
        001 0 5 4
        001 0 6 5
        001 0 7 6
        001 0 8 7
        001 0 9 8
        001 0 10 9
        001 0 11 10
        001 0 12 11
        001 0 13 12
        002 0 1 0
        002 0 2 1
        002 . 3 .
        002 0 4 3
        002 1 5 2
        002 1 6 3
        002 0 7 4
        002 0 8 5
        002 . 9 .
        002 0 10 7
        002 0 11 8
        002 0 12 9
        002 0 13 10
        003 0 1 0
        003 0 2 1
        003 1 3 1
        003 . 4 .
        003 0 5 3
        003 0 6 4
        003 . 7 .
        003 0 8 6
        003 0 9 7
        003 1 10 7
        003 0 11 8
        003 0 12 9
        003 0 13 10
        004 0 1 0
        004 0 2 1
        004 0 3 2
        004 0 4 3
        004 0 5 4
        004 0 6 5
        004 0 7 6
        004 0 8 7
        004 0 9 8
        004 0 10 9
        004 0 11 .
        004 1 12 8
        004 1 13 9
        005 0 1 0
        005 0 2 1
        005 1 3 1
        005 0 4 3
        005 0 5 4
        005 . 6 .
        005 0 7 6
        005 . 8 .
        005 1 9 6
        005 0 10 8
        005 0 11 10
        005 0 12 11
        005 1 13 11
        end
        Last edited by smith Jason; 30 Jul 2022, 12:37.

        • #5
          The corrected dataset is listed below, because there was a typo for id 001 in #1 (the line "001 1 8 7" is changed to "001 0 8 7"):
          clear
          input str10 id byte (state year grade)
          001 0 1 0
          001 0 2 1
          001 0 3 2
          001 0 4 3
          001 0 5 4
          001 0 6 5
          001 0 7 6
          001 0 8 7
          001 0 9 8
          001 0 10 9
          001 0 11 10
          001 0 12 11
          001 0 13 12
          002 0 1 0
          002 0 2 1
          002 . 3 .
          002 0 4 3
          002 1 5 2
          002 1 6 3
          002 0 7 4
          002 0 8 5
          002 . 9 .
          002 0 10 7
          002 0 11 8
          002 0 12 9
          002 0 13 10
          003 0 1 0
          003 0 2 1
          003 1 3 1
          003 . 4 .
          003 0 5 3
          003 0 6 4
          003 . 7 .
          003 0 8 6
          003 0 9 7
          003 1 10 7
          003 0 11 8
          003 0 12 9
          003 0 13 10
          004 0 1 0
          004 0 2 1
          004 0 3 2
          004 0 4 3
          004 0 5 4
          004 0 6 5
          004 0 7 6
          004 0 8 7
          004 0 9 8
          004 0 10 9
          004 . 11 .
          004 1 12 8
          004 1 13 9
          005 0 1 0
          005 0 2 1
          005 1 3 1
          005 0 4 3
          005 0 5 4
          005 . 6 .
          005 0 7 6
          005 . 8 .
          005 1 9 6
          005 0 10 8
          005 0 11 10
          005 0 12 11
          005 1 13 11
          end

          This is the correct data. Sorry.

          • #6
            I'm having a hard time following, since much of this isn't contextualized. Is this the result you want?
            Code:
            clear
            input str10 id byte (state year grade)
            001 0 1 0
            001 0 2 1
            001 0 3 2
            001 0 4 3
            001 0 5 4
            001 0 6 5
            001 0 7 6
            001 0 8 7
            001 0 9 8
            001 0 10 9
            001 0 11 10
            001 0 12 11
            001 0 13 12
            002 0 1 0
            002 0 2 1
            002 . 3 .
            002 0 4 3
            002 1 5 2
            002 1 6 3
            002 0 7 4
            002 0 8 5
            002 . 9 .
            002 0 10 7
            002 0 11 8
            002 0 12 9
            002 0 13 10
            003 0 1 0
            003 0 2 1
            003 1 3 1
            003 . 4 .
            003 0 5 3
            003 0 6 4
            003 . 7 .
            003 0 8 6
            003 0 9 7
            003 1 10 7
            003 0 11 8
            003 0 12 9
            003 0 13 10
            004 0 1 0
            004 0 2 1
            004 0 3 2
            004 0 4 3
            004 0 5 4
            004 0 6 5
            004 0 7 6
            004 0 8 7
            004 0 9 8
            004 0 10 9
            004 . 11 .
            004 1 12 8
            004 1 13 9
            005 0 1 0
            005 0 2 1
            005 1 3 1
            005 0 4 3
            005 0 5 4
            005 . 6 .
            005 0 7 6
            005 . 8 .
            005 1 9 6
            005 0 10 8
            005 0 11 10
            005 0 12 11
            005 1 13 11
            end
            
            cls
            br
            bys id: gen elementary=1 if grade<=5 & !missing(id, state, year)
            bys id: replace elementary=. if grade<=5 & missing(id, state, year)
            replace elementary=0 if elementary==.
            
            bys id: gen middle=1 if grade>=6 & grade<=8 & !missing(id, state, year)
            bys id: replace middle=. if grade>=6 & grade<=8 & missing(id, state, year)
            replace middle=0 if middle==.
            
            bys id: gen high=1 if grade>=9 & grade<=12 & !missing(id, state, year)
            bys id: replace high=. if grade>=9 & grade<=12 & missing(id, state, year)
            replace high=0 if high==.
            
            bys id: replace elementary=0 if state==0
            bys id: replace middle=0 if state==0
            bys id: replace high=0 if state==0
            
            
            g indicator = 1 if state==1 & grade<=5
            
            replace indicator = 1 if state==1 & inrange(grade,6,8)
            
            replace indicator = 0 if state==1 & inrange(grade,9,12)
            
            replace indicator = 0 if inlist(state,0,.)

            • #7
              Thank you! It is still not what I want. For example, for the student with id==2, whenever indicator==1 in any observation, all values of the variable "elementary" should be equal to 1, because grade<=5 there.
              Last edited by smith Jason; 30 Jul 2022, 14:31.
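
              A minimal sketch of rules 1) to 3) from #1, assuming the corrected data from #5 are already in memory. It relies on egen's max() function with the by() option, which takes the maximum of a true/false expression within each id, so a single qualifying row sets the indicator to 1 on every row of that student. Rules 4) and 5) are deliberately left out, because the thread does not pin down their exact meaning; under this sketch a student with no qualifying row gets 0, not missing.

              ```stata
              * Each expression is 1 on a row where the student was retained
              * (state==1) in a grade belonging to that stage, and 0 otherwise.
              * A missing state or grade makes the expression 0, not missing:
              * state==1 is false for missing state, grade<=5 is false for
              * missing grade, and inrange() returns 0 for a missing first argument.
              * max() with by(id) then copies the largest value to all rows of the id.
              egen byte elementary = max(state == 1 & grade <= 5), by(id)
              egen byte middle     = max(state == 1 & inrange(grade, 6, 8)), by(id)
              egen byte high       = max(state == 1 & inrange(grade, 9, 12)), by(id)

              * Inspect the result student by student.
              list id state year grade elementary middle high, sepby(id)
              ```

              On the corrected data this should give elementary==1 for ids 002, 003 and 005, middle==1 for ids 003, 004 and 005, high==1 for ids 004 and 005, and zeros throughout for id 001. Because the indicators are built with egen, no prior sort or bys prefix is needed.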
