Hi All, I am new to panel data analysis. I am trying to analyse attendance data from a 12-week program:
     subid  progwk  ynattend  x2temp  keep  x3temp  early_~d
 1.    101       1         1       6     1       .         .
 2.    101       5         1       .     9       5         .
 3.    102       1         1       2     1       .         .
 4.    102       5         1       .     9       8         .
 5.    103       1         0       4     1       .         .
I want to create a variable (early attender: the last variable above) to identify those who attended during the first 5 weeks and then stopped attending.
I created x2temp and x3temp to sum the number of sessions attended in the early and late parts of the 12 weeks. I used 8 sessions in the first 5 weeks (out of 5 weeks * 2 sessions) as the marker of good attendance.
I used the following commands:
egen x2temp = sum(ynattend) if progwk<5, by(subid)
bysort subid : gen keep = _n
egen x3temp = sum(ynattend) if progwk>=5, by(subid)
keep if keep==1 | keep==9
gen early_attend=.
replace early_attend=1 if x2temp>=8 & x3temp==0
However, it does not work as I wanted. Stata codes early_attend=1 wherever x3temp==0 and ignores the x2temp values, because the x2temp value for that person is in a different row.
How do we ask Stata to look up values when a subject's observations are in two separate rows because the subject ID is repeated?
Am I going about this the right way? What would be an easier way to do this?
I am also trying to find people who attended in the late weeks but did not attend in the early weeks.
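For reference, here is a minimal sketch of one possible approach, not a tested solution. It assumes the data are still in the original layout (one row per session, before the keep step), that ynattend is coded 0/1, and that the same cutoffs apply as in the commands above (progwk<5 for early, progwk>=5 for late). The names early_total, late_total, early_flag and late_flag are hypothetical. The idea is to leave out the if qualifier, so that egen writes each subject's total onto every row for that subject and both totals can be compared on the same row. Note also that Stata treats a missing value as larger than any number, so x2temp>=8 is true on rows where x2temp is missing.

```stata
* Sketch only: assumes one row per session and ynattend coded 0/1.
* Multiplying by the condition (instead of using "if") keeps the
* per-subject total on every row for that subject.
bysort subid: egen early_total = total(ynattend * (progwk < 5))
bysort subid: egen late_total  = total(ynattend * (progwk >= 5 & !missing(progwk)))

* Early attenders: at least 8 early sessions and none later.
gen byte early_flag = (early_total >= 8 & late_total == 0)

* Late-only attenders: no early sessions but at least one late session.
gen byte late_flag = (early_total == 0 & late_total > 0)

* Optional: mark one row per subject for tabulation.
egen byte first = tag(subid)
tabulate early_flag late_flag if first
```

Because total() treats missing values as zero, both totals are defined on every row and the flags never depend on which row happens to be kept.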
Thank you.