Hi,
I am trying to determine how many times each individual (pidp) in a large data set has taken maternity leave (matleave1). Ideally, I want a table showing how many people have taken leave once, twice, and so on. The matleave1 variable takes the value 1 if the individual is currently on maternity leave and 0 if they are not.
I have the code below, but I know there is an issue because the total number of maternity leaves it reports is larger than what I see in the original variable.
by pidp, sort: summarize matleave1 if matleave1 ==1
sort pidp
bysort pidp: gen n_observations = sum(matleave1 == 1)
tabulate n_observations
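
A likely cause, offered as a sketch rather than a confirmed diagnosis: with gen, sum() is a running sum, so n_observations holds a cumulative count on every row, and tabulate then counts every row instead of one row per person. The version below assumes the data have one observation per person per interview. The names n_matleave, tag_pidp, spell_start and n_spells are made up for illustration, and wave is a hypothetical variable that orders each person's observations over time.

```stata
* Total number of observations on maternity leave for each person
* (egen total() returns the group total on every row, not a running sum)
bysort pidp: egen n_matleave = total(matleave1 == 1)

* Flag exactly one observation per person so each individual is counted once
egen byte tag_pidp = tag(pidp)
tabulate n_matleave if tag_pidp

* Optional: count separate leave spells rather than observations on leave.
* ASSUMPTION: wave is a hypothetical variable ordering observations within pidp.
bysort pidp (wave): gen byte spell_start = matleave1 == 1 & (_n == 1 | matleave1[_n-1] != 1)
bysort pidp: egen n_spells = total(spell_start)
tabulate n_spells if tag_pidp
```

The first table counts observations on leave, so someone observed on leave in two consecutive interviews appears under 2 even if that was a single leave. The spell count treats consecutive observations on leave as one leave. Which of the two is appropriate depends on how matleave1 was recorded.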
