Hi,
I need some help expanding my data set.
I hope this introduction is not too long.
I have a dataset with monthly observations of people who receive different benefits, together with their labor market status, from January 2011 to February 2018. The issue is that I only have observations for the months in which a person actually received a benefit or had a registered status, so the number of monthly observations per individual is not "balanced". I want to "balance" the data set so that every individual has the same number of observations, with missing values for the months in which there is no data on that individual.
For example, for individual 1 I only have their status for January-February 2011 and April-July 2012. I want the data set to look something like this:
id | monthyear | benefit | workstatus
1 | 2011m1 | x | y
1 | 2011m2 | x | .
1 | 2011m3 | . | .
1 | 2011m4 | . | .
1 | 2011m5 | . | .
1 | and so on
1 | 2012m4 | z | q
1 | 2012m5 | z | q
1 | 2012m6 | x | q
1 | 2012m7 | z | y
1 | 2012m8 | . | .
1 | 2012m9 | . | .
1 | and so on until
1 | 2018m1 | . | .
1 | 2018m2 | . | .
2 | and so on
While now it looks like this:
id | monthyear | benefit | workstatus
1 | 2011m1 | x | y
1 | 2011m2 | x | .
1 | 2012m4 | z | q
1 | 2012m5 | z | q
1 | 2012m6 | x | q
1 | 2012m7 | z | y
2 | and so on
How do I do this? (It's a large dataset with 240 000 individuals.)
Saliha
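The post does not say which software holds the data, so the following is only a rough sketch of the idea in plain Python, not a tested solution for the real file. It assumes each record is a tuple of (id, (year, month), benefit, workstatus) and uses None for a missing value; the function names `month_range` and `balance_panel` are made up for illustration. The approach is to build every id and month combination for January 2011 to February 2018, then look up the observed values and fall back to missing where nothing was recorded.

```python
from itertools import product


def month_range(start, end):
    """Return every (year, month) pair from start to end, inclusive."""
    (y, m), months = start, []
    while (y, m) <= end:
        months.append((y, m))
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return months


def balance_panel(rows, start=(2011, 1), end=(2018, 2)):
    """Expand rows to one record per id and month; absent months get None.

    rows: iterable of (id, (year, month), benefit, workstatus) tuples
    (an assumed layout, not the actual file format).
    """
    # Index the observed records by (id, month) for fast lookup.
    observed = {(pid, ym): (benefit, work) for pid, ym, benefit, work in rows}
    ids = sorted({pid for pid, _ in observed})
    balanced = []
    # Cross every id with every month, filling gaps with missing values.
    for pid, ym in product(ids, month_range(start, end)):
        benefit, work = observed.get((pid, ym), (None, None))
        balanced.append((pid, ym, benefit, work))
    return balanced


# Individual 1 from the example in the post.
rows = [
    (1, (2011, 1), "x", "y"),
    (1, (2011, 2), "x", None),
    (1, (2012, 4), "z", "q"),
    (1, (2012, 5), "z", "q"),
    (1, (2012, 6), "x", "q"),
    (1, (2012, 7), "z", "y"),
]
balanced = balance_panel(rows)
print(len(balanced))  # 86 months from 2011m1 to 2018m2
print(balanced[2])    # (1, (2011, 3), None, None)
```

Note that an individual with no rows at all in the input never appears in the output, since the list of ids is taken from the observed data. With 240 000 individuals and 86 months the result has roughly 20.6 million rows, so memory may matter. If the data happen to be in Stata (the 2011m1 notation hints at that, but this is a guess), declaring the panel with `tsset id monthyear` and then running `tsfill, full` is the built-in way to get the same result.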