Dear everyone,
I have been stuck on this particular issue for a while now, and I have decided to post here to see whether anyone might be able to help me out.
I am currently using data from 15 (yearly) waves of a panel. There was some inflow, however, so some participants were added to the study later.
My data are in long format and have been set as panel data using an id variable and a 'wave' variable that I created.
The panel holds variables on employment status, type of contract, education, etc.
I want to follow each participant who enters the labour market for 5 consecutive years (it is worth mentioning that there is a lot of missingness in the dataset) and adjust for some variables at baseline (timepoint 0, the year before they enter the labour market). Because this is panel data with inflow and participants are aged 16 to 75+, the wave in which someone enters the labour market may differ between individuals. I want 6 timepoints for each individual: timepoint 0, in which they have not yet entered the labour market, and timepoints 1 to 5, with timepoint 1 being the year they enter. I want to include only individuals who have information on their employment status etc. for at least 3 of the 5 follow-up timepoints (timepoints 1-5). However, I am not sure how to go about this.
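To make the steps concrete, here is a rough sketch of the logic I think I need, run on a tiny made-up dataset. The variable names (id, wave, employed) and the coding of employed (1 = in work, 0 = not in work, . = missing) are assumptions for illustration only, not my real variables, and I am not sure this is the best way to do it. It defines entry as the first wave in which someone is observed in work after being observed out of work in the previous wave, so anyone already in work when first observed is dropped.

```stata
* Toy data with hypothetical variable names and coding
clear
input id wave employed
1 1 0
1 2 0
1 3 1
1 4 1
1 5 .
1 6 1
1 7 1
2 4 0
2 5 1
2 6 .
2 7 .
3 2 1
3 3 1
end

xtset id wave

* Flag waves where the person is in work and was observed out of work
* in the previous wave (L.employed is missing if that wave is absent)
gen byte entry = employed == 1 & L.employed == 0

* First such wave per person; missing if the person never enters
egen entrywave = min(cond(entry, wave, .)), by(id)

* Timepoint 0 = wave before entry, timepoint 1 = entry wave
gen timepoint = wave - entrywave + 1
keep if inrange(timepoint, 0, 5)

* Number of follow-up timepoints (1-5) with non-missing employment status
egen nobs = total(inrange(timepoint, 1, 5) & !missing(employed)), by(id)
keep if nobs >= 3

list id wave timepoint employed, sepby(id)
```

In this toy example only id 1 should remain (entry at wave 3, with 4 of the 5 follow-up timepoints observed). Id 2 has only 1 observed follow-up timepoint, and id 3 is never observed out of work before being in work.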
I just want to clean the data, as I will be using Latent Gold for an LCA.
I am aware that I could use R, but I have no experience with it yet, so I figured Stata might be faster.
Please do let me know your thoughts,
Best wishes,
Ciel