Hello, I'm working with Labour force surveys for 2019-2022 to run a comparative analysis between 2 sectors Health and Accommodation. This is what my data set looks like:
I need to define a variable that calculates total number of employed or unemployed or unemployment rate(unemployed/employed+unemployed) by sector and month .
Eg:
Health 2019m01 5%; Accommodation 2019m01 15%
Health 2019m02 7%; Accommodation 2019m02 25%
......
Health 2022m12 3%; Accommodation 2022m12 5%
I've already tried a few things from 'count if' command to foreach loops with increment counters, but nothing has been working. The Count if command shows me the numbers I need, but I have no idea how to assign it to a variable.
In essence I'm trying to obtain 2 time series graphs for each sector: one for atothrs with respect to date, and one for unemployment rate with respect to date. I've figured out how to do atothrs with respect to date, but haven't been able to figure out the unemployment time series.
Can someone please help me out here/ point me in the right direction?
This is my code and output so far:

Code:
* Example generated by -dataex-. For more info, type help dataex clear input int(naics_21 lfsstat) float(atothrs hrlyearn date) 17 1 45 46 708 17 1 30 . 708 17 1 36.3 37.5 708 17 2 0 40 708 19 1 35 18 708 17 1 33.7 42.5 708 17 1 35 17.75 708 17 2 0 21 708 19 1 36 14 708 17 1 40 18.8 708 17 1 17 . 708 17 1 40 28.41 708 19 2 0 12.15 708 17 1 56 50 708 19 1 40 18 708 17 1 37.5 36.26 708 17 2 0 22.63 708 17 2 0 31.25 708 17 1 31 37.84 708 17 1 40 25 708 19 3 40 12.5 708 17 3 24 23 708 17 3 40 . 708 end format %tm date label values naics_21 naics_21 label def naics_21 17 "Health care and social assistance", modify label def naics_21 19 "Accommodation and food services", modify label values lfsstat lfsstat label def lfsstat 1 "Employed, at work", modify label def lfsstat 2 "Employed, absent from work", modify label def lfsstat 3 "Unemployed", modify label values atothrs atothrs
I need to define a variable that calculates total number of employed or unemployed or unemployment rate(unemployed/employed+unemployed) by sector and month .
Eg:
Health 2019m01 5%; Accommodation 2019m01 15%
Health 2019m02 7%; Accommodation 2019m02 25%
......
Health 2022m12 3%; Accommodation 2022m12 5%
I've already tried a few things from 'count if' command to foreach loops with increment counters, but nothing has been working. The Count if command shows me the numbers I need, but I have no idea how to assign it to a variable.
In essence I'm trying to obtain 2 time series graphs for each sector: one for atothrs with respect to date, and one for unemployment rate with respect to date. I've figured out how to do atothrs with respect to date, but haven't been able to figure out the unemployment time series.
Can someone please help me out here/ point me in the right direction?
This is my code and output so far:
Code:
use "lfs_2019 (1).dta" append using "lfs_2020.dta" append using "lfs_2021.dta" append using "lfs_2022.dta" keep if !missing(survyear, survmnth) gen date= ym(survyear, survmnth) format date %tm sort date keep date naics_21 lfsstat hrlyearn atothrs drop if lfsstat==4 keep if naics_21 ==17| naics_21==19 egen hrswrked =sum(atothrs), by(naics_21 date) egen wage=mean(hrlyearn), by(naics_21 date) line hrswrked date, by(naics_21)