Hi. I have four excel files (OCEmr_Jan to Dec2022_GMRV, OCEmr_Jan to Dec2022_KAR, OCEmr_Jan to Dec2022_KVC, and OCEmr_Jan to Dec2022_MTC) in a folder with other excel files with different names. Each of these files has 12 sheets labelled by the month of the year (january, february, etc.). I want to convert and append each of these worksheets from each excel file to build a single dta file.
When I run the code within the foreach loop on an individual excel file, it works to extract each worksheet of excel to build a single dta file for each excel file. However, when I run the entire code (the master loop), it fails to do so for all the four excel files together
The code below is built to create four dta files, one per excel file. However, I also need to append these four dta files together and would like to do so in the loop.
Any guidance will be much appreciated
When I run the code within the foreach loop on an individual excel file, it works to extract each worksheet of excel to build a single dta file for each excel file. However, when I run the entire code (the master loop), it fails to do so for all the four excel files together
The code below is built to create four dta files, one per excel file. However, I also need to append these four dta files together and would like to do so in the loop.
Any guidance will be much appreciated
Code:
clear
tempfile building
save `building', emptyok
local file "KAR" "GMRV" "MTC" "KVC"
cd "C:\Users\scottr\Dropbox\LVPEI\Data\cleaning\raw data"
foreach f of local file {
import excel using "${db}\raw data\OCEmr_Jan to Dec2022_`file'.xlsx", describe
local n_sheets `r(N_worksheet)'
*return list local n_sheets `r(N_worksheet)'
forvalues j = 1/`n_sheets' {
local sheet`j' `r(worksheet_`j')'
}
forvalues j = 1/`n_sheets' {
import excel using "${db}\raw data\OCEmr_Jan to Dec2022_`file'.xlsx", sheet(`"`sheet`j''"') firstrow case(lower) clear
gen sheet = `j'
append using `building'
save `"`building'"', replace
}
sort sheet
*generating an age and sex var from the combined var
rename age Age
split Age, gen(age) parse("/")
replace age1 = "1" if ustrregexm(age1, "[0-9]{1,2} Mo.")
destring age1, replace
rename age1 age
rename age2 sex
drop Age
*splitting combined date-time var
gen double appt_dttm = clock(appttime, "DMYhm")
*assert missing(appt_dttm) == missing(appttime)
format appt_dttm %tc
gen appt_date = dofc(appt_dttm)
format appt_date %td
gen double appt_time_of_day = appt_dttm - cofd(appt_date)
format appt_time_of_day %tcHH:MM
*month and year from date
gen year = year(appt_date)
gen month = month(appt_date)
*replace missing values with .
rename district District
rename state State
encode District, gen(district)
encode State, gen(state)
drop District State appt_dttm
save "${db}\clean data\emr_`file'", replace
clear
}

Comment