I am trying to figure out the best way to create new variable names from filenames. The files share a naming pattern, but the segments within the pattern vary in length, which is where I am hitting a snag.
As an example, the filenames are
1. prev_qrtr_allnatlb_census2001.dta
2. prev_qrtr_metro_census2001.dta
3. prev_qrtr_all_census2001.dta
and so on.
Each file has four main variables: credit, deposit, office, and cflag2001
Since I want to ultimately merge all the files, I am trying to create a loop that would allow me to change the varnames to
1. credit_allnatlb, deposit_allnatlb, office_allnatlb
2. credit_metro, deposit_metro, office_metro
3. credit_all, deposit_all, office_all
I have included the do-file below. I am trying to use the regex functions, but there is obviously something wrong with my syntax. Any advice would be much appreciated.
local filelist : dir "C:/Research/RBI data/Statement 4A/" files "*.dta"
foreach f of local filelist {
    use "`f'"
    destring credit* deposit*, replace ignore(`","')
    reshape long office deposit credit, i(statedist) j(_quarter) string
    gen quarter = quarterly(_quarter, "YQ")
    format quarter %tq
    drop _quarter
    *filename pattern:
    local j= regexs(3) if regexm("`f'","([a-zA-Z]+)[_]([a-zA-Z]+)[_]([a-zA-Z]+)[_]([a-zA-Z]+)([0-9]+)[.]([a-zA-Z]+)")
    rename credit credit_"`j'"
    rename deposit deposit_"`j'"
    rename office office_"`j'"
    rename cflag2001 cflag2001_"`j'"
    save "C:/Research/data/inter/`f'", replace
}
Thanks.
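For reference, here is a minimal sketch of one way the loop might be rewritten. It has not been run against the actual files. Three things in the do-file above look likely to cause errors: `local` does not accept an `if` qualifier, the double quotes inside the new names in `rename` are not valid, and the `dir` extended function returns bare filenames, so `use` needs the directory prepended. The locals `indir` and `outdir` are names introduced here for illustration, and the shortened pattern assumes the third underscore-delimited segment contains letters only.

```stata
* Paths copied from the do-file above; indir and outdir are illustrative names
local indir  "C:/Research/RBI data/Statement 4A"
local outdir "C:/Research/data/inter"

local filelist : dir "`indir'" files "*.dta"
foreach f of local filelist {
    * Capture the third underscore-delimited segment of the filename
    local j ""
    if regexm("`f'", "^[a-zA-Z]+_[a-zA-Z]+_([a-zA-Z]+)_") {
        local j = regexs(1)
    }
    if "`j'" == "" {
        display as error "no suffix found in `f'; skipping"
        continue
    }

    * dir returns filenames without the path, so prepend the directory
    use "`indir'/`f'", clear
    destring credit* deposit*, replace ignore(",")
    reshape long office deposit credit, i(statedist) j(_quarter) string
    gen quarter = quarterly(_quarter, "YQ")
    format quarter %tq
    drop _quarter

    * The macro is expanded directly into the new name, with no quotes
    foreach v in credit deposit office cflag2001 {
        rename `v' `v'_`j'
    }
    save "`outdir'/`f'", replace
}
```

With prev_qrtr_metro_census2001.dta, the captured segment would be metro, so the variables would become credit_metro, deposit_metro, office_metro, and cflag2001_metro.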