Hello,
I keep getting the error "variables providerid year do not uniquely identify observations in the master data" on the final merge. I have tried adding more variables to the merge key, but I am not sure how else to make the master data uniquely identified.
Here is my code:
*Identify 2014 General Acute Hospitals (GAH)
import delimited "/Users/markflores/Desktop/POS files/HOSArchive_Revised_Flatfiles_20141023/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen genhospital=0
replace genhospital=1 if real(temp)>=0001 & real(temp)<=0879
keep if genhospital==1
rename providerid oldproviderid
keep oldproviderid address city state zipcode
isid address city state zipcode
save 2014noncahdata.dta, replace
*Identify 2015 Critical Access Hospitals (CAH)
import delimited "/Users/markflores/Desktop/POS files/hos_revised_flatfiles_archive_10_2015/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen cah=0
replace cah=1 if real(temp)>=1300 & real(temp)<=1399
keep if cah==1
save 2015cahdata.dta, replace
*Merge To Identify Possible Conversion
merge 1:1 address city state zipcode using 2014noncahdata.dta
keep providerid _merge
//drop if _merge==2
gen year=2015
save 2015cahmergedata.dta, replace
*Identify 2015 General Acute Hospitals (GAH)
import delimited "/Users/markflores/Desktop/POS files/hos_revised_flatfiles_archive_10_2015/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen genhospital=0
replace genhospital=1 if real(temp)>=0001 & real(temp)<=0879
keep if genhospital==1
rename providerid oldproviderid
keep oldproviderid address city state zipcode
isid address city state zipcode
save 2015noncahdata.dta, replace
*Identify 2016 Critical Access Hospitals (CAH)
import delimited "/Users/markflores/Desktop/hospital data/hospitals_archive_2016/hos_revised_flatfiles_archive_11_2016/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen cah=0
replace cah=1 if real(temp)>=1300 & real(temp)<=1399
keep if cah==1
save 2016cahdata.dta, replace
*Merge To Identify Possible Conversion
merge 1:1 address city state zipcode using 2015noncahdata.dta
keep providerid _merge
//drop if _merge==2
gen year=2016
save 2016cahmergedata.dta, replace
*Identify 2016 General Acute Hospitals (GAH)
import delimited "/Users/markflores/Desktop/hospital data/hospitals_archive_2016/hos_revised_flatfiles_archive_11_2016/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen genhospital=0
replace genhospital=1 if real(temp)>=0001 & real(temp)<=0879
keep if genhospital==1
rename providerid oldproviderid
keep oldproviderid address city state zipcode
isid address city state zipcode
save 2016noncahdata.dta, replace
*Identify 2017 Critical Access Hospitals (CAH)
import delimited "/Users/markflores/Desktop/hospital data/hospitals_archive_2017/hos_revised_flatfiles_archive_10_2017/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen cah=0
replace cah=1 if real(temp)>=1300 & real(temp)<=1399
keep if cah==1
save 2017cahdata.dta, replace
*Merge To Identify Possible Conversion
merge 1:1 address city state zipcode using 2016noncahdata.dta
keep providerid _merge
//drop if _merge==2
gen year=2017
save 2017cahmergedata.dta, replace
*Identify 2017 General Acute Hospitals (GAH)
import delimited "/Users/markflores/Desktop/hospital data/hospitals_archive_2017/hos_revised_flatfiles_archive_10_2017/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen genhospital=0
replace genhospital=1 if real(temp)>=0001 & real(temp)<=0879
keep if genhospital==1
rename providerid oldproviderid
keep oldproviderid address city state zipcode
isid address city state zipcode
save 2017noncahdata.dta, replace
*Identify 2018 Critical Access Hospitals (CAH)
import delimited "/Users/markflores/Desktop/hospital data/hospitals_archive_2018/hos_revised_flatfiles_archive_10_2018/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen cah=0
replace cah=1 if real(temp)>=1300 & real(temp)<=1399
keep if cah==1
save 2018cahdata.dta, replace
*Merge To Identify Possible Conversion
merge 1:1 address city state zipcode using 2017noncahdata.dta
keep providerid _merge
//drop if _merge==2
gen year=2018
save 2018cahmergedata.dta, replace
*Identify 2018 General Acute Hospitals (GAH)
import delimited "/Users/markflores/Desktop/hospital data/hospitals_archive_2018/hos_revised_flatfiles_archive_10_2018/Hospital General Information.csv", stringcols(1) clear
gen temp=substr(providerid,3,4)
gen genhospital=0
replace genhospital=1 if real(temp)>=0001 & real(temp)<=0879
keep if genhospital==1
rename providerid oldproviderid
keep oldproviderid address city state zipcode
isid address city state zipcode
save 2018noncahdata.dta, replace
*Identify 2019 Critical Access Hospitals (CAH)
import delimited "/Users/markflores/Desktop/hospital data/hospitals_archive_2019/hos_revised_flatfiles_archive_10_2019/Hospital General Information.csv", stringcols(1) clear
rename facilityid providerid
gen temp=substr(providerid,3,4)
gen cah=0
replace cah=1 if real(temp)>=1300 & real(temp)<=1399
keep if cah==1
save 2019cahdata.dta, replace
*Merge To Identify Possible Conversion
merge 1:1 address city state zipcode using 2018noncahdata.dta
keep providerid _merge
//drop if _merge==2
gen year=2019
save 2019cahmergedata.dta, replace
*Appending Data for Hospitals That Converted to CAH
use 2015cahmergedata.dta, clear
append using 2016cahmergedata.dta
append using 2017cahmergedata.dta
append using 2018cahmergedata.dta
append using 2019cahmergedata.dta
save cahconversiondata.dta, replace
*Merge to CAH Master List
use cahconversiondata.dta, clear
merge 1:1 providerid year using fullymergedcahdata.dta, gen (_new_merge)
variables providerid year do not uniquely identify observations in the master
data
r(459);
end of do-file
r(459);
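For diagnosis, here is a minimal sketch of how the repeated providerid-year pairs could be inspected. It uses made-up toy rows in place of cahconversiondata.dta; the provider IDs and the variable name dup are illustrative assumptions, not values from the real files. Because drop if _merge==2 is commented out in each merge step, rows that exist only in the non-CAH using file (which carries oldproviderid, not providerid) stay in the data with an empty providerid, so that looks like one possible source of the duplicates:

```stata
* Toy stand-in for cahconversiondata.dta (all values are invented for illustration)
clear
input str6 providerid year byte _merge
"011300" 2015 3
""       2015 2
""       2015 2
"021301" 2016 1
end

* Count rows with an empty providerid; using-only rows (_merge==2) have none
count if missing(providerid)

* Summarize and tag surplus copies of each providerid-year pair
duplicates report providerid year
duplicates tag providerid year, gen(dup)
list providerid year _merge if dup > 0

* isid errors out when the key is not unique; capture lets the do-file continue
capture noisily isid providerid year
```

If the listed duplicates all have an empty providerid and _merge==2, dropping those rows before the final merge may be enough; if real provider IDs repeat within a year, the cause is something else.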
Thanks!