Hello Statalist users,
I am using Stata 13.1 and have two questions regarding using loop over for other commands.
Background: The downloaded data was an Excel spreadsheet with each US State on a separate sheet. I was able to remove the first two rows which had unnecessary information by starting in a specific cell. I then used the firstrow command to import the variable names. However, the first column took the variable name based on which State the sheet was regarding since the first cell had the state name. Below is the current working code.
Problem 1) I need to drop missing observations as there are blank rows.I originally used the code below to drop variables, but this would only work on one data file as the other variables are not named "Alabama". Is there another way to drop the missing observations while using the loop over code since the first variable name is different for every data file?
The other current columns or variables are years and some states have different years available. For instance, Alabama has 1924-2014 while Alaska has 1960-2014. I am hesitant to use a drop missing observation with the year variables as sometimes years are randomly not available.
Problem 2) I want to generate an additional variable "STATE" in order to transpose or reshape the data files so they will append appropriately. However, the only place each file has the name of the state is the first variable name. Is there a way to use the variable name as a value in a newly generated variable so I can put it into the loop over command as well?
My only progress with this problem was using the information from http://www.statalist.org/forums/foru...abels-in-xpose where I tried the following code on one data file:
Unfortunately, that did not even work with one single file as the "invalid 'labeler'" error appeared.
Thank you in advance for any information and advice.
Amie
I am using Stata 13.1 and have two questions regarding using loop over for other commands.
Background: The downloaded data was an Excel spreadsheet with each US State on a separate sheet. I was able to remove the first two rows which had unnecessary information by starting in a specific cell. I then used the firstrow command to import the variable names. However, the first column took the variable name based on which State the sheet was regarding since the first cell had the state name. Below is the current working code.
Code:
xls2dta, save ("C:\filelocation") sheets(2/52): import excel "C:\excelfile", cellrange(A3) firstrow clear *******Loop over all Files****************** forvalues i=2/52 { use "C:\filename_`i'.dta" drop in 1/2 **Dates to become labels** foreach var of varlist _all { local label : variable label `var' local new_name = upper(strtoname("`label'")) rename `var' `new_name' } rename (_*) (y*) save, replace }
The other current columns or variables are years and some states have different years available. For instance, Alabama has 1924-2014 while Alaska has 1960-2014. I am hesitant to use a drop missing observation with the year variables as sometimes years are randomly not available.
Code:
drop if missing( ALABAMA)
Problem 2) I want to generate an additional variable "STATE" in order to transpose or reshape the data files so they will append appropriately. However, the only place each file has the name of the state is the first variable name. Is there a way to use the variable name as a value in a newly generated variable so I can put it into the loop over command as well?
My only progress with this problem was using the information from http://www.statalist.org/forums/foru...abels-in-xpose where I tried the following code on one data file:
Code:
encode ALABAMA, gen(_CATEGORY) tempfile labeler label save _CATEGORY using 'labeler'
Thank you in advance for any information and advice.
Amie
Comment