Hello Statalisters,
I am working with several hundred excel sheets that I want to import into Stata. They have a modicum of uniformity that allows me to import them but at the final step of reshaping them from wide to long I am stumbling. I have given you a sample of the data below.
I want to reshape the data where each district has an observation of the variables starting with total. However, each state for which I have a single excel file have differing numbers of district. Which could be titled from G to CG. How do I go about reshaping this to reflect variable name district with all the values given under the name G to CG instead of existing varnames in long?
It should also be noted that I initially import these from excel and keep only a few variables that I need using the following code :
Hereunder are two excel sheets from different states I have imported. I wanted the "observations" I imported above to be reshaped to be variables names. And existing "variables" ranging from G to CG depending on the states to be under the variable district. Hope this is not too confusing and within Statalist guidelines.
Lori
I am working with several hundred excel sheets that I want to import into Stata. They have a modicum of uniformity that allows me to import them but at the final step of reshaping them from wide to long I am stumbling. I have given you a sample of the data below.
I want to reshape the data where each district has an observation of the variables starting with total. However, each state for which I have a single excel file have differing numbers of district. Which could be titled from G to CG. How do I go about reshaping this to reflect variable name district with all the values given under the name G to CG instead of existing varnames in long?
It should also be noted that I initially import these from excel and keep only a few variables that I need using the following code :
Code:
bysort B (C): keep if C[1] == "district" | C[1] == "totalpublic" | C[1] == "totalpublic_inst" | C[1] == "totalpublic_instcol" | C[1] == "totalpvt" | C[1] == "totalpvt_inst" | C[1] == "totalpvt_instcol"
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str196 C str7 F str16 G str13 H "district" "Nicobar" "North and Middle" "South Andaman" "totalpublic" "" "" "4629" "totalpublic_inst" "" "" "" "totalpublic_instcol" "" "" "" "totalpvt" "" "" "" "totalpvt_inst" "" "" "" "totalpvt_instcol" "" "" "" end
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str196 C str8 E str7(F G) str10 H str6 I str9(J K) str7 L str6 M str9 N str14 O str7 P str9 Q str7 R "district" "_Bihar" "Araria" "Arwal" "Aurangabad" "Banka" "Begusarai" "Bhagalpur" "Bhojpur" "Buxar" "Darbhanga" "East Champaran" "Gaya" "Gopalganj" "Jamui" "totalpublic" "535688" "20118" "4082" "27539" "25147" "3061" "34146" "30658" "" "11016" "31357" "22317" "10651" "12187" "totalpublic_inst" "427820" "21058" "5752" "9724" "26358" "104" "4030" "9707" "11042" "" "48925" "" "" "16231" "totalpublic_instcol" "338565" "12567" "4582" "3603" "25422" "98" "2914" "5610" "7143" "" "47935" "" "" "16231" "totalpvt" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "totalpvt_inst" "14853" "" "" "4017" "" "" "" "" "381" "" "" "" "" "2969" "totalpvt_instcol" "12703" "" "" "3955" "" "" "" "" "310" "" "" "" "" "2969" end
Comment