Hello all!
I'm fairly new to Stata. I have farming data organized by crop: 12 crops, with 10 different pieces of information for each crop. The crop names are stored in variables crop_name_#, like this:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str11 crop_name_1 str17(crop_name_2 crop_name_3 crop_name_4 crop_name_5)
"Cow Peas" "Pigeon Peas" "Green Gram" "Maize" "Nyota Nyayo Beans"
"Cow Peas" "Green Gram" "Millet" "" ""
"Green Gram" "Maize" "Sorghum" "" ""
"Green Gram" "" "" "" ""
"Green Gram" "Maize" "" "" ""
"Cow Peas" "Green Gram" "" "" ""
"Green Gram" "Maize" "" "" ""
"Cow Peas" "Green Gram" "" "" ""
"Cow Peas" "Green Gram" "Millet" "" ""
end
Now, for each crop, I have corresponding information about various things (type of seed, amount harvested, etc.). The crop-information variables are of different types. Here is what the data for crop_name_1 look like:
"Less than half of the land used of cultivation" "All year-around" "Respondent does not know the size of harvest in bags or kgs" . . . . . "No" . .
"Less than half of the land used of cultivation" "Two seasons in the year" "Kilos" . 40 . . . "Yes" 30 .
"Half of the land used for cultivation" "All year-around" "Kilos" . 50 . . . "No" . .
"All land used for cultivation" "Two seasons in the year" "Kilos" . 0 . . . "" . .
"More than half of the land used for cultivation" "Two seasons in the year" "Kilos" . 10 . . . "No" . .
"Half of the land used for cultivation" "Two seasons in the year" "Kilos" . 200 . . . "No" . .
"Less than half of the land used of cultivation" "Two seasons in the year" "Kilos" . 28 . . . "Yes" 14 .
"Half of the land used for cultivation" "Two seasons in the year" "Kilos" . 20 . . . "No" . .
"Half of the land used for cultivation" "One season in the year" "Bags" 90 . 3 . . "Yes" 2.5 .
"Less than half of the land used of cultivation" "Two seasons in the year" "Kilos" . 70 . . . "Yes" 70 .
"Less than half of the land used of cultivation" "One season in the year" "Respondent does not know the size of harvest in bags or kgs" . . . . . "No" . .
"Half of the land used for cultivation" "Two seasons in the year" "Kilos"
I also have some multiple-choice questions, stored as 1/0 indicators, for each crop:
input byte(b14_h_1_1 b14_h_2_1 b14_h_3_1 b14_h_4_1 b14_h_5_1 b14_h__97_1)
. . . . . .
0 0 1 0 0 0
. . . . . .
. . . . . .
. . . . . .
. . . . . .
0 0 1 0 0 0
. . . . . .
0 0 1 0 0 0
0 0 1 0 0 0
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
1 0 0 0 0 0
0 0 0 0 0 0
. . . . . .
. . . . . .
. . . . . .
0 0 1 0 0 0
0 0 1 0 0 0
end
As you can imagine, this is a nightmare for analysis. I have figured out a way around it: for each piece of information, I want to generate one variable for each of the 12 crops, based on the value in "crop_name_#". I could use the following code:
gen b14_a_cowpeas = b14_a_1 if crop_name_1 == "Cow Peas"
replace b14_a_cowpeas = b14_a_2 if crop_name_2 == "Cow Peas"
However, I cannot figure out how to put this in a loop over each crop and each variable.
How would I go about writing this loop?
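For reference, one shape such a loop might take is sketched below. It is untested and rests on assumptions the post does not confirm: that the per-slot variables follow the pattern b14_<stem>_1 through b14_<stem>_5 to match crop_name_1 through crop_name_5, that each crop appears at most once per row, and that the list of stems (the local `stems`, a name made up for illustration) is filled in by hand. Note that it names the new variables with strtoname(), so "Cow Peas" becomes b14_a_cow_peas rather than b14_a_cowpeas.

* Untested sketch; see the assumptions listed above.
* "stems" is a hypothetical list of question stems; only "a" appears in the post.
local stems a
foreach q of local stems {
    * Only three of the 12 crops are listed here; add the rest.
    foreach crop in "Cow Peas" "Green Gram" "Maize" {
        * Turn the crop name into a legal variable-name suffix, e.g. cow_peas
        local suffix = strtoname(lower("`crop'"))
        * Start from slot 1, as in the hand-written gen line
        gen b14_`q'_`suffix' = b14_`q'_1 if crop_name_1 == "`crop'"
        * Fill in from the remaining slots (assumes 5 slots, as in the dataex example)
        forvalues j = 2/5 {
            replace b14_`q'_`suffix' = b14_`q'_`j' if crop_name_`j' == "`crop'"
        }
    }
}

If the naming pattern holds, a stem such as h_1 would pick up the multiple-choice indicators (b14_h_1_1 and so on) in the same way.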