Hello,
I want to rearrange the values of 314 string variables (each containing 107 to 166 country names) so that each country occupies the same observation (row) in every variable in which it appears. Each string variable represents a calendar date and lists the countries present on that date.
After the rearrangement, each country should occupy exactly one row, with its name shown in the variables for the dates on which it is present and an empty string in the variables for the dates on which it is absent. I am using Stata SE 14.2.
The data set lists the countries present on each calendar date. On the first date, 107 countries are present in the full data (9 countries in the sample data below). On each later date, some additional countries are also present, while some of the countries present on earlier dates might be absent. Each string variable therefore represents one date, on which a given country is either present or absent. Within each date, country names are sorted alphabetically. Variable names include the date; for example, the variable countries20200417 lists the countries present on 17 April 2020.
The objective is to generate (a) a variable that lists every country present on any date (a list of all countries), (b) a variable that shows the earliest date on which each country was present, and (c) a set of variables that show the dates on which each country was absent after its first presence. In the second data set below, these variables are named (a) countries_all, (b) date_present_first, and (c) date_absent_1 and date_absent_2.
The question is how to get from the original data set to the wanted data set described above. I could not find a command that rearranges the values of string variables in this way.
A sample of the original data is as follows.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str22(countries20200417 countries20200424) str24 countries20220915
"Afghanistan"   ""                     "Afghanistan"
"Albania"       ""                     ""
"Canada"        "Canada"               "Bosnia and Herzegovina"
"Chile"         "Chile"                "Botswana"
"Colombia"      "Colombia"             "Brazil"
"Costa Rica"    "Congo (Brazzaville)"  "Bulgaria"
"Cote d'Ivoire" "Congo (Kinshasa)"     "Burkina Faso"
"Croatia"       "Costa Rica"           "Burundi"
"Cuba"          "Cote d'Ivoire"        "Cabo Verde"
""              "United Arab Emirates" "Nigeria"
""              "United Kingdom"       "North Macedonia"
""              "Uruguay"              "Norway"
""              "Uzbekistan"           "Oman"
""              "Venezuela"            "Pakistan"
""              "Vietnam"              ""
""              ""                     "Papua New Guinea"
""              ""                     "Paraguay"
""              ""                     "US"
""              ""                     "Uganda"
""              ""                     "Ukraine"
""              ""                     "United Arab Emirates"
""              ""                     "United Kingdom"
""              ""                     "Uruguay"
""              ""                     "Uzbekistan"
""              ""                     "Venezuela"
""              ""                     "Vietnam"
""              ""                     "Yemen"
""              ""                     "Zambia"
""              ""                     "Zimbabwe"
end
The wanted result for the sample data is as follows.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str13 countries20200417 str20 countries20200424 str22(countries20220915 countries_all) long(date_present_first date_absent_1 date_absent_2)
"Afghanistan"   ""                     "Afghanistan"            "Afghanistan"            20200417 20200424 .
"Albania"       ""                     ""                       "Albania"                20200417 20200424 20220915
""              ""                     "Bosnia and Herzegovina" "Bosnia and Herzegovina" 20220915 . .
""              ""                     "Botswana"               "Botswana"               20220915 . .
""              ""                     "Brazil"                 "Brazil"                 20220915 . .
""              ""                     "Bulgaria"               "Bulgaria"               20220915 . .
""              ""                     "Burkina Faso"           "Burkina Faso"           20220915 . .
""              ""                     "Burundi"                "Burundi"                20220915 . .
""              ""                     "Cabo Verde"             "Cabo Verde"             20220915 . .
"Canada"        "Canada"               ""                       "Canada"                 20200417 20220915 .
"Chile"         "Chile"                ""                       "Chile"                  20200417 20220915 .
"Colombia"      "Colombia"             ""                       "Colombia"               20200417 20220915 .
""              "Congo (Brazzaville)"  ""                       "Congo (Brazzaville)"    20200424 20220915 .
""              "Congo (Kinshasa)"     ""                       "Congo (Kinshasa)"       20200424 20220915 .
"Costa Rica"    "Costa Rica"           ""                       "Costa Rica"             20200417 20220915 .
"Cote d'Ivoire" "Cote d'Ivoire"        ""                       "Cote d'Ivoire"          20200417 20220915 .
"Croatia"       ""                     ""                       "Croatia"                20200417 20200424 20220915
"Cuba"          ""                     ""                       "Cuba"                   20200417 20200424 20220915
""              ""                     "Nigeria"                "Nigeria"                20220915 . .
""              ""                     "North Macedonia"        "North Macedonia"        20220915 . .
""              ""                     "Norway"                 "Norway"                 20220915 . .
""              ""                     "Oman"                   "Oman"                   20220915 . .
""              ""                     "Pakistan"               "Pakistan"               20220915 . .
""              ""                     "Papua New Guinea"       "Papua New Guinea"       20220915 . .
""              ""                     "Paraguay"               "Paraguay"               20220915 . .
""              ""                     "Uganda"                 "Uganda"                 20220915 . .
""              ""                     "Ukraine"                "Ukraine"                20220915 . .
""              "United Arab Emirates" "United Arab Emirates"   "United Arab Emirates"   20200424 . .
""              "United Kingdom"       "United Kingdom"         "United Kingdom"         20200424 . .
""              "Uruguay"              "Uruguay"                "Uruguay"                20200424 . .
""              ""                     "US"                     "US"                     20220915 . .
""              "Uzbekistan"           "Uzbekistan"             "Uzbekistan"             20200424 . .
""              "Venezuela"            "Venezuela"              "Venezuela"              20200424 . .
""              "Vietnam"              "Vietnam"                "Vietnam"                20200424 . .
""              ""                     "Yemen"                  "Yemen"                  20220915 . .
""              ""                     "Zambia"                 "Zambia"                 20220915 . .
""              ""                     "Zimbabwe"               "Zimbabwe"               20220915 . .
end
Thank you,
Farshad
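
One possible route from the original layout to the wanted one is sketched below. It is a starting point under stated assumptions, not a tested solution for the full 314-variable data set. It assumes that every variable whose name starts with countries ends in an 8-digit date, that each country name is spelled identically on every date, and that at least one country is absent at some point after its first presence (otherwise the second reshape has nothing to work on). The names row, date, absent, k, name, sortkey and absences are illustrative and do not come from the original data. The idea is to stack the date columns with reshape long, make absences explicit with fillin, and then go back to one row per country with reshape wide.

```stata
* Sketch only: assumes the original data (countries20200417, ...) are in memory.
* row, date, absent, k, name, sortkey and absences are illustrative names.

* 1. Stack the date columns: one observation per country-date pair
gen long row = _n
reshape long countries, i(row) j(date)
replace countries = trim(countries)
drop if countries == ""
drop row
rename countries countries_all
duplicates drop                  // guards against a name listed twice on one date

* 2. Add an explicit observation for every date on which a country is absent
fillin countries_all date        // _fillin == 1 marks an absence

* 3. Earliest date on which each country is present
bysort countries_all: egen long date_present_first = min(cond(_fillin == 0, date, .))

* 4. Number the absences that follow the first presence
gen byte absent = _fillin == 1 & date > date_present_first
bysort countries_all (date): gen long k = sum(absent) if absent

* 5. Absence dates in wide form, saved for merging
preserve
keep if absent
keep countries_all date k
rename date date_absent_
reshape wide date_absent_, i(countries_all) j(k)
tempfile absences
save `absences'
restore

* 6. Presence columns in wide form: the name if present, "" if absent
keep if _fillin == 0
gen name = countries_all
keep countries_all date name date_present_first
reshape wide name, i(countries_all) j(date)
rename name* countries*

* 7. Combine and arrange as in the wanted data
merge 1:1 countries_all using `absences', nogenerate
order countries_all date_present_first date_absent_*, last
gen sortkey = lower(countries_all)   // case-insensitive order, so US follows Uruguay
sort sortkey
drop sortkey
```

After step 1 it is worth running describe to confirm that date is stored as long, because a float cannot hold 8-digit dates such as 20200417 exactly. The final sort on a lowercased key is only there to reproduce the row order of the wanted sample; a plain sort countries_all would place US before Uganda.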
