Hello Statalisters!
I have a large cross-country dataset with many variables. Some are available for every country; others are only available for a handful of countries. To keep the dataset readable and suited to my study, I'd like to keep only the variables that are available for every country, or, say, for at least 80% of the countries in my sample. Is there a quick way to do this? An example of my data is included below.
For instance, in this example I would like to keep only var1: even though it is mostly 0s, it still has a value for every country in the sample. By contrast, var2 only has data for a few countries that are not in this example, and var3 only has data for Brazil. I'd like Stata to drop var2 and var3 and, more generally, every variable that is missing for most countries.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str32 country_name float var1 double(var2 var3)
"Afghanistan"            0         . .
"Albania"                0         . .
"Algeria"                0         . .
"Angola"                 0         . .
"Argentina"              0         . .
"Australia"              1.2258065 . .
"Austria"                0         . .
"Azerbaijan"             0         . .
"Bahrain"                0         . .
"Bangladesh"             0         . .
"Barbados"               0         . .
"Belarus"                0         . .
"Belgium"                0         . .
"Benin"                  0         . .
"Bhutan"                 0         . .
"Bolivia"                0         . .
"Bosnia and Herzegovina" 0         . .
"Botswana"               0         . .
"Brazil"                 0         . .678
"Bulgaria"               0         . .
"Burkina Faso"           0         . .
"Burma/Myanmar"          0         . .
"Burundi"                0         . .
"Cambodia"               .1612903  . .
"Cameroon"               0         . .
end
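To make the rule described above concrete (keep a variable only if it is non-missing for at least 80% of countries), here is a minimal sketch rather than a definitive solution. It assumes the data have exactly one observation per country, as in the example; the local macro name `threshold` and the 0.80 cutoff are illustrative choices.

```stata
* Sketch: drop every variable that is non-missing in fewer than 80% of rows.
* Assumes one observation per country; `threshold' is an illustrative name.
local threshold = 0.80

foreach v of varlist _all {
    * count observations where this variable has a value
    * (missing() handles both numeric and string variables)
    quietly count if !missing(`v')

    * r(N) is the non-missing count, _N the total number of observations
    if r(N) / _N < `threshold' {
        drop `v'
    }
}
```

Run on the example data, this would keep country_name and var1 (non-missing for all 25 countries) and drop var2 (never observed) and var3 (observed only for Brazil). If the real dataset has several rows per country, for instance country-years, the share would need to be computed over distinct countries rather than over rows.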
Thank you for the help!
Regards,
Adam
