Assigning classifications/status of a country for all years in panel data

Muhammad Ibrahim Shah

Join Date: Nov 2020

Posts: 49
#1


14 Jun 2022, 13:24

I have panel data on more than 100 countries over 30 years. I want to assign these countries to the low, lower-middle, upper-middle and high-income groups. When I download the World Bank classification from the World Bank website, all I get is the classified group for each country for a single year, but I want to assign the income status of each country for all the years. For example, if a country is classified as lower-middle-income for the year 2020, I want to assign this lower-middle-income status to that country for all 30 years (i.e. from 1990-2020, that country will be assigned lower-middle-income status). I want to do the same for all the countries in my dataset. It would take a huge amount of time in Excel. Is there any command in Stata that would allow me to do that?

NB: If anyone is wondering whether it would be appropriate to assign lower-middle-income status to a country which was low-income in 1995, please ignore it. Instead of income classifications, it could be regional classifications which I wanted to assign.
Clyde Schechter

Join Date: Apr 2014

Posts: 30104
#2

14 Jun 2022, 14:39

If I understand your question, you have a data set with a country variable and a year variable, and each country (or at least some of them) has multiple observations corresponding to different years. Then you have another data set that contains a country variable and also a region variable, which is static over time, so there is no year variable, and each country appears only once.

Code:

use first_data_set, clear
merge m:1 country using region_data_set

will combine them in the way you want: the same region value will be applied to the country for every year it appears in the first data set.
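To see the effect of the m:1 merge on a small case, here is a minimal sketch using made-up toy data. The country names, region labels, and the tempfile name `regions` are illustrative assumptions, not taken from the original poster's files.

```stata
* Toy lookup table: one row per country (hypothetical values)
clear
input str20 country str20 region
"Bangladesh" "South Asia"
"Brazil"     "Latin America"
end
tempfile regions
save `regions'

* Toy panel: several years per country
clear
input str20 country int year
"Bangladesh" 1990
"Bangladesh" 1991
"Brazil"     1990
"Brazil"     1991
end

* m:1 = many panel rows per country, one lookup row per country
merge m:1 country using `regions'
sort country year
list country year region _merge, sepby(country)
```

Every country-year row should end up carrying its country's single region value. Run this as a do-file (or in one go), since the temporary file only exists for the duration of the run.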

The major problem you may encounter with this approach is if the names of the countries are not consistent across the data sets. For example, if one data set has "United States" and the other has "United States of America," or one has "Great Britain" and the other has "United Kingdom," Stata will not match those. Even differences in capitalization or punctuation/abbreviation can mess this up. It is much better, if you can find appropriate data sets with the right information, to use ISO codes or other standardized designations for the individual countries. If no such variables can be found in data sets with the other variables you need, Rafal Raciborski's -kountry- package, available from SSC, can help you deal with the problem of irregular country names.
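One way to spot such name mismatches is to inspect the _merge variable that -merge- creates. The sketch below assumes the same placeholder file names used above and a string variable called country in both files; adapt the names to your own data.

```stata
use first_data_set, clear

* Remove stray leading/trailing blanks, a common cause of failed matches
* (assumes country is a string variable; apply the same cleanup to the
* lookup file before merging)
replace country = strtrim(country)

merge m:1 country using region_data_set

* 1 = only in the panel, 2 = only in the region file, 3 = matched
tabulate _merge

* Names in the panel that found no region
levelsof country if _merge == 1

* Names in the region file that never appear in the panel
levelsof country if _merge == 2
```

Comparing the two lists side by side usually shows which spellings need to be harmonized before rerunning the merge.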

If I have misunderstood your situation, please post back showing example data from both data sets. Please use the -dataex- command to do that. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.
Jared Greathouse

Join Date: Sep 2021

Posts: 2172
#3

14 Jun 2022, 17:27

If this were my problem, I would likely scrape these files into a working directory with Python, import the Excel files and save them as temporary datasets, then append them on top of each other. I would then check the spelling of the country names and then (if I've understood you correctly) merge them as Clyde suggests.
Muhammad Ibrahim Shah

Join Date: Nov 2020

Posts: 49
#4

14 Jun 2022, 23:18

Thank you Clyde Schechter, I will try that and let you know. Jared Greathouse, I am not familiar with Python, unfortunately, but thank you for the suggestion.
Felix Kaysers

Join Date: Oct 2022

Posts: 65
#5

20 Jul 2023, 13:58

The package wbopendata lets you download World Bank indicators and comes with metadata that includes the income level. I am not sure whether that variable changes over time, but I assume it should be possible to take a country's value in one specific year and use it to replace the values in all other years for that country.
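A rough sketch of that idea follows. The indicator code SP.POP.TOTL (total population) is only an arbitrary example, and the names of the metadata variables returned may differ across wbopendata versions, so the sketch searches for them rather than assuming them.

```stata
* One-time install from SSC
ssc install wbopendata, replace

* Download any indicator in long (country-year) form; requires internet access
wbopendata, indicator(SP.POP.TOTL) long clear

* Find the metadata variables rather than assuming their names
lookfor income region
describe
```

Once the relevant variables are identified, keeping the country identifier plus the classification, reducing the data to one row per country, and saving the result gives a lookup file suitable for the m:1 merge in #2.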

Cheers,
Felix
Stata Version: MP 18.0
OS: Windows 11
