
  • How to generate the log GDP per capita difference (growth) in a panel dataset?

    Hi Statalist Community,

    I am working with a panel dataset where I am trying to create gdp_growth from the log of real gdp per capita.

    I would like to generate the 5-year difference (growth) for country A (and the other countries) between log_gdp_1994 and log_gdp_1990, then the difference between log_gdp_1999 and log_gdp_1995 (so the windows do not overlap), and so on. How can this be done in Stata?

    Is there perhaps a command that generates these 5-year differences for the whole dataset?
    My dataset starts from year 1990 to 2018 for 60 different countries.

    Furthermore, let's assume that log_gdp_pc for country A is missing in 1990. How can I take the average of the non-missing observations from the other years (1991-1993) and subtract it from log_gdp_1994? In other words, how can I impose the condition that, if the first observation of a 5-year window (1990, 1995, 2000, etc.) is missing, the command takes the average of the non-missing values in the other years, excluding the last year (1994 in the example above), and subtracts it from the last year's value to get the growth?

    This is what the data look like:
    Country year log_gdp_pc
    A 1990 1111
    A 1991 1212
    A 1992 1222
    A 1993 1221
    A 1994 2211
    A 1995 1212
    Thank you for your help!
    Kris

  • #2
    The following will, more or less, do what you want
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str2 country int(year log_gdp_pc)
    "A " 1990 1111
    "A " 1991 1212
    "A " 1992 1222
    "A " 1993 1221
    "A " 1994 2211
    "A " 1995 1212
    end
    
    
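    * Assign each year to its non-overlapping 5-year window (1990, 1995, ...)
    gen epoch = 5*floor(year/5)
    * Mean of log_gdp_pc over the middle years of each window, i.e. excluding
    * the first year (mod 0) and the last year (mod 4); missing values are ignored
    by country epoch (year), sort: egen mean_non_miss = ///
        mean(cond(!inlist(mod(year, 5), 0, 4), log_gdp_pc, .))
    * Starting value: the first observation in the window, or the mean above if that is missing
    by country epoch (year): gen start_log_gdp_pc = cond(missing(log_gdp_pc[1]), ///
        mean_non_miss, log_gdp_pc[1])
    * 5-year difference: last observation in the window minus the starting value
    by country epoch (year), sort: gen diff_5_yrs = log_gdp_pc[_N] - start_log_gdp_pc
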
    I say more or less because your question is not fully fleshed out. You don't say what you want to do when the final year (e.g. 1994) has a missing value of log_gdp_pc. You also don't say how you want to handle the final group of years, 2015-2018, in which the "last" year is a year short of where it "should" be. The code above handles it in a somewhat nonsensical way.
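
    One possible way to make both cases explicit is sketched below. It assumes you would rather have a missing diff_5_yrs than a value computed from a window that does not end in its proper last year. Note that window_ok is a hypothetical helper variable, not part of the code above, and the sketch needs epoch and diff_5_yrs to exist already.
    Code:
    * Sketch only: run after the code that creates epoch and diff_5_yrs.
    * Assumption: a window is usable only if its last observation is the
    * year epoch + 4 (false for 2015-2018, and for the lone 1995 row in the
    * example data) and that observation has a non-missing log_gdp_pc.
    by country epoch (year), sort: gen byte window_ok = ///
        (year[_N] == epoch + 4) & !missing(log_gdp_pc[_N])
    * Blank out differences from truncated or unusable windows
    replace diff_5_yrs = . if !window_ok
    A missing final-year value already yields a missing difference through the subtraction, so the flag mainly catches truncated windows. Whether missing is the right outcome for 2015-2018 is a judgment call that depends on your analysis.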

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
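
    As a minimal illustration, assuming the variable names from the question and that the first 12 observations make a reasonable example:
    Code:
    * Hypothetical call: the variable list and the range 1/12 are illustrative choices
    dataex country year log_gdp_pc in 1/12
    The output can then be copied from the Results window and pasted into a post.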

    You could also improve your example data by including more than one country and more years.
