Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Convert monthly datapoints to yearly average

    Hello everyone,

    I have datasets that are currently in the wide format. Variables for the time are in monthly frequences. I know how to convert what I need to the long format, but I also need to convert the data from monthly entries to a yearly average value. I.e., this is what the data resembles in the wide format.
    Country Variable v1 v2 v3 v4 v5 v6 ... v403
    Country 1 Indicator 1 2 3 4 5 5 5 ... 6
    Country 2 Indicator 1 2 2 2 2 3 3 ... 4
    ... ... ... ... ... ... ... ... ... ...
    Country 146 Indicator 1 5 5 6 6 5 5 ... 6
    Variables v have labels, i.e. v1: Jan-84, v2: Feb-84, v3: Mar-84, ..., v403: Jul-17.

    I need, for each country, to calculate a running average of the values for each year and reduce the set to a yearly dataset, for example for 1984: sum(v1Country 1+...+v12Country 1)/12

    Eventually, I want to transform it to the long format and merge it with other data that is already in an annual format to perform a regression. Any idea on how to do this?

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str12(country variable) byte(v1 v2 v3 v4 v5 v6 v403)
    "Country 1 "   "Indicator 1 " 2 3 4 5 5 5 6
    "Country 2 "   "Indicator 1 " 2 2 2 2 3 3 4
    "Country 146 " "Indicator 1 " 5 5 6 6 5 5 6
    end
    label var v1 "Jan-84"
    label var v2 "Feb-84"
    label var v3 "Mar-84"
    label var v4 "Apr-84"
    label var v5 "May-84"
    label var v6 "Jun-84"
    label var v403 "Jul-17"
    
    reshape long v, i(country variable) j(mdate)
    replace mdate = mdate + tm(1983m12)
    format mdate %tm
    gen year = yofd(dofm(mdate))
    collapse (mean) v, by(country variable year)
    This code assumes that the variables v1-v403 are indeed consecutively numbered without gaps, and that they correspond, in order, to the months Jan 1984 through July 2017.

    If that is not the case, please post back with new example data that illustrates the actual variable names and how they correspond to months.

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment

    Working...
    X