Adding a Year Suffix to DHS data

Annalivia Robinson

Join Date: Nov 2017

Posts: 3
#1

23 Nov 2017, 18:17

Hello,

I would like to add _16 to the end of all my variables in my dataset so that when I merge with a DHS data set of a prior year I can differentiate which variables correspond to which year (as the names of the vars are the same across years). I believe I should use a varlist to do so but I'm unsure of the exact code. After reading different posts, I believe it should look something like this:

foreach x of varlist `myvars' {
local y_`x' = regexr("`x'")
rename `x' v_16`x'
}

I successfully created the varlist "myvars", which includes all the variables in my data set; I just need to add the suffix now.
Thanks!

Annalivia
Tags: foreach
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

23 Nov 2017, 19:20

Welcome to Statalist, Annalivia.

If you look at the output of the command help rename, it tells you

Also see [D] rename group for renaming groups of variables.

Clicking on rename group in that output - or typing the command help rename group - will describe how to accomplish what you need. Example 11 suggests that

Code:

rename (`myvars') (=_16)

will serve your purposes without a loop, adding "_16" to the name of each variable in the `myvars' variable list.

In the spirit of answering the question you actually asked, a looping solution would be something like the following.

Code:

foreach x of varlist `myvars' {
    rename `x' `x'_16
}
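For anyone who wants to try the loop on throwaway data first, here is a minimal sketch. The toy variables hv001 and hv002 and the use of unab to build `myvars' are assumptions for illustration; the original post does not show how the list was created.

```stata
* Toy data; hv001 and hv002 are stand-in variable names
clear
set obs 3
generate hv001 = _n
generate hv002 = 10 * _n

* One assumed way to build the list: expand _all into every variable name
unab myvars : _all

* Add the suffix to each variable in turn
foreach x of varlist `myvars' {
    rename `x' `x'_16
}

* The variables are now hv001_16 and hv002_16
describe
```

Because `myvars' is a local macro, the line that defines it and the loop that uses it need to be run together, in the same do-file or the same selection; otherwise the macro is empty by the time the loop runs.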
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

23 Nov 2017, 19:32

But let me take a step further back. Perhaps you are analyzing longitudinal data, where for each individual you have observations of the same measures over several years.

If this is the case, then you will almost certainly need to use the techniques described in the Stata Longitudinal-Data/Panel-Data Reference Manual PDF. The reference manual PDFs are included in the Stata installation (since version 11) and are accessible from within Stata - for example, through Stata's Help menu. If you have not done so, you should review them before continuing.

To do so, you will not want variables such as income_14, income_15, and income_16 holding the income measure for 2014, 2015, and 2016. Instead, you will want three observations for each individual, with a "year" variable identifying the year each observation corresponds to, and a single income variable holding the value for that year.

You will build your analysis dataset not with the merge command but rather with the append command. You will have data in what is called a "long" layout rather than a "wide" layout. The experienced users here generally agree that, with few exceptions, Stata makes it much more straightforward to accomplish complex analyses using a long layout of your data rather than a wide layout of the same data.
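As a rough sketch of what building a long layout with append might look like, the following uses two tiny made-up "waves" in place of real survey files. The variable names id, income, and year, the values, and the temporary file wave15 are all hypothetical.

```stata
* Hypothetical 2015 wave, saved to a temporary file
clear
input id income
1 100
2 150
end
generate year = 2015
tempfile wave15
save `wave15'

* Hypothetical 2016 wave, kept in memory
clear
input id income
1 120
2 180
end
generate year = 2016

* Stack the 2015 observations under the 2016 ones
append using `wave15'

* One row per id per year, with a single income variable
sort id year
list
```

The variable names stay the same in both waves, so no renaming is needed; the year variable is what tells the observations apart.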
Annalivia Robinson

Join Date: Nov 2017

Posts: 3
#4

24 Nov 2017, 12:16

Thank you for your tips, William! Unfortunately, the data is not longitudinal, as they do not interview the same families in each wave of the surveys. I believe I will have to merge it at the cluster, or village, level. If you have any tips on how to do this, I would greatly appreciate them.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#5

24 Nov 2017, 12:42

Then you certainly want to use append rather than merge, as I described above, and treat the data as pooled cross-sectional data. Going the route you are going with variable renaming is a mistake.
Annalivia Robinson

Join Date: Nov 2017

Posts: 3
#6

24 Nov 2017, 12:45

OK. Thanks for the advice!