add incremental values to existing variable

Jowel Choufani

Join Date: Jun 2017

Posts: 17
#1


31 Jan 2018, 09:01

Hello,

I have a household roster that contains household number, household member name, and member ID. This roster served as a baseline roster, and was used to collect end line data. All NEW members that were not in the baseline but were in the end line (new additions to the household either through marriage or birth) are now in the roster, but do not have a member ID. Is there a way to automatically generate a member ID for these new members that is n+1 from the highest value of baseline household member IDs?

Example:
Household Number   Member Name   Member ID
1                  Alex          1
1                  Jane          2
1                  Sarah         .
2                  Ali           1
2                  Omar          .

Is there a command that would replace the missing ID for Sarah with "3" and Omar with "2"? Sometimes I have 2 new members, so it would need to be n+1, then n+2.

Many thanks,
Jowel

Last edited by Jowel Choufani; 31 Jan 2018, 09:06.
Clyde Schechter

Join Date: Apr 2014

Posts: 30104
#2

31 Jan 2018, 09:40

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte householdnumber str5 membername byte memberid
1 "Alex"  1
1 "Jane"  2
1 "Sarah" .
2 "Ali"   1
2 "Omar"  .
end

by householdnumber (memberid), sort: ///
    replace memberid = memberid[_n-1]+1 if missing(memberid)

The logic here is that when the data are sorted on memberid (within householdnumber), the missing values sort last. Then it is just a matter of adding 1 to each subsequent memberid in the household group.
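To illustrate the "n+1, then n+2" case from the question, here is a small sketch that extends the example with a second new member in household 1. The row for "Tom" is made up for illustration and is not in the original data. Adding membername as a tie-breaker in the sort is also an assumption on my part: the order of observations with tied (missing) memberid is otherwise not guaranteed, so without it the new IDs could be assigned in a different order from run to run.

```stata
clear
input byte householdnumber str5 membername byte memberid
1 "Alex"  1
1 "Jane"  2
1 "Sarah" .
1 "Tom"   .
2 "Ali"   1
2 "Omar"  .
end

* Missing IDs sort last within each household. -replace- works through the
* observations in order, so each new ID builds on the one just filled in.
bysort householdnumber (memberid membername): ///
    replace memberid = memberid[_n-1] + 1 if missing(memberid)

list, sepby(householdnumber)
```

With this sort order, Sarah should get 3, Tom 4, and Omar 2.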

In the future, please use the -dataex- command to show your example data, as I have done in this response. The table you show has the following drawbacks:

1. It took you longer to create than it would have taken you to use -dataex-.
2. It isn't actually Stata data, because the column headers in the table are not legal Stata variable names, due to embedded blanks.
3. It leaves anyone responding to make assumptions about the data which, if wrong, will lead to incorrect solutions to your problem. For example, I'm assuming that memberid is actually a numeric variable, not a string that happens to look like numbers. If I have that wrong, the code above will fail abysmally. Writing code for imaginary data is always speculative. If you provide a real data example with -dataex-, then the code can be tested out on the kind of data you need it to run on, and you have a much better chance of getting the right answer the first time.
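If memberid does turn out to be a string variable, one possible workaround is sketched below. It assumes the strings hold only digits (or are empty for the new members), in which case -destring- can convert them before the increment is applied. The string version of the example data here is hypothetical.

```stata
clear
input byte householdnumber str5 membername str2 memberid
1 "Alex"  "1"
1 "Jane"  "2"
1 "Sarah" ""
2 "Ali"   "1"
2 "Omar"  ""
end

* Convert only if memberid is not already numeric.
capture confirm numeric variable memberid
if _rc {
    destring memberid, replace
}

* Same logic as before, now on a numeric variable.
bysort householdnumber (memberid): ///
    replace memberid = memberid[_n-1] + 1 if missing(memberid)
```

If the strings contain anything other than digits, -destring- will refuse to convert them, and the data would need cleaning first.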

If you are running Stata version 15.1, -dataex- is part of your official installation. If running an earlier version, run -ssc install dataex- to get the command. Either way, read -help dataex- for the simple instructions for using it. Going forward, whenever you want help with code, show a representative example of your Stata data. And whenever showing Stata data, use -dataex- to do it.
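As a rough sketch of what using -dataex- looks like, the lines below build the example roster and then ask -dataex- to print it in a form that can be pasted into a post. The variable names are the ones assumed earlier in this reply, and restricting the output to the first five observations is just for illustration.

```stata
clear
input byte householdnumber str5 membername byte memberid
1 "Alex"  1
1 "Jane"  2
1 "Sarah" .
2 "Ali"   1
2 "Omar"  .
end

* Print a copy-and-paste-ready listing of the chosen variables and observations.
dataex householdnumber membername memberid in 1/5
```

The output can be copied from the Results window straight into the forum editor.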
Nick Cox

Join Date: Mar 2014

Posts: 35700
#3

31 Jan 2018, 09:47

Documented within FAQ https://www.stata.com/support/faqs/d...issing-values/ Section 7

You don't give a legal Stata data example (please do read and act on FAQ Advice #12), but something like

Code:

bysort Household (MemberID) : replace MemberID = MemberID[_n-1] + 1 if missing(MemberID)

would work for your example.
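One edge case the example does not cover is a household in which every member is new, so that no baseline ID exists to count up from. In that situation MemberID[_n-1] is missing for the first observation and the IDs would stay missing. A possible variant is sketched below. Starting such households at 1 is an assumption, and the data, the variable Name, and household 3 are made up for illustration.

```stata
clear
input byte Household str5 Name byte MemberID
1 "Alex"  1
1 "Sarah" .
3 "Lina"  .
3 "Zed"   .
end

* If the first observation in a household has no ID, start at 1;
* otherwise add 1 to the previous ID as before.
bysort Household (MemberID Name): ///
    replace MemberID = cond(_n == 1, 1, MemberID[_n-1] + 1) if missing(MemberID)

* Check that household and member ID together identify each observation.
isid Household MemberID
```

The -isid- check at the end is a cheap way to confirm that no ID was duplicated or left missing.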
Jowel Choufani

Join Date: Jun 2017

Posts: 17
#4

31 Jan 2018, 11:30

Dear Clyde and Nick,

Thank you for your response. The code Clyde provided worked.

I will make sure to use the -dataex- command in the future.

Many thanks,
Jowel
Nick Cox

Join Date: Mar 2014

Posts: 35700
#5

31 Jan 2018, 11:35

Just to point out that it's the same answer from both of us!
Jowel Choufani

Join Date: Jun 2017

Posts: 17
#6

31 Jan 2018, 13:45

Apologies for missing that, Nick!! Thanks a lot.
Aanchal Aggarwal

Join Date: Oct 2016

Posts: 1
#7

21 Jan 2020, 05:20

Dear Nick Cox,
I have household roster data in a form like
respondent 1
name 1 status_in_HH var2data var3data
name 2 status_in_HH var2data var3data
.
.
.

respondent 2
name 1 status_in_HH var2data var3data
name 2 status_in_HH var2data var3data
.
.
.

and so on.
Now, for the roster variables, data isn't captured for the respondent.

The common identifier is an ID that is unique to each household.

I want data such as status in the household to be populated against each respondent. How can this be done?

I am really new to Stata and struggling here.
Nick Cox

Join Date: Mar 2014

Posts: 35700
#8

21 Jan 2020, 05:27

I can't see that #7 bears any relation to the thread title. That being so, please start a new thread with a better title. It would be a really good idea to read https://www.statalist.org/forums/help before you post as a schematic description of your data is less helpful than a concrete example.
