Counting number of times an ID value appears in a list of variables

Daniel Jang

Join Date: Sep 2024

Posts: 5
#1

Counting number of times an ID value appears in a list of variables

18 Sep 2024, 17:09

Hello,

I am trying to count the number of times a given respondent is mentioned by other respondents as a friend, using ego-centric data. I should also mention that, due to specific circumstances, I am unable to download certain Stata packages.

Currently, my data is formatted as follows (on a much larger scale):
ID Friend 1 Friend 2 Friend 3
1 2 3 4
2 3 5 1
3 4 2 1
4 1 4 2
5 1 3 2

My goal is to count the total number of times each ID number is listed across all of the friend variables and to store that count in the row for the corresponding ID:

ID Friend 1 Friend 2 Friend 3 Count
1 2 3 4 4
2 3 5 1 4
3 4 2 1 3
4 1 4 2 3
5 1 3 2 1

Any help would be appreciated on how to code this.

Thank you for your time!
Daniel
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#2

18 Sep 2024, 19:10

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id friend1 friend2 friend3)
1 2 3 4
2 3 5 1
3 4 2 1
4 1 4 2
5 1 3 2
end

isid id, sort
levelsof id, local(ids)

reshape long friend, i(id)

// ELIMINATE SELF-REFERENCES AS FRIENDS
drop if friend == id

// ELIMINATE ANY DUPLICATE REFERENCES AS FRIENDS
// (id AND friend MUST BE NAMED: _j DIFFERS ACROSS REPEATED MENTIONS,
// SO A BARE -duplicates drop- WOULD FIND NOTHING TO REMOVE)
duplicates drop id friend, force

// COUNT UP FRIEND REFERENCES
gen friend_count = .
foreach i of local ids {
    count if friend == `i'
    replace friend_count = r(N) if id == `i'
}

reshape wide

Notes: id 4 has listed him/herself as a friend. I assume this is not really permissible, so in the code I have excluded self-references. And, although there are no instances of this in the example data shown, I also assume it is not permissible for an id to list the same person more than once as a friend. So the code will eliminate such redundancies if it encounters any. Finally, as per your restrictions on installing packages, this code uses only official Stata commands.

In the future, when showing data examples, please use the -dataex- command to do so, as I have done here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
Comment
Daniel Jang

Join Date: Sep 2024

Posts: 5
#3

20 Sep 2024, 10:54

Hi Clyde,

I appreciate your help on this. I ran the code you suggested and seem to be running into a problem. The code runs without an error message, but the friend_count variable contains only missing values, even after running the foreach loop. Would you happen to know why this might be the case?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#4

20 Sep 2024, 14:45

I think you are running the code incorrectly. I tested this code before posting it, and it definitely doesn't leave the variable friend_count empty.

Probably what's happening is that you are trying to run the code line-by-line or in small sections. Because the code uses local macros, you can't do that. When you interrupt the code execution, any local macros that were defined in what you ran disappear, and they are then unavailable in the next line or section you try. So, for example, if the code is interrupted somewhere between the -levelsof- command and the -foreach- loop, local macro ids will not exist when the -foreach- is run. Non-existent local macros are interpreted as empty strings. So -foreach- will think that it is being asked to repeat the code over no values of id at all, and will, therefore, do nothing. That would leave all the values of friend_count as missing.

Run the entire thing in one fell swoop, without interruption from beginning to end. I'm pretty sure that will solve your problem.
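
To illustrate the behavior described above, here is a minimal sketch, assuming a made-up three-row data set (the data and the value 0 are hypothetical, chosen only for illustration). The -local ids- line clears the macro, which stands in for the macro having disappeared between separate runs; the loop then has nothing to iterate over.

```stata
clear
set obs 3
generate id = _n
generate friend_count = .

* Clearing the macro mimics what happens when -levelsof- was run in an
* earlier, separate execution: local macro ids is now empty.
local ids

* An empty list means the loop body is never executed.
foreach i of local ids {
    replace friend_count = 0 if id == `i'
}

* Every value of friend_count is still missing.
count if missing(friend_count)
```

No error message appears, which is why the problem is easy to overlook.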

In the event it does not, please post back with new example data that reproduces this difficulty and I will troubleshoot it. Please be sure to use the -dataex- command if and when you do that.
Comment
Daniel Jang

Join Date: Sep 2024

Posts: 5
#5

25 Sep 2024, 11:14

Clyde,

Not running all the code together was definitely the problem. I was working under the assumption that local macros were saved across commands, but I was incorrect. Thank you so much for your help.
Comment
Daniel Jang

Join Date: Sep 2024

Posts: 5
#6

27 May 2026, 11:35

Returning to this a couple of years later with a new problem. My new goal is to count a friendship nomination only if the person who received the nomination also sent a nomination back to the original sender. In other words (referencing the data below), I want to count person #2 as having two friends: although persons #3, #4, and #5 each sent a nomination to #2, the only ones of those whom person #2 nominated in return are #3 and #5.

Below is the code I've run but I don't think this is counting correctly. Any help would be appreciated!

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id friend1 friend2 friend3)
1 3 4 5
2 1 3 5
3 1 2 4
4 1 2 3
5 2 3 4
end

isid id, sort
levelsof id, local(ids)

reshape long friend, i(id)

// COUNT UP RECIPROCAL FRIEND REFERENCES
gen reciprocal_friend_count = .
foreach i of local ids {
    count if friend == `i' & friend == id
    replace reciprocal_friend_count = r(N) if id == `i'
}

reshape wide

Last edited by Daniel Jang; 27 May 2026, 11:37.
Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#7

27 May 2026, 12:32

I believe this does it:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id friend1 friend2 friend3)
1 3 4 5
2 1 3 5
3 1 2 4
4 1 2 3
5 2 3 4
end

// ONE OBSERVATION PER NOMINATION (id NOMINATES friend)
reshape long friend, i(id)
sort id friend
drop _j

// BUILD A REVERSED COPY IN WHICH SENDER AND RECEIVER ARE SWAPPED
preserve
rename id id2
rename friend id
rename id2 friend
tempfile reverse
save `reverse'
list, noobs clean
restore

// A NOMINATION IS MUTUAL IF IT ALSO APPEARS IN THE REVERSED COPY
merge 1:1 id friend using `reverse', keep(master match)
by id (friend), sort: egen n_mutual = total(_merge == 3)

Comment

Daniel Jang

Join Date: Sep 2024

Posts: 5
#8

09 Jun 2026, 10:22

Clyde,

Thank you again for your help!

The only issue I'm having with the code is that the merge command reports that the id variable does not uniquely identify observations in both the master and using data. I assume this is because the data is in long form?
Would doing an "m:m" merge instead of a 1:1 merge solve the issue?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#9

09 Jun 2026, 11:20

Would doing an "m:m" merge instead of a 1:1 merge solve the issue?

Definitely not! -merge m:m- is almost never the correct approach to anything. I don't know why Stata even allows -merge m:m-. It just produces data salad (unless the data would correctly merge -1:1-, -1:m-, or -m:1-).
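
As a rough illustration of the "data salad" point, here is a small sketch using two made-up data sets (the variables x and y and all values are hypothetical). As I understand the documented behavior, -merge m:m- pairs observations by their position within each key value rather than forming all combinations, so the pairing depends on sort order and is rarely meaningful.

```stata
* Hypothetical master data: key value 1 appears twice
clear
input byte(id x)
1 1
1 2
end
tempfile master
save `master'

* Hypothetical using data: key value 1 appears three times
clear
input byte(id y)
1 10
1 20
1 30
end
tempfile using_data
save `using_data'

use `master', clear
merge m:m id using `using_data'

* Observations are paired by position within id, not in all combinations:
* the result has 3 observations, not the 2 x 3 = 6 pairwise combinations.
list, noobs clean
count
```

Because this block uses temporary files held in local macros, it too has to be run in one go.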

First, I wonder if you have implemented the code correctly. The code I show in #7 does not require that variable id uniquely identify observations in either data set. The command requires that the combination of id and friend uniquely identify observations in both data sets. Did you forget to include friend when you typed the -merge- command?

If you did correctly specify both id and friend in the -merge- command, then the problem is with your data. If id and friend do not uniquely identify observations in the long version of the data, it means that in the original wide version of the data, some id specified the same person as a friend more than once. That makes no sense. Now, there is an easy fix for this: just add -duplicates drop- to the code immediately after the -drop _j- command. (It has to come after _j is dropped: while _j is still in the data, repeated mentions differ on _j, so no observations are exact duplicates and nothing would be removed.) That will enable you to run the code.

But I don't really recommend doing the easy fix. It is conceptually wrong for the original wide data to have the same id mention the same person as friend more than once. Those occurrences are probably data errors. The better fix is to go back over the data management that produced your data set and fix that code so that those duplicate mentions of the same friend don't appear in the data set in the first place. In the course of doing that, you might stumble over other errors in the data management, and you should fix those as well. Once you have a correct, well-formed, wide data set, the code in #7 will solve your problem without any modification.
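
One way to locate the offending rows before deciding how to fix them is sketched below. The two-row data set is made up for illustration (id 1 lists person 3 twice), and the variable name dup is an arbitrary choice; -duplicates tag- is an official Stata command.

```stata
* Hypothetical wide data in which id 1 names person 3 twice
clear
input byte(id friend1 friend2 friend3)
1 2 3 3
2 1 3 4
end

reshape long friend, i(id)
drop _j

* dup holds the number of OTHER observations sharing the same id and friend
duplicates tag id friend, generate(dup)

* Show the repeated nominations so they can be traced back to their source
list id friend if dup > 0, noobs clean
count if dup > 0
```

Here both copies of the repeated nomination are flagged, so the count is 2. On data with no repeated mentions, -isid id friend- would run silently and the count would be 0.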
Comment
