  • Counting number of times an ID value appears in a list of variables

    Hello,

    I am trying to count the number of times each respondent is named as a friend by other respondents, using ego-centric data. I should also mention that, due to specific circumstances, I am unable to download specific Stata packages.

    Currently, my data is formatted as follows (on a much larger scale):
    ID Friend 1 Friend 2 Friend 3
    1 2 3 4
    2 3 5 1
    3 4 2 1
    4 1 4 2
    5 1 3 2

    My goal is to count the total number of times each ID number is listed across all friend variables and assign that count to the respective ID row:

    ID Friend 1 Friend 2 Friend 3 Count
    1 2 3 4 4
    2 3 5 1 4
    3 4 2 1 3
    4 1 4 2 3
    5 1 3 2 1

    Any help would be appreciated on how to code this.

    Thank you for your time!
    Daniel

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(id friend1 friend2 friend3)
    1 2 3 4
    2 3 5 1
    3 4 2 1
    4 1 4 2
    5 1 3 2
    end
    
    isid id, sort
    levelsof id, local(ids)
    reshape long friend, i(id)
    
    //    ELIMINATE SELF-REFERENCES AS FRIENDS
    drop if friend == id
    //    ELIMINATE ANY DUPLICATE REFERENCES AS FRIENDS
    //    (COMPARE ON id AND friend ONLY; _j DIFFERS ACROSS REPEATED MENTIONS)
    duplicates drop id friend, force
    
    //    COUNT UP FRIEND REFERENCES
    gen friend_count = .
    foreach i of local ids {
        count if friend == `i'
        replace friend_count = r(N) if id == `i'
    }
    
    reshape wide
    Notes: id 4 has listed him/herself as a friend. I assume this is not really permissible, so in the code I have excluded self-references. And, although there are no instances of this in the example data shown, I also assume it is not permissible for an id to list the same person more than once as a friend. So the code will eliminate such redundancies if it encounters any. Finally, as per your restrictions on installing packages, this code uses only official Stata commands.
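
    For anyone who would rather avoid the loop, the same counts can probably be obtained with the official -contract- command followed by a -merge-. This is only a sketch under the same assumptions as in the notes above (self-references and repeated mentions of the same friend are excluded). The tempfile name counts is just illustrative, and, as with the loop, ids that nobody mentions end up with a count of 0.

    ```stata
    clear
    input byte(id friend1 friend2 friend3)
    1 2 3 4
    2 3 5 1
    3 4 2 1
    4 1 4 2
    5 1 3 2
    end

    reshape long friend, i(id)

    //    SAME EXCLUSIONS AS IN THE NOTES: SELF-REFERENCES AND REPEATED MENTIONS
    drop if friend == id
    duplicates drop id friend, force

    //    ONE ROW PER MENTIONED PERSON, WITH THE NUMBER OF MENTIONS
    preserve
    contract friend, freq(friend_count)
    rename friend id
    tempfile counts
    save `counts'
    restore

    //    BACK TO WIDE LAYOUT, THEN ATTACH THE COUNTS BY id
    reshape wide
    merge 1:1 id using `counts', keep(master match) nogenerate
    replace friend_count = 0 if missing(friend_count)
    sort id
    list, noobs clean
    ```

    On the example data, id 4 gets a count of 2 rather than the 3 shown in the question, because id 4's self-reference is not counted.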

    In the future, when showing data examples, please use the -dataex- command to do so, as I have done here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.




    • #3
      Hi Clyde,

      I appreciate your help on this. I ran the code you suggested and seem to be running into a problem. The code runs without an error message, but the friend_count variable contains only missing values, even after running the foreach loop. Would you happen to know why this may be the case?



      • #4
        I think you are running the code incorrectly. I tested this code before posting it, and it definitely doesn't leave the variable friend_count empty.

        Probably what's happening is that you are trying to run the code line-by-line or in small sections. Because the code uses local macros, you can't do that. When you interrupt the code execution, any local macros that were defined in what you ran disappear, and they are then unavailable in the next line or section you try. So, for example, if the code is interrupted somewhere between the -levelsof- command and the -foreach- loop, local macro ids will not exist when the -foreach- is run. Non-existent local macros are interpreted as empty strings. So -foreach- will think that it is being asked to repeat the code over no values of id at all, and will, therefore, do nothing. That would leave all the values of friend_count as missing.
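
        To make that concrete, here is a minimal sketch of the behavior. The variable touched and the macro name no_such_macro are made up purely for illustration; the first loop stands in for a -foreach- that is run after its local macro has disappeared.

        ```stata
        clear
        set obs 3
        gen id = _n
        gen touched = 0

        // An undefined local macro expands to an empty string,
        // so this loop body runs zero times and raises no error.
        foreach i of local no_such_macro {
            replace touched = 1 if id == `i'
        }
        count if touched == 1    // r(N) is 0: nothing was replaced

        // With the macro defined in the same run, the loop does its work.
        levelsof id, local(ids)
        foreach i of local ids {
            replace touched = 1 if id == `i'
        }
        count if touched == 1    // r(N) is 3
        ```

        Run from top to bottom in one go, the second loop sees local macro ids. Run in pieces, it behaves like the first loop.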

        Run the entire thing in one fell swoop, without interruption from beginning to end. I'm pretty sure that will solve your problem.

        In the event it does not, please post back with new example data that reproduces this difficulty and I will troubleshoot it. Please be sure to use the -dataex- command if and when you do that.



        • #5
          Clyde,

          Not running all the code together was definitely the problem. I was working under the assumption that local macros were saved across commands, but I was incorrect. Thank you so much for your help.



          • #6
            Returning to this a couple years later with a new problem. My new goal is to count friendship nominations only if the friend who received a nomination also sent a nomination to the original sender. In other words (referencing the data below), I only want to count person #2 as having two friends because while persons #3, #4, and #5 sent a nomination to #2, person #2 only nominated #3 and #5 as friends.

            Below is the code I've run but I don't think this is counting correctly. Any help would be appreciated!

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input byte(id friend1 friend2 friend3)
            1 3 4 5
            2 1 3 5
            3 1 2 4
            4 1 2 3
            5 2 3 4
            end  
            
            isid id, sort
            levelsof id, local(ids)
            reshape long friend, i(id)  
            
            //    COUNT UP RECIPROCAL FRIEND REFERENCES
            gen reciprocal_friend_count = .
            foreach i of local ids {
                count if friend == `i' & friend == id
                replace reciprocal_friend_count = r(N) if id == `i'
            }
            
             reshape wide
            Last edited by Daniel Jang; 27 May 2026, 11:37.



            • #7
              I believe this does it:
              Code:
               * Example generated by -dataex-. For more info, type help dataex
              clear 
              input byte(id friend1 friend2 friend3)
              1 3 4 5
              2 1 3 5
              3 1 2 4
              4 1 2 3
              5 2 3 4
              end  
              
              reshape long friend, i(id) 
              sort id friend
              drop _j
              preserve
              rename id id2
              rename friend id
              rename id2 friend
              tempfile reverse
              save `reverse'
              list, noobs clean
              restore
              
              merge 1:1 id friend using `reverse', keep(master match)
              by id (friend), sort: egen n_mutual = total(_merge == 3)
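
              If you want the result as one row per id in the original wide layout, something along these lines might work. It is only a sketch of the same reverse-and-merge idea: the tempfile names wide and nominations and the variable mutual are illustrative, and it assumes every id lists distinct, non-missing friends.

              ```stata
              clear
              input byte(id friend1 friend2 friend3)
              1 3 4 5
              2 1 3 5
              3 1 2 4
              4 1 2 3
              5 2 3 4
              end
              tempfile wide
              save `wide'

              // one row per (id, friend) nomination
              reshape long friend, i(id)
              drop _j
              tempfile nominations
              save `nominations'

              // the same nominations with sender and receiver swapped
              rename id id2
              rename friend id
              rename id2 friend
              tempfile reverse
              save `reverse'

              // a nomination is mutual when it also appears in the reversed file
              use `nominations', clear
              merge 1:1 id friend using `reverse', keep(master match)
              gen byte mutual = (_merge == 3)
              collapse (sum) n_mutual = mutual, by(id)

              // attach the counts to the original wide layout
              merge 1:1 id using `wide', nogenerate
              order id friend1 friend2 friend3 n_mutual
              sort id
              list, noobs clean
              ```

              On the example data this gives id 2 a count of 2, matching the description in #6.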



              • #8
                Clyde,

                Thank you again for your help!

                The only issue I'm having with the code is that the merge command reports that the id variable does not uniquely identify observations in both the master and using data... I assume this is because the data is in long form?
                Would doing an "m:m" merge instead of a 1:1 merge solve the issue?



                • #9
                  Would doing an "m:m" merge instead of a 1:1 merge solve the issue?
                  Definitely not! -merge m:m- is almost never the correct approach to anything. I don't know why Stata even allows -merge m:m-. It just produces data salad (unless the data would correctly merge -1:1-, -1:m-, or -m:1-).
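
                  A small sketch of what goes wrong, using made-up data (the variables x and y and the tempfile names are purely illustrative). Within each id, -merge m:m- pairs observations by their order, so two master rows and three using rows come out as three rows, not as all six possible pairings. The official -joinby- command is shown only for contrast.

                  ```stata
                  // hypothetical master data: id 1 appears twice
                  clear
                  input byte id str1 x
                  1 "a"
                  1 "b"
                  end
                  tempfile master
                  save `master'

                  // hypothetical using data: id 1 appears three times
                  clear
                  input byte id str1 y
                  1 "p"
                  1 "q"
                  1 "r"
                  end
                  tempfile using_data
                  save `using_data'

                  // m:m pairs rows by position within id: 3 observations result
                  use `master', clear
                  merge m:m id using `using_data'
                  display _N

                  // joinby forms every pairwise combination within id: 6 observations
                  use `master', clear
                  joinby id using `using_data'
                  display _N
                  ```

                  Neither result is what the code in #7 needs; it relies on each id-friend pair appearing exactly once on both sides.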

                  First, I wonder if you have implemented the code correctly. The code I show in #7 does not require that variable id uniquely identify observations in either data set. The command requires that the combination of id and friend uniquely identify observations in both data sets. Did you forget to include friend when you typed the -merge- command?

                  If you did correctly specify both id and friend in the -merge- command then the problem is with your data. If id and friend do not uniquely identify observations in the long version of the data, it means that in the original wide version of the data, some id specified the same person as a friend more than once. That makes no sense. Now, there is an easy fix for this: just add -duplicates drop- to the code immediately after the -drop _j- command (any earlier, and the repeated mentions still differ on _j, so they would not be removed). That will enable you to run the code.

                  But I don't really recommend doing the easy fix. It is conceptually wrong for the original wide data to have the same id mention the same person as friend more than once. Those occurrences are probably data errors. The better fix is to go back over the data management that produced your data set and fix that code so that those duplicate mentions of the same friend don't appear in the data set in the first place. In the course of doing that, you might stumble over other errors in the data management, and you should fix those as well. Once you have a correct, well-formed, wide data set, the code in #7 will solve your problem without any modification.
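
                  To track down the offending rows before fixing the upstream code, something like the following may help. The example data are hypothetical (id 1 lists person 3 twice), and the variable name dup is just illustrative.

                  ```stata
                  clear
                  input byte(id friend1 friend2 friend3)
                  1 3 4 3
                  2 1 3 5
                  3 1 2 4
                  end

                  reshape long friend, i(id)
                  drop _j

                  // how many (id, friend) pairs occur more than once?
                  duplicates report id friend

                  // flag and list the repeated pairs so they can be traced back to the source
                  duplicates tag id friend, generate(dup)
                  list id friend if dup > 0, noobs clean
                  ```

                  Any id that shows up in that listing is a good place to start looking for the data management error.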

