Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Column to row (matching cases)


    Stata community is requested to please help me.

    I have this data set
    Person Number of toffees
    Person has
    Person's Friend
    Jack 2 Ace
    Ace 5 Jack
    Roger 9 .
    Daniel 5 .
    Sandy 7 Kale
    Kale 17 Sandy
    Rogan 11 .
    Brian 13 Alex
    Alex 1 Brian
    Timmon 25 .



    I require the data set to be this form.
    Person Number of toffees
    Person has
    Person's Friend Number of toffees Person's Friend has
    Jack 2 Ace 5
    Ace 5 Jack 2
    Roger 9 .
    Daniel 5 .
    Sandy 7 Kale 17
    Kale 17 Sandy 7
    Rogan 11 .
    Brian 13 Alex 1
    Alex 1 Brian 13
    Timmon 25 .

    I would appreciate your suggestions to solve this hurdle.


  • #2
    The first part of the following code just recreates your dataset. You should use the part after that:

    Code:
    clear
    input str10 person int toffees str10 friend
    "Jack"    2    "Ace"    
    "Ace"    5    "Jack"    
    "Roger"    9    ""    
    "Daniel"    5    ""    
    "Sandy"    7    "Kale"    
    "Kale"    17    "Sandy"    
    "Rogan"    11    ""    
    "Brian"    13    "Alex"    
    "Alex"    1    "Brian"    
    "Timmon"    25    ""    
    end
    
    * start here
    
    preserve
        keep person toffees
        rename person friend
        rename toffees friend_toffees
        tempfile toffee
        save `toffee'
    restore
    
    gen `c(obs)' n = _n
    merge m:1 friend using `toffee', keep(1 3) keepusing(friend_toffees) nogen
    sort n
    drop n
    which produces:
    Code:
    . list, noobs sep(0)
    
      +--------------------------------------+
      | person   toffees   friend   friend~s |
      |--------------------------------------|
      |   Jack         2      Ace          5 |
      |    Ace         5     Jack          2 |
      |  Roger         9                   . |
      | Daniel         5                   . |
      |  Sandy         7     Kale         17 |
      |   Kale        17    Sandy          7 |
      |  Rogan        11                   . |
      |  Brian        13     Alex          1 |
      |   Alex         1    Brian         13 |
      | Timmon        25                   . |
      +--------------------------------------+

    Comment


    • #3
      Hi Hemanshu, thank you for your kind help.

      However, I don't understand the line below
      input str10 person int toffees str10 friend

      Comment


      • #4
        Like I said, that first chunk of code was just to create your dataset in memory. I shouldn't have needed to do this, but I had to since you did not post your sample data using the -dataex- command (as advised in the Statalist FAQ). Since you already have your data, you can ignore that whole chunk of code.

        To understand how input works, see
        Code:
        help input

        Comment


        • #5
          Thanks, Hemanshu. I appreciate the exactness of your response.

          Comment


          • #6
            Hi Hemanshu

            Sorry to bother you again.

            Should the code work if there are two or more "Persons" with the same name (and/or two or more "person's friends" with the same name)?

            Comment


            • #7
              No, the code will fail if the names are not unique. Please post a data extract using dataex that includes such cases, as well as some way of distinguishing between people of the same name -- a person ID of some sort, perhaps?

              Comment


              • #8
                Hi Hemanshu, I have reached to this point--> I am able to get the below table.
                list, noobs sep(0) +--------------------------------------+ | person toffees friend friend~s | |--------------------------------------| | Jack 2 Ace 5 | | Ace 5 Jack 2 | | Roger 9 . | | Daniel 5 . | | Sandy 7 Kale 17 | | Kale 17 Sandy 7 | | Rogan 11 . | | Brian 13 Alex 1 | | Alex 1 Brian 13 | | Timmon 25 . | +--------------------------------------+
                Could you guide me a little further? Now I need red rows to disappear (or turn them into missing values) (see table below)

                +--------------------------------------+ | person toffees friend friend~s | |--------------------------------------| | Jack 2 Ace 5 | | Ace 5 Jack 2 | | Roger 9 . | | Daniel 5 . | | Sandy 7 Kale 17 | | Kale 17 Sandy 7 | | Rogan 11 . | | Brian 13 Alex 1 | | Alex 1 Brian 13 | | Timmon 25 . | +--------------------------------------+ I would appreciate your help.

                Comment


                • #9
                  Sorry for the formatting errors.

                  I have found table ". list, noobs sep(0)." Thanks to you.
                  Could you guide me a little further? Now I need red rows to disappear (or turn them into missing values) (see table below)
                  person toffees friend freind~s
                  Jack 2 Ace 5
                  Ace 5 Jack 2
                  Roger 9
                  Daniel 5
                  Sandy 7 Kale 17
                  Kale 17 Sandy 7
                  Rogan 11
                  Brian 13 Alex 1
                  Alex 1 Brian 13
                  Timmon 25

                  Comment


                  • #10
                    ajay pasi your post is hard to decipher because the output has got messed up. Please use the CODE delimiters (use the # button on the toolbar on the text entry box) to enclose your output.

                    You will also need to explain what the problem is with the highlighted rows, and what is the more general feature that can be used to identify what to drop.

                    Edit: cross-posted with #9. While #9 is an improvement, you can just use the CODE delimiters instead of trying to format it as a table. And I still don't understand what the problem is with the highlighted rows and how to identify in more general terms, the ones that you want dropped.
                    Last edited by Hemanshu Kumar; 01 Nov 2022, 02:13.

                    Comment


                    • #11
                      Sorry for the inconvenience.
                      Unique ID Person Person's toffees Unique ID Friend Friend's toffees
                      1 2 2 5
                      2 5 1 2
                      3 9
                      4 5
                      5 7 6 17
                      6 17 5 7
                      7 11
                      8 13 9 1
                      9 1 8 13
                      10 25
                      I want the combination of ("Unique ID Person", Unique ID Friend") without replacement, including "Unique ID Person" that don't have friend. Which means I need to eliminate the red rows.


                      Comment


                      • #12
                        Code:
                        clear
                        input int(person_id toffees friend_id f_toffees)
                        1    2    2    5
                        2    5    1    2
                        3    9    .   .    
                        4    5    .    .
                        5    7    6    17
                        6    17    5    7
                        7    11    .    .
                        8    13    9    1
                        9    1    8    13
                        10    25    .    .    
                        end
                        
                        gen `c(obs)' n = _n
                        
                        egen min_id = rowmin(person_id friend_id)
                        egen max_id = rowmax(person_id friend_id)
                        
                        bysort min_id max_id (person_id): keep if _n == 1 // within each combination, this retains only the observation with the lowest person_id
                        
                        sort n
                        drop min_id max_id n
                        which produces:
                        Code:
                        . list , noobs sep(0)
                        
                          +------------------------------------------+
                          | person~d   toffees   friend~d   f_toff~s |
                          |------------------------------------------|
                          |        1         2          2          5 |
                          |        3         9          .          . |
                          |        4         5          .          . |
                          |        5         7          6         17 |
                          |        7        11          .          . |
                          |        8        13          9          1 |
                          |       10        25          .          . |
                          +------------------------------------------+
                        Last edited by Hemanshu Kumar; 01 Nov 2022, 03:32.

                        Comment


                        • #13
                          Thanks, Hemanshu. Appreciate your help.

                          Comment


                          • #14
                            Hi Hemanshu,

                            could you please tell me what this code means

                            merge m:1 friend using `toffee', keep(1 3) keepusing(friend_toffees) nogen //


                            specifically, keep (1 3) part of the code.
                            Last edited by ajay pasi; 01 Nov 2022, 05:56.

                            Comment


                            • #15
                              Take a look at
                              Code:
                              help merge##results
                              . I am asking it to keep only some of the merge results -- the ones in the master data only, and the matched ones. In other words, it should not try and keep any unmatched observations from the using data.

                              Comment

                              Working...
                              X