Column to row (matching cases)

ajay pasi

Join Date: Jan 2019
Posts: 170

Column to row (matching cases)

31 Oct 2022, 10:53

Stata community is requested to please help me.

I have this data set

Person	Number of toffees Person has	Person's Friend
Jack	2	Ace
Ace	5	Jack
Roger	9	.
Daniel	5	.
Sandy	7	Kale
Kale	17	Sandy
Rogan	11	.
Brian	13	Alex
Alex	1	Brian
Timmon	25	.

I require the data set to be this form.

Person	Number of toffees Person has	Person's Friend	Number of toffees Person's Friend has
Jack	2	Ace	5
Ace	5	Jack	2
Roger	9	.
Daniel	5	.
Sandy	7	Kale	17
Kale	17	Sandy	7
Rogan	11	.
Brian	13	Alex	1
Alex	1	Brian	13
Timmon	25	.

I would appreciate your suggestions to solve this hurdle.

Tags: None

Hemanshu Kumar

Join Date: Mar 2015
Posts: 1397

31 Oct 2022, 11:09

The first part of the following code just recreates your dataset. You should use the part after that:

Code:

clear
input str10 person int toffees str10 friend
"Jack"    2    "Ace"    
"Ace"    5    "Jack"    
"Roger"    9    ""    
"Daniel"    5    ""    
"Sandy"    7    "Kale"    
"Kale"    17    "Sandy"    
"Rogan"    11    ""    
"Brian"    13    "Alex"    
"Alex"    1    "Brian"    
"Timmon"    25    ""    
end

* start here

preserve
    keep person toffees
    rename person friend
    rename toffees friend_toffees
    tempfile toffee
    save `toffee'
restore

gen `c(obs)' n = _n
merge m:1 friend using `toffee', keep(1 3) keepusing(friend_toffees) nogen
sort n
drop n

which produces:

Code:

. list, noobs sep(0)

  +--------------------------------------+
  | person   toffees   friend   friend~s |
  |--------------------------------------|
  |   Jack         2      Ace          5 |
  |    Ace         5     Jack          2 |
  |  Roger         9                   . |
  | Daniel         5                   . |
  |  Sandy         7     Kale         17 |
  |   Kale        17    Sandy          7 |
  |  Rogan        11                   . |
  |  Brian        13     Alex          1 |
  |   Alex         1    Brian         13 |
  | Timmon        25                   . |
  +--------------------------------------+

Comment

ajay pasi

Join Date: Jan 2019

Posts: 170
#3

31 Oct 2022, 11:19

Hi Hemanshu, thank you for your kind help.

However, I don't understand the line below
input str10 person int toffees str10 friend
Comment
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1397
#4

31 Oct 2022, 11:24

Like I said, that first chunk of code was just to create your dataset in memory. I shouldn't have needed to do this, but I had to since you did not post your sample data using the -dataex- command (as advised in the Statalist FAQ). Since you already have your data, you can ignore that whole chunk of code.

To understand how input works, see

Code:

help input
Comment
ajay pasi

Join Date: Jan 2019

Posts: 170
#5

31 Oct 2022, 11:36

Thanks, Hemanshu. I appreciate the exactness of your response.
Comment
ajay pasi

Join Date: Jan 2019

Posts: 170
#6

31 Oct 2022, 12:23

Hi Hemanshu

Sorry to bother you again.

Should the code work if there are two or more "Persons" with the same name (and/or two or more "person's friends" with the same name)?
Comment
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1397
#7

31 Oct 2022, 12:41

No, the code will fail if the names are not unique. Please post a data extract using dataex that includes such cases, as well as some way of distinguishing between people of the same name -- a person ID of some sort, perhaps?
Comment
ajay pasi

Join Date: Jan 2019

Posts: 170
#8

01 Nov 2022, 01:55

Hi Hemanshu, I have reached to this point--> I am able to get the below table.
list, noobs sep(0) +--------------------------------------+ | person toffees friend friend~s | |--------------------------------------| | Jack 2 Ace 5 | | Ace 5 Jack 2 | | Roger 9 . | | Daniel 5 . | | Sandy 7 Kale 17 | | Kale 17 Sandy 7 | | Rogan 11 . | | Brian 13 Alex 1 | | Alex 1 Brian 13 | | Timmon 25 . | +--------------------------------------+
Could you guide me a little further? Now I need red rows to disappear (or turn them into missing values) (see table below)

+--------------------------------------+ | person toffees friend friend~s | |--------------------------------------| | Jack 2 Ace 5 | | Ace 5 Jack 2 | | Roger 9 . | | Daniel 5 . | | Sandy 7 Kale 17 | | Kale 17 Sandy 7 | | Rogan 11 . | | Brian 13 Alex 1 | | Alex 1 Brian 13 | | Timmon 25 . | +--------------------------------------+ I would appreciate your help.
Comment
ajay pasi

Join Date: Jan 2019

Posts: 170
#9

01 Nov 2022, 02:08

Sorry for the formatting errors.

I have found table ". list, noobs sep(0)." Thanks to you.
Could you guide me a little further? Now I need red rows to disappear (or turn them into missing values) (see table below)
person toffees friend freind~s

Jack 2 Ace 5

Ace 5 Jack 2

Roger 9

Daniel 5

Sandy 7 Kale 17

Kale 17 Sandy 7

Rogan 11

Brian 13 Alex 1

Alex 1 Brian 13

Timmon 25
Comment
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1397
#10

01 Nov 2022, 02:10

ajay pasi your post is hard to decipher because the output has got messed up. Please use the CODE delimiters (use the # button on the toolbar on the text entry box) to enclose your output.

You will also need to explain what the problem is with the highlighted rows, and what is the more general feature that can be used to identify what to drop.

Edit: cross-posted with #9. While #9 is an improvement, you can just use the CODE delimiters instead of trying to format it as a table. And I still don't understand what the problem is with the highlighted rows and how to identify in more general terms, the ones that you want dropped.

Last edited by Hemanshu Kumar; 01 Nov 2022, 02:13.
Comment
ajay pasi

Join Date: Jan 2019

Posts: 170
#11

01 Nov 2022, 02:49

Sorry for the inconvenience.

Unique ID Person Person's toffees Unique ID Friend Friend's toffees

1 2 2 5

2 5 1 2

3 9

4 5

5 7 6 17

6 17 5 7

7 11

8 13 9 1

9 1 8 13

10 25

I want the combination of ("Unique ID Person", Unique ID Friend") without replacement, including "Unique ID Person" that don't have friend. Which means I need to eliminate the red rows.
Comment

Hemanshu Kumar

Join Date: Mar 2015
Posts: 1397

#12

01 Nov 2022, 03:24

Code:

clear
input int(person_id toffees friend_id f_toffees)
1    2    2    5
2    5    1    2
3    9    .   .    
4    5    .    .
5    7    6    17
6    17    5    7
7    11    .    .
8    13    9    1
9    1    8    13
10    25    .    .    
end

gen `c(obs)' n = _n

egen min_id = rowmin(person_id friend_id)
egen max_id = rowmax(person_id friend_id)

bysort min_id max_id (person_id): keep if _n == 1 // within each combination, this retains only the observation with the lowest person_id

sort n
drop min_id max_id n

which produces:

Code:

. list , noobs sep(0)

  +------------------------------------------+
  | person~d   toffees   friend~d   f_toff~s |
  |------------------------------------------|
  |        1         2          2          5 |
  |        3         9          .          . |
  |        4         5          .          . |
  |        5         7          6         17 |
  |        7        11          .          . |
  |        8        13          9          1 |
  |       10        25          .          . |
  +------------------------------------------+

Last edited by Hemanshu Kumar; 01 Nov 2022, 03:32.

Comment

ajay pasi

Join Date: Jan 2019

Posts: 170
#13

01 Nov 2022, 04:26

Thanks, Hemanshu. Appreciate your help.
Comment
ajay pasi

Join Date: Jan 2019

Posts: 170
#14

01 Nov 2022, 05:53

Hi Hemanshu,

could you please tell me what this code means

merge m:1 friend using `toffee', keep(1 3) keepusing(friend_toffees) nogen //

specifically, keep (1 3) part of the code.

Last edited by ajay pasi; 01 Nov 2022, 05:56.
Comment
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1397
#15

01 Nov 2022, 05:59

Take a look at

Code:

help merge##results

. I am asking it to keep only some of the merge results -- the ones in the master data only, and the matched ones. In other words, it should not try and keep any unmatched observations from the using data.
Comment

Announcement

Column to row (matching cases)

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment