Advanced generation of a new variable

Salini Kh

Join Date: Jan 2019

Posts: 9
#1

21 Jan 2019, 04:45

Hello everyone,

I’m stuck on a problem regarding generation of a new variable.
My original dataset looks like this.

Offer ID    Points    Accept ID
3456        2
6789        6         3456
5678        6
8760        3         5678
5672        4         6789

Each Offer ID is assigned exactly one Points value, and some Offer IDs are also assigned an Accept ID. Individuals listed under Offer ID can also accept offers made by other individuals in that column, so every Accept ID also appears in the Offer ID column.

I’d like to generate a new variable, “Acceptor Points”, which holds the Points of the individual named in Accept ID. That value is taken from the Points column of the row whose Offer ID equals the Accept ID.

Ideally my dataset should look like this.

Offer ID    Points    Accept ID    Acceptor Points
3456        2
6789        6         3456         2
5678        6
8760        3         5678         6
5672        4         6789         6

How would I go about coding the new variable “Acceptor Points”?
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#2

21 Jan 2019, 09:09

Save your data, then drop acceptID, rename offerID to acceptID, and save that version of the data under a new name. Load the original dataset, rename points to "offerPoints", and merge in the altered dataset on the acceptID variable. This will bring in the points belonging to each acceptID. Rename that variable to "acceptPoints".
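The steps above can be sketched as a short do-file. This is only an illustration under stated assumptions: the variable names offerID, points and acceptID, the numeric storage types, and the use of tempfiles instead of saved .dta files are my choices, not something given in the question. The merge is written as m:1 because the rows without an acceptID all share the same missing key in the original data, so the key is unique only in the lookup file.

```stata
* Sketch only: variable names and types are assumptions, adjust to your data
clear
input long offerID int points long acceptID
3456 2 .
6789 6 3456
5678 6 .
8760 3 5678
5672 4 6789
end

tempfile original lookup
save `original'

* Build the lookup file: one row per offer, with the ID stored as acceptID
drop acceptID
rename offerID acceptID
save `lookup'

* Return to the original data and free up the name "points"
use `original', clear
rename points offerPoints

* keep(master match) discards lookup rows that nobody accepted
merge m:1 acceptID using `lookup', keep(master match) nogenerate
rename points acceptPoints

* Compare with the desired result shown in the question
assert acceptPoints == 2 if offerID == 6789
assert acceptPoints == 6 if inlist(offerID, 8760, 5672)
assert missing(acceptPoints) if missing(acceptID)

sort offerID
list, noobs
```

Run it as a do-file so the tempfile macros stay in scope. If an acceptID had no matching offerID, its acceptPoints would simply stay missing rather than raise an error.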
Philip Matthews

Join Date: Apr 2014

Posts: 23
#3

21 Jan 2019, 11:24

Another approach is to form a matrix out of your data, then search through the rows for the match you want and create a vector of these values. Finally, write the vector to the data as a new variable. Here is code you can run as a do-file. Note that I have changed the variable names to lowercase and stripped out spaces. Your variable "Acceptor Points" will appear in your data set under the name acceptsvar1. Potential problems: if you have a huge data set, this approach may fail owing to limits on the size of matrices; the code assumes the columns you display in your post really are the first three columns in your data set. If not, you will have to modify the code or temporarily drop the other variables. Make sure you first try this on a copy of your data.

Code:

mat drop _all
mkmat offerid points acceptid, matrix(tocheck)   // Create a matrix from the data
local en = _N                                    // Get the number of records
matrix accpoints = J(`en',1,.)                   // Create a vector to hold the matches
forvalues i = 1/`en' {                           // Go through rows of offerid
    forvalues k = 1/`en' {                       // And through rows of acceptid
        if tocheck[`k',3] == tocheck[`i',1] {
            mat accpoints[`k',1] = tocheck[`i',2]
        }
    }
}
svmat int accpoints, name(acceptsvar)