Hello All,
I am trying to merge two datasets by id. The first is a long dataset with multiple observations per id (27,384 observations); it contains school enrollment information. The second has only one observation per id (1,697 observations) and contains student characteristics such as class rank, GPA, and sex. I tried
Code:
merge 1:m randomid using YUC_Stu_Characteristics.dta
and received the error: variable randomid does not uniquely identify observations in the master data
My unique identifier is randomid. I have already verified that each of the 1,697 ids in the shorter dataset appears in the long dataset.
Dataset 1 (27,384 observations)
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double randomid long high_school_grad_date str7 publicprivate str17 year4year str2 college_state
2103300231 20220608 "Private" "4-year" "CA"
2103300237 20220608 "" "" ""
2103300243 20220608 "" "" ""
2103300255 20200603 "Private" "4-year" "CA"
2103300255 20200603 "Private" "4-year" "CA"
2103300255 20200603 "Private" "4-year" "CA"
2103300255 20200603 "Public" "2-year" "CA"
2103300255 20200603 "Private" "4-year" "CA"
2103300255 20200603 "Public" "2-year" "CA"
2103300255 20200603 "Private" "4-year" "CA"
end
Dataset 2 (1,697 observations)
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double randomid int classrank str1 sex
3342021132 87 "M"
4383032232 81 "M"
4719000696 20 "M"
4716004272 39 "M"
end
I do not want to drop any of the repeated ids, since each line contains different data that I need for my analysis. How can I merge these two datasets, or do they perhaps need to be reshaped first? Help is greatly appreciated.
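A note on the error itself: in merge 1:m, the "1" refers to the master data (the dataset in memory), so Stata expects randomid to be unique there. The message suggests the long enrollment file is the master, and randomid repeats in that file. A minimal sketch of the likely fix, assuming the long dataset is in memory and YUC_Stu_Characteristics.dta has exactly one row per randomid, is a many-to-one merge:

```stata
* Assumption: the long enrollment data (27,384 obs) is already in memory.
* Optional check, run separately, that the using file is unique on randomid:
*   use YUC_Stu_Characteristics.dta, clear
*   isid randomid

* Many-to-one: repeated ids in the master, one row per id in the using file.
merge m:1 randomid using YUC_Stu_Characteristics.dta

* Inspect how many observations matched.
tabulate _merge
```

Under those assumptions no reshape or dropping of rows should be needed: every enrollment row keeps its own data and picks up the characteristics for its randomid. If the short file is in memory instead, merge 1:m against the long file is the equivalent form.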