Hi,
Here is a simplified version of my data:
| Row | Year | _id | _nn |
| --- | --- | --- | --- |
| 1 | | 39011 | 51805 |
| 2 | 2006 | 51805 | 77823 |
| 3 | | 11921 | 33111 |
| 4 | 2008 | 77823 | 69173 |
| 5 | | 14283 | 77823 |
where _id and _nn each have no duplicates.
I am looking for a way in which Stata could do the following:
1. Create a new variable (for example, "NewYear").
2. Fill "NewYear" with the value of "Year" from another observation whenever the "_nn" value of the current observation equals the "_id" value of that other observation. For example, in row 1, "NewYear" would be 2006, since the "_nn" value in row 1 (51805) is the same as the "_id" value in row 2.
3. For observations with no common value between "_nn" and "_id" (and no value in "Year"), "NewYear" would be missing.
4. For observations that already have a value in "Year", "NewYear" would take that value.
The expected output is as follows:
| Row | Year | _id | _nn | NewYear | Comments |
| --- | --- | --- | --- | --- | --- |
| 1 | | 39011 | 51805 | 2006 | Taken from Row 2 |
| 2 | 2006 | 51805 | 77823 | 2006 | No change, since "Year" already has a value |
| 3 | | 11921 | 33111 | . | Missing, since this "_nn" matches no "_id" anywhere in the sample |
| 4 | 2008 | 77823 | 69173 | 2008 | No change, since "Year" already has a value |
| 5 | | 14283 | 77823 | 2008 | Taken from Row 4 |
The "Comments" column is for explanation only.
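To make the example reproducible, here is a minimal sketch that enters the five rows above and attempts the lookup with `merge`. The `Row` variable, the helper name `matchyear`, and the merge-based approach itself are my own assumptions for illustration; this is not a solution verified on the full data.

```stata
clear
* Enter the simplified example data; "." marks a missing Year
input Row Year _id _nn
1 .    39011 51805
2 2006 51805 77823
3 .    11921 33111
4 2008 77823 69173
5 .    14283 77823
end

* Build a lookup of _id -> Year, renaming _id to _nn so it can serve
* as the merge key (matchyear is a hypothetical helper name)
preserve
keep _id Year
rename (_id Year) (_nn matchyear)
tempfile lookup
save `lookup'
restore

* Each _nn in the data is looked up among the _id values.
* m:1 is used because _nn repeats in this small example (rows 2 and 5),
* while _id is unique in the lookup file.
merge m:1 _nn using `lookup', keep(master match) nogenerate

* Keep the observation's own Year when present, otherwise the matched Year
generate NewYear = cond(!missing(Year), Year, matchyear)
drop matchyear
sort Row
list, noobs
```

For the five example rows, this should give NewYear values of 2006, 2006, ., 2008, 2008, which agrees with the expected output table. Run it as a do-file, since `tempfile` relies on a local macro.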
I have also referred to http://www.stata.com/statalist/archive/2013-06/msg00057.html and http://www.stata.com/statalist/archi.../msg00508.html, but they do not seem to be what I am looking for.
I would appreciate it if someone could assist me with this, and I will improve my question if it is not clear enough.
Thanks in advance.
Regards
