Hello,
I would like to perform a left join that involves two datafiles. Because I am, due to contractual reasons, not allowed to discuss the actual datasets, I will use imaginary datafiles to illustrate my question.
Datafile 1: Parents.dta
Datafile 2: Children.dta
Note that each parent only can have one child, or none, but not multiple. Similarly, a child can only have one or none parents, but not multiple.
This yields the following datasets:
Parents.dta:
- ID (unique identifier)
- Name (String)
- Child ID (unique identifier for exactly 1 child, or no ID if no child)
Children.dta:
- ID (unique identifier for child, Parents.dta refers to this ID in Child ID)
- Child name (String)
Examples of data:
Parents.dta:
ID Name Child ID
1 Richard .
2 Simon 1
Children.dta:
ID Child name
1 Vanessa
2 Tim
I want to join Child ID in Parents.dta to the ID in Children.dta, but only keep the original records in Parents.dta. Then I want to include Child Name in the new datafile.
This should result in the following datafile:
ID Name Child ID Child Name
1 Richard . .
2 Simon 1 Tim
Note that child Vanessa is not included here.
How could I perform such Left Join ? Do you perhaps have any syntax that can help me understand the reasoning?
Thank you very much!
I would like to perform a left join that involves two datafiles. Because I am, due to contractual reasons, not allowed to discuss the actual datasets, I will use imaginary datafiles to illustrate my question.
Datafile 1: Parents.dta
Datafile 2: Children.dta
Note that each parent only can have one child, or none, but not multiple. Similarly, a child can only have one or none parents, but not multiple.
This yields the following datasets:
Parents.dta:
- ID (unique identifier)
- Name (String)
- Child ID (unique identifier for exactly 1 child, or no ID if no child)
Children.dta:
- ID (unique identifier for child, Parents.dta refers to this ID in Child ID)
- Child name (String)
Examples of data:
Parents.dta:
ID Name Child ID
1 Richard .
2 Simon 1
Children.dta:
ID Child name
1 Vanessa
2 Tim
I want to join Child ID in Parents.dta to the ID in Children.dta, but only keep the original records in Parents.dta. Then I want to include Child Name in the new datafile.
This should result in the following datafile:
ID Name Child ID Child Name
1 Richard . .
2 Simon 1 Tim
Note that child Vanessa is not included here.
How could I perform such Left Join ? Do you perhaps have any syntax that can help me understand the reasoning?
Thank you very much!
Comment