Hi,
I would like to know if there is an easy/easier and more systematic way of merging my datasets (m:m) so that outcomes are med not only within ID but also within dates - I hope this example helps:
I have several outcomes that I would like to have in the same dataset. They are all measured multiple times within ID, some every day, some once a week, some once a month.
An example (sorry I don't have access to my data right now, so can't use dataex)
Outcome 1:
ID outcome1 date1
1 56 07sep22
1 87 08sep22
1 76 09sep22
1 89 10sep22
1 63 11sep22
1 76 12 sep22
(etc...)
2 54 07sep22
2 76 08sep22
Outcome 2:
ID outcome2 date2
1 5 07sep22
1 6 12sep22
1 1 19sep22
1 3 26sep22
(etc...)
2 6 7sep22
2 9 12sep22
Is there a way in which I can merge on date within ID so to speak, so that it merges like this:
ID outome1 date1 outcome 2
1 56 07sep22 5
1 87 08sep22 .
1 76 09sep22 .
1 89 10sep22 .
1 63 11sep22 .
1 76 12 sep22 6
etc...
2 54 07sep22 6
2 76 08sep22 .
Currently I can only make them merge like this:
ID outcome1 date1 outcome 2 date2
1 56 07sep22 5 07sep22
1 87 08sep22 6 12sep22
1 76 09sep22 1 19sep22
1 89 10sep22 3 16sep22
1 63 11sep22 .
1 76 12 sep22 .
etc...
2 54 07sep22
2 76 08sep22
So essentially I would like for there to be a row for each observation of outcome1 and for the observations of outcome2 to be inserted on the corresponding date - and for outcome 2 to otherwise be missing.
A related question: I find it confusing when missing values are propagated (I think is the term) with the last observed value when merging m:m, eg if I have 50 observations for outcome1 ID 1 but only 30 on outcome2, stata will just 'repeat' the values of the last row of the outcome 2 dataset when I merge it with outcome1 - is there are way to avoid this? Or should they always be nonmissing when I need to do multilevel mixed-effects linear regression (mixed-command)?
Thank you very much in advance - if helpful I can try to produce a data example.
I would like to know if there is an easy/easier and more systematic way of merging my datasets (m:m) so that outcomes are med not only within ID but also within dates - I hope this example helps:
I have several outcomes that I would like to have in the same dataset. They are all measured multiple times within ID, some every day, some once a week, some once a month.
An example (sorry I don't have access to my data right now, so can't use dataex)
Outcome 1:
ID outcome1 date1
1 56 07sep22
1 87 08sep22
1 76 09sep22
1 89 10sep22
1 63 11sep22
1 76 12 sep22
(etc...)
2 54 07sep22
2 76 08sep22
Outcome 2:
ID outcome2 date2
1 5 07sep22
1 6 12sep22
1 1 19sep22
1 3 26sep22
(etc...)
2 6 7sep22
2 9 12sep22
Is there a way in which I can merge on date within ID so to speak, so that it merges like this:
ID outome1 date1 outcome 2
1 56 07sep22 5
1 87 08sep22 .
1 76 09sep22 .
1 89 10sep22 .
1 63 11sep22 .
1 76 12 sep22 6
etc...
2 54 07sep22 6
2 76 08sep22 .
Currently I can only make them merge like this:
ID outcome1 date1 outcome 2 date2
1 56 07sep22 5 07sep22
1 87 08sep22 6 12sep22
1 76 09sep22 1 19sep22
1 89 10sep22 3 16sep22
1 63 11sep22 .
1 76 12 sep22 .
etc...
2 54 07sep22
2 76 08sep22
So essentially I would like for there to be a row for each observation of outcome1 and for the observations of outcome2 to be inserted on the corresponding date - and for outcome 2 to otherwise be missing.
A related question: I find it confusing when missing values are propagated (I think is the term) with the last observed value when merging m:m, eg if I have 50 observations for outcome1 ID 1 but only 30 on outcome2, stata will just 'repeat' the values of the last row of the outcome 2 dataset when I merge it with outcome1 - is there are way to avoid this? Or should they always be nonmissing when I need to do multilevel mixed-effects linear regression (mixed-command)?
Thank you very much in advance - if helpful I can try to produce a data example.
Comment