Dear everyone,
I am new to Stata and am working on my thesis, for which I use panel data.
In my data I would like to measure how life events of people and their peers are related to a score on a wellbeing scale. My problem is that I have several observations per ID per year.
Because of this, Stata does not recognize my data as panel data. I've read that I could reshape the data from long to wide, but I would like to avoid this: I can have up to 15 different observations per year and up to 15 different years, which would make the data quite inconvenient to work with.
I have also read that it may be possible to create a new variable instead, and I would like some advice on this.
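For context, the repeated ID-year rows behind the problem can be inspected with something like the following. This is only a sketch: it assumes the variable names from the data excerpt further down, and dup is a made-up name for the helper variable.

[CODE]
* Count how many ID-year combinations occur more than once.
duplicates report ID year

* Tag rows that share an ID-year with at least one other row, then list them.
* (dup is a hypothetical variable name.)
duplicates tag ID year, generate(dup)
list ID year event_who2 event_what2 CESDtot if dup > 0, sepby(ID year)

* With repeated ID-year rows, xtset stops with
* "repeated time values within panel" (r(451)).
capture noisily xtset ID year
[/CODE]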
To make it clearer:
For each ID, I can have observations over more than 15 years. Within one year, there are 3 different persons who can have a life event: self, partner, or family member. There are 5 different types of life events that can occur.
I was thinking of extracting the data into another file, generating 15 dummy variables, one for each of the 3x5 possible combinations (self + life event 1, self + life event 2, ..., partner + life event 1, partner + life event 2, ...), and then merging them back into the data by ID and year, so that all the information for one ID and year would be on a single row.
Thank you for your advice!!
To give an idea, here is an excerpt of the data:
[CODE]
input long ID int year long(event_who2 event_what2) float CESDtot
61100002 1996 2 4 65
61100002 1996 2 2 65
61100002 2002 2 4 61
61100002 2002 2 2 61
61100002 2005 3 4 70
61100002 2008 . 0 73
61100002 2012 3 18 74
61100003 1996 2 2 66
61100003 1999 . 0 64
61100003 2002 4 4 62
61100003 2005 . 0 65
61100003 2008 2 4 63
61100003 2008 2 2 63
61100003 2012 4 4 60
61100005 1996 2 2 47
61100007 1996 2 4 67
61100007 1999 . 0 76
61100007 2002 4 18 75
61100007 2005 . 0 76
61100009 1999 . 0 74
61100009 2002 2 2 68
61100009 2005 . 0 73
61100009 2008 . 0 73
61100009 2012 2 2 68
61100009 2012 4 4 68
61100010 1996 . 0 72
61100010 1999 . 0 65
61100010 2005 3 4 65
61100010 2005 2 2 65
61100010 2005 3 18 65
61100010 2005 2 17 65
61100010 2005 4 18 65
61100011 2005 . 0 61
61100011 2008 . 0 60
61100011 2012 . 0 63
61100012 1999 . 0 68
61100012 2005 4 4 75
61100013 1996 . 0 46
61100013 1999 . 0 46
61100013 2005 . 0 48
61100014 1999 . 0 68
61100014 2005 3 19 62
61100014 2008 . 0 70
end
label values event_who2 event_who2
label def event_who2 2 "Member of family", modify
label def event_who2 3 "Spouse or partner", modify
label def event_who2 4 "Yourself", modify
label values event_what2 event_what2
label def event_what2 2 "Death", modify
label def event_what2 4 "Hospitalisation", modify
label def event_what2 17 "Separation or divorce", modify
label def event_what2 18 "Termination of activity or retirement", modify
label def event_what2 19 "Unemployment", modify
[/CODE]
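The idea described above (15 dummy variables, one row per ID and year) could perhaps be sketched without a separate file and merge, by generating the indicators in place and collapsing. This is a minimal sketch, not a definitive solution: it assumes the variable names and codes from the excerpt above, that CESDtot is constant within each ID-year (as it is in the excerpt), and the ev_* indicator names are made up for illustration.

[CODE]
* One 0/1 indicator per who x what combination (3 x 5 = 15 variables).
* Rows with no event (event_who2 missing, event_what2 == 0) get 0 everywhere.
foreach w of numlist 2 3 4 {
    foreach e of numlist 2 4 17 18 19 {
        generate byte ev_`w'_`e' = (event_who2 == `w' & event_what2 == `e')
    }
}

* Assumption check: the outcome should not vary within an ID-year.
bysort ID year (CESDtot): assert CESDtot[1] == CESDtot[_N]

* Keep the maximum of each indicator (1 if that event occurred at all
* in that ID-year) and one copy of the outcome.
collapse (max) ev_* (first) CESDtot, by(ID year)

* ID-year is now unique, so the panel can be declared.
isid ID year
xtset ID year
[/CODE]

Note that the years in the excerpt are not evenly spaced (1996, 1999, 2002, ...), so time-series operators such as L. would refer to the previous calendar year rather than the previous wave; a consecutive wave counter might be a better time variable if lags are needed.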