problem multiple observations panel data

Josephine Bijl

Join Date: Apr 2017

Posts: 5
#1

27 Apr 2017, 08:38

Dear everyone,

I am new to Stata and am currently working on my thesis, for which I use panel data.

In my data I would like to measure how life events of people and their peers are related to an outcome on a well-being scale. My problem is that I have several observations per ID per year.
Because of this, Stata does not recognize my data as panel data. I've read that I can reshape my data from long to wide. However, since I have many years, I would like to avoid this: there can be up to 15 observations per year and up to 15 years, which would make the data quite inconvenient to work with.

I read that it may also be possible to create a new variable instead; I would like some advice on this.

To make it more clear:
For each ID, I can have observations for up to 15 years. Within one year, there are 3 different persons who can experience a life event: self, partner, or family member. There are 5 different types of life event that can occur.

I was thinking about moving the data to another file, generating 15 dummy variables, one for each of the 3x5 possible combinations (self + life event 1, self + life event 2, ..., partner + life event 1, partner + life event 2, ...), and then merging them back into the data on ID and year, so that I would have all the possible observations on one row.

Thank you for your advice!!

To give an idea, here is an excerpt of the data:

[CODE]
clear
input long ID int year long(event_who2 event_what2) float CESDtot
61100002 1996 2 4 65
61100002 1996 2 2 65
61100002 2002 2 4 61
61100002 2002 2 2 61
61100002 2005 3 4 70
61100002 2008 . 0 73
61100002 2012 3 18 74
61100003 1996 2 2 66
61100003 1999 . 0 64
61100003 2002 4 4 62
61100003 2005 . 0 65
61100003 2008 2 4 63
61100003 2008 2 2 63
61100003 2012 4 4 60
61100005 1996 2 2 47
61100007 1996 2 4 67
61100007 1999 . 0 76
61100007 2002 4 18 75
61100007 2005 . 0 76
61100009 1999 . 0 74
61100009 2002 2 2 68
61100009 2005 . 0 73
61100009 2008 . 0 73
61100009 2012 2 2 68
61100009 2012 4 4 68
61100010 1996 . 0 72
61100010 1999 . 0 65
61100010 2005 3 4 65
61100010 2005 2 2 65
61100010 2005 3 18 65
61100010 2005 2 17 65
61100010 2005 4 18 65
61100011 2005 . 0 61
61100011 2008 . 0 60
61100011 2012 . 0 63
61100012 1999 . 0 68
61100012 2005 4 4 75
61100013 1996 . 0 46
61100013 1999 . 0 46
61100013 2005 . 0 48
61100014 1999 . 0 68
61100014 2005 3 19 62
61100014 2008 . 0 70

end
label values event_who2 event_who2
label def event_who2 2 "Member of family", modify
label def event_who2 3 "Spouse or partner", modify
label def event_who2 4 "Yourself", modify
label values event_what2 event_what2
label def event_what2 2 "Death", modify
label def event_what2 4 "Hospitalisation", modify
label def event_what2 17 "Separation or divorce", modify
label def event_what2 18 "Termination of activity or retirement", modify
label def event_what2 19 "Unemployment", modify
[/CODE]
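The indicator idea described above (one dummy per person/event combination) could be sketched roughly as follows. This is only an illustration on a few rows copied from the excerpt: the variable names `ev_<who>_<what>` are made up here, and the loops cover only the categories that actually occur in the data, not necessarily all 3x5 combinations.

```stata
* Minimal sketch: one 0/1 indicator per (who, what) combination.
* The ev_* names are hypothetical; rows are copied from the excerpt above.
clear
input long ID int year byte(event_who2 event_what2) float CESDtot
61100002 1996 2  4 65
61100002 1996 2  2 65
61100002 2005 3  4 70
61100002 2008 .  0 73
61100010 2005 3  4 65
61100010 2005 2  2 65
61100010 2005 3 18 65
61100010 2005 2 17 65
61100010 2005 4 18 65
end

* Collect the observed categories; levelsof skips missing values,
* and event_what2 == 0 (no event) is excluded explicitly.
levelsof event_who2, local(whos)
levelsof event_what2 if event_what2 > 0, local(whats)

* One indicator per combination; rows without an event get 0 everywhere.
foreach w of local whos {
    foreach e of local whats {
        generate byte ev_`w'_`e' = (event_who2 == `w' & event_what2 == `e')
    }
}

list ID year event_who2 event_what2 ev_2_4 ev_3_18, sepby(ID year)
```

The data still have several rows per ID and year after this step; the indicators only become "one row per ID-year" once they are aggregated, for example with -collapse-.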
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

27 Apr 2017, 09:16

Josephine:
welcome to the list.
If the order of the observations is irrelevant, you can simply -xtset- your data without a -timevar-, that is:

Code:

xtset ID

See -help xtset- and the related entry in the Stata .pdf manual for more details.
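As a rough illustration of what this does and does not give you, here is a small sketch using rows from the excerpt in #1. The fixed-effects regression is only a placeholder model chosen for the example, not a recommendation; note also that the repeated rows count the same CESDtot score more than once.

```stata
* Minimal sketch: -xtset- with and without a time variable.
clear
input long ID int year byte(event_who2 event_what2) float CESDtot
61100002 1996 2  4 65
61100002 1996 2  2 65
61100002 2002 2  4 61
61100002 2002 2  2 61
61100002 2005 3  4 70
61100002 2008 .  0 73
61100002 2012 3 18 74
61100003 1996 2  2 66
61100003 1999 .  0 64
61100003 2002 4  4 62
61100003 2005 .  0 65
61100003 2008 2  4 63
61100003 2008 2  2 63
61100003 2012 4  4 60
end

* With a time variable, the repeated ID-year rows make -xtset- fail
capture noisily xtset ID year

* With the panel variable only, -xtset- succeeds
xtset ID

* Estimators that do not rely on time ordering can be used ...
xtreg CESDtot i.event_what2, fe

* ... but time-series operators such as L. cannot, as no timevar is set
capture noisily generate lagCESD = L.CESDtot
```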

Kind regards,
Carlo
(Stata 19.0)
Josephine Bijl

Join Date: Apr 2017

Posts: 5
#3

27 Apr 2017, 09:36

Dear Carlo,

Thank you for your reply.
Unfortunately, I also want to take into account how my well-being variable evolves over time after a shock, so I would need to take the order into account.

Best,

Josephine
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#4

27 Apr 2017, 09:49

Josephine:
another approach would be -collapse-.
If you go that route, you should recode your events (e.g., hospitalisation + death for the same ID in the same year) and calculate the mean of CESDtot per ID per year.
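A minimal sketch of that route, using rows from the excerpt in #1. The indicator names `death` and `hosp` and the wave index are assumptions made for this example; in particular, `egen wave = group(year)` numbers the survey years consecutively and so assumes every wave appears somewhere in the data.

```stata
* Minimal sketch: collapse to one row per ID-year, then declare the panel.
clear
input long ID int year byte(event_who2 event_what2) float CESDtot
61100003 1996 2 2 66
61100003 1999 . 0 64
61100003 2002 4 4 62
61100003 2005 . 0 65
61100003 2008 2 4 63
61100003 2008 2 2 63
61100003 2012 4 4 60
61100009 1999 . 0 74
61100009 2002 2 2 68
61100009 2005 . 0 73
61100009 2008 . 0 73
61100009 2012 2 2 68
61100009 2012 4 4 68
end

* Event-level indicators (illustrative names)
generate byte death = (event_what2 == 2)
generate byte hosp  = (event_what2 == 4)

* One row per ID-year: an indicator is 1 if the event occurred at all
* in that year; CESDtot is constant within ID-year, so its mean is itself
collapse (max) death hosp (mean) CESDtot, by(ID year)

* ID-year is now unique, so the time variable is accepted
isid ID year
xtset ID year

* The survey years are unevenly spaced, so index the waves before using lags
egen wave = group(year)
xtset ID wave
generate byte death_prev = L.death   // death reported in the previous wave

list, sepby(ID)
```

With the time variable set, the ordering within each ID is available, so lagged event indicators like `death_prev` can be used to look at how the score behaves after an event.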

Kind regards,
Carlo
(Stata 19.0)
Josephine Bijl

Join Date: Apr 2017

Posts: 5
#5

27 Apr 2017, 10:32

Thank you for your help. I was also thinking about using -collapse- (I have just created the dummy variables for each unique combination), but wouldn't that mean that I lose a lot of other information, such as gender?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#6

27 Apr 2017, 10:40

Josephine:
not necessarily so, as you can include -gender- inside the parentheses of the -by()- option.

Kind regards,
Carlo
(Stata 19.0)
Josephine Bijl

Join Date: Apr 2017

Posts: 5
#7

27 Apr 2017, 10:44

Thank you Carlo. I have tried it now, but for some reason Stata says that my variable is missing.

I created the dummies and then ran the command:

collapse (sum) < new vars> by ID year CESDtot gender

Sorry for all my questions, I hope you can help me with this?
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#8

27 Apr 2017, 11:01

Please, as requested in the FAQ, post exactly what you typed and exactly what Stata responded. Assuming that what you show in #7 is approximately correct (it can't be fully correct), you need to follow the instructions in the help file: options come after a comma, and your "by" variables go inside the parentheses of by().
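To illustrate the syntax point, here is a small sketch. The indicator names `ev_death` and `ev_hosp` are hypothetical stand-ins for the new variables in #7, and the `gender` values are invented for the example; the other values follow the excerpt in #1.

```stata
* Minimal sketch: -collapse- with the grouping variables inside by().
* ev_death, ev_hosp and the gender values are made up for illustration.
clear
input long ID int year byte(ev_death ev_hosp) float CESDtot byte gender
61100002 1996 0 1 65 1
61100002 1996 1 0 65 1
61100002 2005 0 1 70 1
61100010 2005 0 1 65 2
61100010 2005 1 0 65 2
61100010 2005 0 0 65 2
end

* Roughly as in #7: without the comma and parentheses, "by" is read as
* part of the variable list, so Stata rejects the command
capture noisily collapse (sum) ev_death ev_hosp by ID year CESDtot gender

* Options follow a comma; the grouping variables go inside by()
collapse (sum) ev_death ev_hosp, by(ID year CESDtot gender)

list, sepby(ID)
```

Variables listed in by() are carried through unchanged, which works here because CESDtot and gender are constant within each ID-year; a by() variable that varied within ID-year would split that ID-year into several rows again.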
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#9

27 Apr 2017, 12:05

Josephine:
I share Rich's point.
What you typed, as it stands, cannot be completely correct.
Replying helpfully with so little information would require a huge amount of (in all likelihood useless) guesswork.

Kind regards,
Carlo
(Stata 19.0)
Josephine Bijl

Join Date: Apr 2017

Posts: 5
#10

28 Apr 2017, 02:55

I am sorry that I did not post the code clearly. I will definitely do so in the future. I wanted to let you know that I have now been able to use the -collapse- command.
Thank you so much for your help!