Hello everyone, this is my first post on here so my apologies if the format is not up to the usual standards.
I have a data set covering a time span between 1998 and 2016 with about 65.000 observations that may be divided into 15 groups (Dutch political parties) The aspects that are relevant for this question are listed below:
What I want to do is create a variable that is the sum of the variable 'success' per 'partyid'(numeric reflection of sponsorparty).
I manually added the variable 'cumsuccess' in the table above manually to better illustrate what I mean. As you can see, the variable 'success' is just 0's and 1's to illustrate rejection (0) or success (1). The numbers in cumsuccess should reflect the cumulative sum of success per party. Which is why 'CDA' is at 3 and other parties are not.
I hope this makes sense. Please tell me if I could make some things more clear. Thank you!
I have a data set covering a time span between 1998 and 2016 with about 65.000 observations that may be divided into 15 groups (Dutch political parties) The aspects that are relevant for this question are listed below:
Code:
clear input byte order str9 date str8 time byte partyid str10 sponsor byte(success cumsuccess) 1 "26aug1998" "08:11:41" 5 "CDA" 0 0 2 "26aug1998" "08:12:12" 10 "GroenLinks" 0 0 3 "26aug1998" "08:18:51" 5 "CDA" 0 0 4 "26aug1998" "08:46:35" 10 "GroenLinks" 0 0 5 "26aug1998" "09:00:21" 10 "GroenLinks" 0 0 6 "08sep1998" "07:41:39" 5 "CDA" 1 1 7 "08sep1998" "08:50:57" 5 "CDA" 0 1 8 "08sep1998" "09:05:53" 24 "SP" 0 0 9 "17sep1998" "07:39:19" 10 "GroenLinks" 0 0 10 "17sep1998" "07:44:20" 25 "VVD" 0 0 11 "17sep1998" "07:59:09" 7 "D66" 0 0 12 "17sep1998" "08:03:12" 20 "PvdA" 0 0 13 "17sep1998" "08:12:32" 10 "GroenLinks" 0 0 14 "17sep1998" "08:12:55" 23 "SGP" 1 1 15 "17sep1998" "08:14:15" 24 "SP" 0 0 16 "17sep1998" "08:16:44" 5 "CDA" 0 1 17 "17sep1998" "08:21:38" 20 "PvdA" 0 0 18 "17sep1998" "08:22:40" 5 "CDA" 1 2 19 "17sep1998" "08:24:20" 5 "CDA" 0 2 20 "17sep1998" "08:27:57" 20 "PvdA" 1 1 21 "17sep1998" "08:42:09" 10 "GroenLinks" 0 0 22 "17sep1998" "08:44:11" 7 "D66" 0 0 23 "17sep1998" "08:54:14" 20 "PvdA" 0 1 24 "17sep1998" "08:55:03" 10 "GroenLinks" 0 0 25 "17sep1998" "09:02:38" 24 "SP" 0 0 26 "17sep1998" "09:08:47" 7 "D66" 0 0 27 "22sep1998" "07:25:52" 24 "SP" 1 0 28 "22sep1998" "09:37:04" 10 "GroenLinks" 1 1 29 "29sep1998" "07:15:41" 20 "PvdA" 0 1 30 "29sep1998" "07:22:35" 5 "CDA" 1 3 end drop cumsuccess sort partyid order bys partyid:gen cumsuccess=sum(success) sort order
I manually added the variable 'cumsuccess' in the table above manually to better illustrate what I mean. As you can see, the variable 'success' is just 0's and 1's to illustrate rejection (0) or success (1). The numbers in cumsuccess should reflect the cumulative sum of success per party. Which is why 'CDA' is at 3 and other parties are not.
I hope this makes sense. Please tell me if I could make some things more clear. Thank you!
Comment