Hi, I'm sure the title of my question is a bit confusing, so let me explain briefly.
I have a panel data set with a number of companies observed over many different years.
Within this panel there may be duplicates with respect to the columns "gvkey" (a company identifier), "year" and "month". However, I do not want to drop these duplicates directly, because they differ in a few other columns, namely "NUMEST", "NUMUP" and "CURCODE".
Nevertheless, for further analysis I need a structure with no duplicates with respect to "gvkey", "year" and "month".
So what I want is the following: I want to keep the values of "NUMEST", "NUMUP" and "CURCODE" rather than throw them out. If there is a duplicate (in the sense described above), I want to move these three variables into the row above, as new columns at the very end of the data set.
Example (Starting Situation):
          gvkey  year  month  NUMEST  NUMUP  CURCODE
Row X:    Apple  2018  06     5       3      EUR
Row X+1:  Apple  2018  06     7       2      GBP
Example (Desired Result):
          gvkey  year  month  NUMEST  NUMUP  CURCODE  NUMEST2  NUMUP2  CURCODE2
Row X:    Apple  2018  06     5       3      EUR      7        2       GBP
My data looks like this:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long gvkey float(year month NUMEST) byte NUMUP str3 CURCODE
1004 1989  3 10  0 "USD"
1004 1989  3 13  0 "USD"
1004 1991  3  7  0 "USD"
1004 1992  3  9  0 "USD"
1004 1993  3  8  0 "USD"
1004 1994  3  4  0 "USD"
1004 1995  3  4  0 "USD"
1004 1995  3  2  0 "USD"
1004 1997  3  6  1 "USD"
1004 1998  3  7  0 "USD"
1004 1999  3  8  3 "USD"
1004 2000  3  7  0 "USD"
1004 2001  3  5  0 "USD"
1004 2001  3  4  1 "USD"
1004 2003  3  2  0 "USD"
1004 2004  3  1  0 "USD"
1004 2005  3  2  2 "USD"
1004 2006  3  5  0 "USD"
1004 2007  3  6  0 "USD"
1004 2008  3  9  3 "USD"
1004 2009  3  6  0 "USD"
1004 2010  3  6  0 "USD"
1004 2011  3  8  6 "USD"
1004 2012  3  5  0 "USD"
1004 2013  3  7  0 "USD"
1004 2014  3  6  0 "USD"
1004 2015  3  3  0 "USD"
1004 2016  3  3  0 "USD"
1004 2017  3  4  0 "USD"
1004 2018  3  5  0 "USD"
1004 2019  3  5  0 "USD"
1009 1989  8  1  0 "USD"
1009 1990  8  1  0 "USD"
1009 1991  8  1  0 "USD"
1009 1992  8  1  0 "USD"
1009 1993  8  1  0 "USD"
1009 1994  8  1  0 "USD"
1009 1995  8  1  0 "USD"
1011 1993 10  1  0 "USD"
1013 1988  8  9  1 "USD"
1013 1988  8  8  2 "USD"
1013 1988  8  5  0 "USD"
1013 1991  8  6  0 "USD"
1013 1992  8  7  0 "USD"
1013 1993  8  8  0 "USD"
1013 1994  8  8  1 "USD"
1013 1995  8 14  2 "USD"
1013 1996  8 17  2 "USD"
1013 1997  8 16  0 "USD"
1013 1998  8 19  9 "USD"
1013 1999  8 24 15 "USD"
1013 2000  8 23  0 "USD"
1013 2001  8 22  0 "USD"
1013 2002  8 17  0 "USD"
1013 2003  8 13  0 "USD"
1013 2004  8 12  3 "USD"
1013 2005  8 13  2 "USD"
1013 2005  8 14  1 "USD"
1013 2005  8 18  2 "USD"
1013 2008  8  8  0 "USD"
1013 2009  8 13 10 "USD"
1013 2010  7 13  1 "USD"
1017 1988 12  2  0 "USD"
end
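For reference, one direction that might work is -reshape wide-. The following is only a rough sketch under my own assumptions, not a tested solution for the full data: "seq" is a hypothetical helper variable that numbers the observations within each gvkey-year-month cell, and the new columns would carry the suffixes 1, 2, 3, ... instead of the exact names in my example above.

```stata
* Sketch only: "seq" is a hypothetical helper name, not a variable in the data.

* Sort on the key while preserving the current order of the duplicates.
sort gvkey year month, stable

* Number the observations within each gvkey-year-month cell (1, 2, 3, ...).
by gvkey year month: generate seq = _n

* Spread the three variables across columns, one set per value of seq.
reshape wide NUMEST NUMUP CURCODE, i(gvkey year month) j(seq)

* Optional: give the first set its original names back.
rename (NUMEST1 NUMUP1 CURCODE1) (NUMEST NUMUP CURCODE)

* Check that gvkey, year and month now uniquely identify each row.
isid gvkey year month
```

Because some cells contain more than two observations (for example gvkey 1013 in 1988 has three), this would create as many sets of columns as the largest cell has rows, with missing values wherever a cell has fewer.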
Since this is the first time I have used dataex in this forum, I hope I did everything right.
I would be very grateful for any suggestions on how to solve this issue, and please also let me know if you see an easier way to do it.
Thank you & have a nice day!