Hi all,
I have a fairly large panel data set (70 quarters so far), and I have to add each new quarter to the file as it comes out. I created a panel id variable with the following code,
Code:
egen double cidnew = group(cid)
where cid is a unique identifier in the data set. As far as I can tell, the only way to keep the panel id consistent is to drop cidnew and rerun that code after appending the new quarter's data. Given the file size, that is very slow, and I am hoping to speed it up. I have considered combining replace with some if conditions to match the panel id to the new cids each quarter, but the set of cids is not the same in every quarter: some leave, some join, and a cid that is missing in the newest quarter can always come back later. So I am wondering whether anyone has ideas on how I could simply extend the panel id variable (cidnew) without dropping it and rerunning the same code. The reason I use cidnew rather than cid as the panel id is that cid cannot be used when declaring the data set to be a panel: it is an alphanumeric string, and if it is converted to all numeric by simply replacing the letters with numbers, it is no longer a unique identifier.
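To make the kind of incremental update described above concrete, here is a rough sketch of one possible approach. It keeps a small crosswalk file from cid to cidnew and only assigns new ids to cids that have not been seen before. The file names panel, newq and xwalk are hypothetical placeholders, not from my actual setup, and this is untested on the real data. Ids assigned this way would not match what a fresh run of group() produces (group() numbers in sort order of cid), but they should stay unique and stable across quarters, which is all a panel id needs.

```stata
* ASSUMPTION: panel.dta = existing panel (already has cidnew),
*             newq.dta  = newly released quarter (has cid, no cidnew),
*             xwalk.dta = crosswalk, one row per cid. All names hypothetical.

* One-time setup: build the crosswalk from the existing panel
use cid cidnew using panel, clear
duplicates drop
save xwalk, replace

* --- Repeat each quarter ---
* Step 1: extend the crosswalk with cids seen for the first time
use cid using newq, clear
duplicates drop
merge 1:1 cid using xwalk
summarize cidnew, meanonly          // r(max) = largest id assigned so far
local maxid = r(max)
sort _merge cid                     // _merge==1: cid only in the new quarter
replace cidnew = `maxid' + sum(_merge == 1) if _merge == 1
drop _merge
save xwalk, replace

* Step 2: attach ids to the new quarter, then append it to the panel
use newq, clear
merge m:1 cid using xwalk, keep(master match) nogenerate
append using panel
save panel, replace
```

Because the crosswalk keeps every cid ever seen, a cid that drops out for a few quarters and then returns would get its old cidnew back rather than a new one.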
Thanks in advance for any help.
