The only change you need to make to Mike's code in post #9 is to leave out the line that creates the stateabbrev variable, because that variable already exists in your new dataset.
I should add that this assumes you want to start from the values held in your variable congdist1.
I mention this because two of your variables, statecdnew and stateabbrevnew, hold conflicting information: in some cases the district number is different, and in some cases even the state is different.
Unless you are certain that this is how it is supposed to be, you might want to trace back whether someone bodged an earlier data manipulation step.
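A quick way to see how widespread the conflict is would be to count and list the mismatching observations. This is only a sketch: it assumes stateabbrev and stateabbrevnew are both string variables that store the state abbreviation in the same format, which I cannot confirm from your data example.
Code:
// assumption: stateabbrev and stateabbrevnew are both strings in the same format
count if stateabbrevnew != stateabbrev
list congdist1 statecdnew stateabbrevnew if stateabbrevnew != stateabbrev
If the count is zero for the state, the remaining conflicts are limited to the district number.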
The adjusted code is:
Code:
// working from your most recent data example
split congdist1, generate(temp) parse(" ")
*gen str stateabbrev = temp1
gen distnum = real(temp2) // -destring- also would work
drop temp1 temp2
sort stateabbrev distnum