Hi all,
I have been struggling to reshape one of my datasets from long to wide. It has 3 variables and 8,076,812 observations. I am relatively new to Stata SE. I am using the maximum allowed number of variables, and I have also optimized the variables' storage types without forcing.
Below is a partial listing of the dataset, which is currently in long format. My dupersid variable, which is my primary id for participants, has duplicates. I would like to reshape the dataset so that the narcANALS values are spread wide for each dupersid, leaving no duplicate dupersids. The duplicates arise because each participant has multiple narcANALS records: narcANALS is a 0 or 1 for a certain medication, and participants are receiving prescriptions for multiple medications.
I have used reshape on other datasets and it worked. However, it is not working for the current dataset. The command I ran and the error message I received are shown below. Do I need to split up my dataset?
reshape wide narcANALS, i(dupersid) j(newid)
variable newid takes on too many values
r(134);
     +-----------------------------+
     | dupersid   narcAN~S   newid |
     |-----------------------------|
  1. | 20004103          0       1 |
  2. | 20005101          0       2 |
  3. | 20005101          0       2 |
  4. | 20005101          0       2 |
  5. | 20005101          0       2 |
     |-----------------------------|
  6. | 20005101          0       2 |
  7. | 20005101          0       2 |
  8. | 20005101          0       2 |
  9. | 20005101          0       2 |
 10. | 20005101          0       2 |
     |-----------------------------|
 11. | 20005101          0       2 |
 12. | 20005101          0       2 |
 13. | 20005101          0       2 |
 14. | 20005101          0       2 |
 15. | 20005101          0       2 |
     +-----------------------------+
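One possible reading of the error, offered as a sketch rather than a confirmed fix: in the listing, newid has the same value on every row of a given dupersid, so it appears to identify participants rather than number the records within a participant. Since reshape wide creates one new variable for each distinct value of the j() variable, a counter that restarts at 1 for each participant may be closer to what is needed. The variable name seq below is hypothetical and does not exist in the dataset; the order of records within a participant is arbitrary here because the data shown have no variable to sort them by.

```stata
* Sketch only: seq is a hypothetical name, not a variable in the original data.
* Number each participant's records 1, 2, 3, ... within dupersid.
bysort dupersid: generate long seq = _n

* Each distinct value of seq becomes one new variable after reshaping,
* so check the largest number of records any participant has first.
summarize seq

* newid is constant within dupersid in the listing, so it is not needed as j().
drop newid
reshape wide narcANALS, i(dupersid) j(seq)
```

If the maximum of seq reported by summarize is still larger than the number of variables the installed Stata flavor allows, this approach would hit the same limit, and summarizing narcANALS per participant (for example, a count or an indicator) might be a better target than a fully wide layout.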