I am using Stata 14.1 on Windows 10 and am trying to convert a dataset from wide format to long format.
After importing the data from an Excel file into Stata, I converted the variables from string to numeric so that I can run operations on them. I then confirmed that the data are in wide format by running the command "reshape wide", which displays 'already wide'.
However, when I run the command "reshape long V, i(DES) j(time)", where V is the stub for the year variables (named V8, V9, ..., V53), DES is a variable with approximately 9 categories of income sources, and time is just the name for the new year variable (no such variable is defined yet), I get the following error:
(note: j = 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53)
variable id does not uniquely identify the observations
Your data are currently wide. You are performing a reshape long. You specified i(DES) and j(time). In the current wide form, variable DES should uniquely
identify the observations. Remember this picture:
      long                               wide
+---------------+                   +------------------+
| i   j   a   b |                   | i   a1 a2  b1 b2 |
|---------------| <--- reshape ---> |------------------|
| 1   1   1   2 |                   | 1    1  3   2  4 |
| 1   2   3   4 |                   | 2    5  7   6  8 |
| 2   1   5   6 |                   +------------------+
| 2   2   7   8 |
+---------------+
Type reshape error for a list of the problem observations.
Typing the "reshape error" command then shows the following output:
reshape error
(note: j = 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53)
i (DES) indicates the top-level grouping such as subject id.
The data are currently in the wide form; there should be a single
observation per i.
1674 of 1674 observations have duplicate i values:
+--------------------------------------------------------------------+
| DES |
|--------------------------------------------------------------------|
1. | Employer contributions for employee pension and insurance funds 6/ |
2. | Employer contributions for employee pension and insurance funds 6/ | ....
53. | Employer contributions for employee pension and insurance funds 6/ |
54. | Employer contributions for employee pension and insurance funds 6/ |
55. | Employer contributions for government social insurance |
|--------------------------------------------------------------------|
56. | Employer contributions for government social insurance |
57. | Employer contributions for government social insurance |
58. | Employer contributions for government social insurance | ...
I suspect the error arises because the variable DES should be coded as dummy categories, since each DES value repeats across many rows, but I am not sure how to proceed.
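For concreteness, here is a minimal sketch of the checks I believe are relevant. It assumes that each row of the wide data is a separate series and that DES repeats only because some other column distinguishes the rows. The variable rowid is a hypothetical name made up for illustration, and I have not confirmed that this is the right fix.

```stata
* Sketch only: rowid is a hypothetical name, not a variable in my data.

* Check whether DES alone identifies the rows.
* isid exits with an error if it does not; capture noisily shows the
* message without stopping the do-file.
capture noisily isid DES

* Report how many rows share the same DES value.
duplicates report DES

* Assumption: every row is a distinct series, so the row number can serve as i().
generate long rowid = _n
isid rowid

* DES is not part of the stub V, so reshape treats it as constant within
* rowid and carries it along into the long data.
reshape long V, i(rowid) j(time)
```

If a combination of existing variables (DES plus whatever distinguishes the repeated rows) already identifies each row, I assume that combination could be listed in i() instead of a generated row number.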
Any comments and suggestions would be welcome. Thank you.