  • reshape when sub observation identifier is not unique

    Hi,

    I have a data set as shown below. I need to reshape it from long to wide. The ID should be the naics code, and the stub names are year, act, at, let, ni and sale.
    i(naics) j(fyear)


    As the fyear is not the same across naics STATA gives me the error message below.
    "In the long data, variables i() and j() together must uniquely identify the observations."

    Kindly assist to sort this out.

    Thank you

    naics fyear act at lct ni sale
    332 1982 9.282 25.689 22.503 -6.335 36.01
    332 1983 4.049 8.08 8.627 -2.982 18.753
    332 1984 4.076 8.869 7.352 2.206 21.019
    332 1985 3.139 10.549 6.954 3.011 20.507
    332 1986 3.396 14.16 7.251 2.601 19.266
    336 1990 523.502 1728.888 421.777 -75.287 312.309
    336 1991 687.329 1756.262 353.544 156.959 304.091
    336 1992 640.603 1706.454 469.213 -17.81 320.148
    336 1993 440.279 1665.921 401.525 112.249 348.429
    336 1994 574.3 1815.1 382.9 25.2 380.5
    336 1995 NA 2015.8 NA 84.3 407.2
    513 1984 1.192 5.882 1.073 0.065 4.562
    513 1985 1.889 7.04 1.507 0.097 5.867
    513 1986 2.385 6.502 1.18 -0.68 4.671
    513 1987 0.707 5.035 0.997 -1.303 2.721
    513 1988 0.647 4.982 0.613 0.302 2.589
    513 1989 1.442 7.581 0.927 -0.788 3.287
    513 1990 1.247 7.784 1.246 -1.869 3.635
    513 1991 3.163 8.72 1.555 -1.448 4.465
    513 1992 9.861 17.073 1.671 -2.153 4.664
    513 1993 8.637 24.98 3.698 -3.158 6.49
    513 1994 4.298 67.292 11.189 -9.198 17.739
    331 1980 3.651 7.666 1.255 0.09 10.669
    331 1981 6.049 10.461 2.038 1.396 15.744
    331 1982 7.007 12.266 2.405 0.465 15.413
    331 1983 6.77 12.617 2.098 0.738 19.065
    331 1984 7.458 14.441 3.498 0.15 22.047
    331 1985 7.217 13.054 2.529 0.634 23.921
    331 1986 8.027 14.549 3.148 0.427 23.949
    331 1987 9.299 16.858 5.018 -0.147 27.434

  • #2
    In the dataset you are showing, there is no problem: the combination of naics and fyear uniquely identifies the observations, and the data reshape fine:

    Code:
    . isid naics fyear
    
    . reshape wide act at lct ni sale, i(naics) j(fyear)
    (note: j = 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995)
    
    Data                               long   ->   wide
    -----------------------------------------------------------------------------
    Number of obs.                       30   ->       4
    Number of variables                   7   ->      81
    j variable (16 values)            fyear   ->   (dropped)
    xij variables:
                                        act   ->   act1980 act1981 ... act1995
                                         at   ->   at1980 at1981 ... at1995
                                        lct   ->   lct1980 lct1981 ... lct1995
                                         ni   ->   ni1980 ni1981 ... ni1995
                                       sale   ->   sale1980 sale1981 ... sale1995
    -----------------------------------------------------------------------------



    • #3
      You should try to figure out where and why in your data the combination of naics and fyear does not uniquely identify observations, and then decide what to do about it.
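
      One way to see where the problem lies is to tag the offending observations and list them side by side. This is only a sketch: the variable name dup is made up for illustration, and the variable list assumes the names shown in post #1.

      ```stata
      * Count, for each observation, how many other observations
      * share the same naics/fyear pair (0 means the pair is unique)
      duplicates tag naics fyear, generate(dup)

      * Show only the problem rows, grouped by industry code
      sort naics fyear
      list naics fyear act at lct ni sale if dup > 0, sepby(naics)
      ```

      Typing reshape error immediately after a failed reshape also lists the problem observations.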



      • #4
        "As the fyear is not the same across naics STATA gives me the error message below."
        No, that is not what the error message is telling you. It is telling you that for some naics you have two or more observations with the same fyear. You have a problem with your data.

        You need to use the duplicates command to find the duplicated naics/fyear pairs of observations. See the output of help duplicates for information on how to use the command. A good place to start would be
        Code:
        duplicates report naics fyear
        to see a summary report and understand the scope of the problem, and
        Code:
        duplicates examples naics fyear
        to see which combinations are duplicated.
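
        Once the duplicated pairs have been examined, what to do next depends on why they exist, and nothing in this thread says which case applies. The sketch below assumes two common situations: rows that are exact copies of each other, and several distinct rows that could reasonably be averaged within a naics/fyear cell. Averaging is an assumption made for illustration only, and it requires that act, at, lct, ni and sale are numeric (with NA stored as Stata missing values).

        ```stata
        * Case 1: drop observations that are exact copies on every variable
        duplicates drop

        * Check whether naics and fyear now identify the observations
        capture isid naics fyear
        if _rc != 0 {
            * Case 2 (illustrative assumption): average the remaining
            * duplicates so each naics/fyear pair appears once
            collapse (mean) act at lct ni sale, by(naics fyear)
        }

        * With unique naics/fyear pairs the reshape can proceed
        reshape wide act at lct ni sale, i(naics) j(fyear)
        ```

        If the duplicates instead reflect a missing identifier (for example, several firms within one naics code), the better fix is probably to add that identifier to i() rather than to drop or average anything.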



        • #5
          Reviewing post #1 yet again, I see

          "stub names are year, act, at, let, ni and sale"
          There is no variable called "year" in the sample data shown, and fyear belongs in j(), not among the stubs. Your reshape command should look like the one presented in post #2, with five stub variables.
