
  • Reshaping wide error values of variable date not unique within id

    Hi,
    I'm fairly new to Stata. I know this question has been answered before, but I could not make those answers work for me.

    This is the type of data I'm working with:

    id date r i
    1 2002_11 1.307071 .87494
    1 2002_12 1.403008 1.019082
    1 2003_01 1.570926 1.152942
    1 2003_02 1.894784 1.307366
    1 2003_03 1.798847 1.235295

    This is the command I'm using:

    reshape wide r i, i(id) j(date) str

    This is the error:
    values of variable date not unique within id
    Your data are currently long. You are performing a reshape wide. You specified i(id) and j(date). There are observations
    within i(id) with the same value of j(date). In the long data, variables i() and j() together must uniquely identify the
    observations.

    What do I need to change or add?

    Thanks for all replies.

    Best,
    Emerson


  • #2
    Your data are not compatible with this command. There is some value of id for which some value of date appears more than once. To identify the offending observations:

    Code:
    duplicates tag id date, gen(flag)
    browse if flag
    Then you have to decide what to do about it. There are a few possibilities:

    1. There is some other variable that, when used along with id and date, uniquely identifies observations. For example, if the things you are analyzing are firms, there might be a variable identifying branches or subdivisions within them, and there is a unique observation for each firm-branch/division-date combination. In that case, just add the branch/division id to the -i()- option of your -reshape- command and things will work out fine.

    2. These duplicate observations should not exist. Something went wrong with the data management up to this point that led to a corrupted data set. The solution is then to go back over the data management and find where it went wrong, fix it, and re-generate the data set.

    3. These duplicate observations are OK: we did actually observe the same id more than once on the same date, and we don't want to pick out just one of those or combine them into a single observation--we want to preserve all the originals. And there is no subdivision/branch type variable that distinguishes them. It's just multiple replications of the observations of variables r and i on the same id on the same date. In that case, you need to create a unique identifier. The easiest way to do that is:

    Code:
    gen long obs_no = _n
    reshape wide r i, i(obs_no) j(date) string
    But only do this if it's really true that these duplicate observations are all legitimate and there is no other variable in the data set that distinguishes them in a systematic way.
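
    As a minimal sketch of possibility 1: suppose the data contained a variable named branch (a hypothetical name; nothing like it appears in the posted example) that distinguishes the repeated id-date observations. Then -isid- can confirm that the three variables jointly identify the observations before reshaping:

    Code:
    * branch is a hypothetical variable, assumed here for illustration only
    isid id branch date
    reshape wide r i, i(id branch) j(date) string
    -isid- stops with an error if the listed variables do not uniquely identify the observations, so in a do-file the -reshape- line is reached only when the data are ready for it.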

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Also, the example data you show do not actually exhibit the problem for which you require help. Your code runs perfectly well with the example data. Had this been a more complicated problem, it would have been impossible to troubleshoot based on this example. So, in the future, when posting example data, run the code you are having trouble with on the example data and verify that it reproduces the problem you need help with.
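
    One way to run that check, sketched here under the assumption that the first five observations are the ones being posted (the range 1/5 is illustrative):

    Code:
    * show the example observations in a form others can paste into Stata
    dataex id date r i in 1/5
    * try the failing command on just those observations, then restore the full data
    preserve
    keep in 1/5
    capture noisily reshape wide r i, i(id) j(date) string
    restore
    If the -reshape- runs without complaint on the subset, the example does not reproduce the problem and a different set of observations is needed.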



    • #3
      Hi Clyde.
      Creating a unique identifier solved the problem. In the future I'll make sure to post using dataex for more clarity, and to post all the data. Thank you for the detailed response.

