Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping long to wide data with multiple unique identifier in a data set

    Dear all,

    I faced a problem while using reshape command for my data.

    reshape wide time_from time_to, i(pid) j(hhid)
    (note: j = 1000101)
    values of variable hhid not unique within pid
    Your data are currently long. You are performing a reshape wide. You specified i(pid)
    and j(hhid). There are observations within i(pid) with the same value of j(hhid). In
    the long data, variables i() and j() together must uniquely identify the observations.

    long wide
    +---------------+ +------------------+
    | i j a b | | i a1 a2 b1 b2 |
    |---------------| <--- reshape ---> |------------------|
    | 1 1 1 2 | | 1 1 3 2 4 |
    | 1 2 3 4 | | 2 5 7 6 8 |
    | 2 1 5 6 | +------------------+
    | 2 2 7 8 |
    +---------------+
    Type reshape error for a list of the problem variables.
    r(9);


    My dataset looks something like this

    time_from time_to hhid pid
    06:30 07:00 1000101 1000101002
    17:00 18:30 1000101 1000101002
    18:30 19:00 1000101 1000101003
    19:00 20:00 1000101 1000101003
    21:30 04:00 1000101 1000101004
    09:00 10:00 1000101 1000101004
    04:00 06:30 1000102 1000102001
    16:00 16:30 1000102 1000102001
    21:00 21:30 1000103 1000103001
    06:00 06:30 1000103 1000103001
    21:30 04:00 1000103 1000103002
    18:30 19:00 1000103 1000103002

    here, (pid) is the unique identifier in a household(hhid) but its observation is repeated because the same person's data is collected over time in a day. when I used the command

    reshape wide time_from time_to, i(pid) j(hhid)

    it says "values of variable hhid not unique within pid"

    how can I generate a unique id for such a dataset? or how can I reshape such dataset from long to wide?

    ThankYou all in advance for the help.


  • #2
    Inspecting your data tableau,l it appears that you have multiple observations for the same hhid and pid and that these are distinguished by the time variables. Under these circumstances it makes no sense to try to reshape the time variables into a wide layout. For example, for hhid 1000101 pid 1000101103, there are two possible values for time_from and two possible values for time_to. But they can't both be in the single variables time_from1000101103 and time_to1000101103. So I think you have something else in mind.

    If you show what you would like your results to look like, I imagine somebody can help you write code for it.

    When posting back, please repost your example data using the -dataex- command, so that it will be easy for whoever helps you to work with. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Finally, before you and somebody else invest more time in this, have you thought through carefully why you want to do this in the first place? Wide layouts usually workout poorly for data management and analysis in Stata. Most Stata commands work best, or only at all, with long layout data. So unless you are quite sure that you are going to be doing something for which Stata requires a wide layout (and there are only a handful of such things), probably best not to sink any effort into puzzling this out.

    Comment


    • #3
      Hi Clyde, thank you for the response.

      The sample dataset looks something like this.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str3 activity_no str5(time_from time_to) str1(multiact simultact majoract) str3 activity_code str1 placeact str2 paidact str1 enterprise str7 HHID str10 PID
      "001" "04:00" "06:00" "2" ""  "1" "911" "1" "01" ""  "1020201" "1020201001"
      "002" "06:00" "06:30" "1" "1" "1" "931" "1" "01" ""  "1020201" "1020201001"
      "003" "06:00" "06:30" ""  ""  "2" "922" "1" "01" ""  "1020201" "1020201001"
      "004" "06:00" "06:30" ""  ""  "2" "741" "1" "01" ""  "1020201" "1020201001"
      "005" "06:30" "07:00" "1" "1" "2" "921" "1" "01" ""  "1020201" "1020201001"
      "006" "06:30" "07:00" ""  ""  "1" "842" "1" "01" ""  "1020201" "1020201001"
      "007" "07:00" "08:00" "2" ""  "1" "415" "1" "02" ""  "1020201" "1020201001"
      "008" "08:00" "08:30" "2" ""  "1" "181" "2" "15" "3" "1020201" "1020201001"
      "009" "08:30" "13:00" "2" ""  "1" "110" "2" "15" "3" "1020201" "1020201001"
      "010" "13:00" "14:00" "2" ""  "1" "142" "2" "15" "3" "1020201" "1020201001"
      "011" "14:00" "18:00" "2" ""  "1" "110" "2" "15" "3" "1020201" "1020201001"
      "012" "18:00" "19:00" "2" ""  "1" "181" "2" "15" "3" "1020201" "1020201001"
      "013" "19:00" "19:30" "1" "1" "1" "842" "1" "01" ""  "1020201" "1020201001"
      "014" "19:00" "19:30" ""  ""  "2" "922" "1" "01" ""  "1020201" "1020201001"
      "015" "19:00" "19:30" ""  ""  "2" "415" "1" "02" ""  "1020201" "1020201001"
      "016" "19:30" "20:30" "1" "1" "1" "413" "1" "02" ""  "1020201" "1020201001"
      "017" "19:30" "20:30" ""  ""  "2" "411" "1" "02" ""  "1020201" "1020201001"
      "018" "20:30" "21:00" "1" "1" "1" "921" "1" "01" ""  "1020201" "1020201001"
      "019" "20:30" "21:00" ""  ""  "2" "711" "1" "01" ""  "1020201" "1020201001"
      "020" "21:00" "21:30" "2" ""  "1" "842" "1" "01" ""  "1020201" "1020201001"
      "021" "21:30" "04:00" "2" ""  "1" "911" "1" "01" ""  "1020201" "1020201001"
      "001" "04:00" "05:00" "2" ""  "1" "911" "1" "01" ""  "1020201" "1020201002"
      "002" "05:00" "05:30" "1" "1" "1" "931" "1" "01" ""  "1020201" "1020201002"
      "003" "05:00" "05:30" ""  ""  "2" "311" "1" "03" ""  "1020201" "1020201002"
      "004" "05:00" "05:30" ""  ""  "2" "922" "1" "01" ""  "1020201" "1020201002"
      "005" "05:30" "06:00" "2" ""  "1" "411" "1" "02" ""  "1020201" "1020201002"
      "006" "06:00" "07:00" "1" "1" "1" "311" "1" "03" ""  "1020201" "1020201002"
      "007" "06:00" "07:00" ""  ""  "2" "312" "1" "03" ""  "1020201" "1020201002"
      "008" "07:00" "07:30" "1" "1" "1" "921" "1" "01" ""  "1020201" "1020201002"
      "009" "07:00" "07:30" ""  ""  "2" "842" "1" "01" ""  "1020201" "1020201002"
      "010" "07:30" "08:00" "1" "2" "1" "321" "1" "03" ""  "1020201" "1020201002"
      "011" "07:30" "08:00" ""  ""  "2" "322" "1" "03" ""  "1020201" "1020201002"
      "012" "07:30" "08:00" ""  ""  "2" "313" "1" "03" ""  "1020201" "1020201002"
      "013" "08:00" "12:00" "2" ""  "1" "139" "2" "16" "1" "1020201" "1020201002"
      "014" "12:00" "12:30" "2" ""  "1" "142" "2" "16" "1" "1020201" "1020201002"
      "015" "12:30" "14:30" "2" ""  "1" "139" "2" "16" "1" "1020201" "1020201002"
      "016" "14:30" "15:00" "1" "1" "1" "921" "1" "01" ""  "1020201" "1020201002"
      "017" "14:30" "15:00" ""  ""  "2" "842" "1" "01" ""  "1020201" "1020201002"
      "018" "15:00" "16:00" "2" ""  "1" "911" "1" "01" ""  "1020201" "1020201002"
      "019" "16:00" "17:30" "2" ""  "1" "413" "1" "02" ""  "1020201" "1020201002"
      "020" "17:30" "18:30" "2" ""  "1" "311" "1" "03" ""  "1020201" "1020201002"
      "021" "18:30" "19:00" "2" ""  "1" "415" "1" "02" ""  "1020201" "1020201002"
      "022" "19:00" "20:00" "2" ""  "1" "842" "1" "01" ""  "1020201" "1020201002"
      "023" "20:00" "20:30" "2" ""  "1" "311" "1" "03" ""  "1020201" "1020201002"
      "024" "20:30" "21:00" "1" "2" "1" "921" "1" "01" ""  "1020201" "1020201002"
      "025" "20:30" "21:00" ""  ""  "2" "312" "1" "03" ""  "1020201" "1020201002"
      "026" "20:30" "21:00" ""  ""  "2" "313" "1" "03" ""  "1020201" "1020201002"
      "027" "21:00" "21:30" "2" ""  "1" "842" "1" "01" ""  "1020201" "1020201002"
      "028" "21:30" "04:00" "2" ""  "1" "911" "1" "01" ""  "1020201" "1020201002"
      "001" "04:00" "05:00" "2" ""  "1" "911" "1" "01" ""  "1020202" "1020202001"
      "002" "05:00" "05:30" "2" ""  "1" "931" "1" "01" ""  "1020202" "1020202001"
      "003" "05:30" "06:00" "2" ""  "1" "211" "2" "04" ""  "1020202" "1020202001"
      "004" "06:00" "06:30" "2" ""  "1" "741" "1" "01" ""  "1020202" "1020202001"
      "005" "06:30" "07:00" "2" ""  "1" "842" "1" "01" ""  "1020202" "1020202001"
      "006" "07:00" "07:30" "1" "2" "2" "181" "2" "13" "1" "1020202" "1020202001"
      "007" "07:00" "07:30" ""  ""  "1" "129" "2" "13" "1" "1020202" "1020202001"
      "008" "07:30" "08:00" "1" "1" "1" "711" "1" "01" ""  "1020202" "1020202001"
      "009" "07:30" "08:00" ""  ""  "2" "411" "1" "02" ""  "1020202" "1020202001"
      "010" "08:00" "08:30" "2" ""  "1" "711" "1" "01" ""  "1020202" "1020202001"
      "011" "08:30" "09:00" "1" "2" "1" "921" "1" "01" ""  "1020202" "1020202001"
      "012" "08:30" "09:00" ""  ""  "2" "922" "1" "01" ""  "1020202" "1020202001"
      "013" "09:00" "12:00" "2" ""  "1" "211" "2" "04" ""  "1020202" "1020202001"
      "014" "12:00" "13:00" "1" "1" "1" "842" "1" "01" ""  "1020202" "1020202001"
      "015" "12:00" "13:00" ""  ""  "2" "931" "1" "01" ""  "1020202" "1020202001"
      "016" "13:00" "15:00" "2" ""  "1" "919" "1" "01" ""  "1020202" "1020202001"
      "017" "15:00" "16:00" "1" "2" "1" "380" "2" "03" ""  "1020202" "1020202001"
      "018" "15:00" "16:00" ""  ""  "2" "371" "3" "03" ""  "1020202" "1020202001"
      "019" "16:00" "17:00" "1" "1" "1" "842" "1" "01" ""  "1020202" "1020202001"
      "020" "16:00" "17:00" ""  ""  "2" "711" "1" "01" ""  "1020202" "1020202001"
      "021" "16:00" "17:00" ""  ""  "2" "922" "1" "01" ""  "1020202" "1020202001"
      "022" "17:00" "18:00" "2" ""  "1" "361" "2" "03" ""  "1020202" "1020202001"
      "023" "18:00" "19:00" "2" ""  "1" "750" "2" "01" ""  "1020202" "1020202001"
      "024" "19:00" "21:00" "1" "2" "1" "722" "2" "01" ""  "1020202" "1020202001"
      "025" "19:00" "21:00" ""  ""  "2" "921" "2" "01" ""  "1020202" "1020202001"
      "026" "21:00" "22:00" "2" ""  "1" "750" "2" "01" ""  "1020202" "1020202001"
      "027" "22:00" "04:00" "2" ""  "1" "911" "1" "01" ""  "1020202" "1020202001"
      "001" "04:00" "05:00" "2" ""  "1" "911" "1" "01" ""  "1020202" "1020202002"
      "002" "05:00" "06:00" "1" "2" "1" "361" "2" "03" ""  "1020202" "1020202002"
      "003" "05:00" "06:00" ""  ""  "2" "931" "1" "01" ""  "1020202" "1020202002"
      "004" "05:00" "06:00" ""  ""  "2" "311" "1" "03" ""  "1020202" "1020202002"
      "005" "06:00" "07:00" "1" "2" "1" "321" "1" "03" ""  "1020202" "1020202002"
      "006" "06:00" "07:00" ""  ""  "2" "322" "1" "03" ""  "1020202" "1020202002"
      "007" "06:00" "07:00" ""  ""  "2" "922" "1" "01" ""  "1020202" "1020202002"
      "008" "07:00" "07:30" "2" ""  "1" "411" "1" "02" ""  "1020202" "1020202002"
      "009" "07:30" "08:30" "1" "1" "1" "311" "1" "03" ""  "1020202" "1020202002"
      "010" "07:30" "08:30" ""  ""  "2" "312" "1" "03" ""  "1020202" "1020202002"
      "011" "08:30" "09:00" "1" "1" "1" "921" "1" "01" ""  "1020202" "1020202002"
      "012" "08:30" "09:00" ""  ""  "2" "922" "1" "01" ""  "1020202" "1020202002"
      "013" "09:00" "10:00" "1" "1" "1" "411" "1" "02" ""  "1020202" "1020202002"
      "014" "09:00" "10:00" ""  ""  "2" "842" "1" "01" ""  "1020202" "1020202002"
      "015" "10:00" "11:00" "2" ""  "1" "711" "1" "01" ""  "1020202" "1020202002"
      "016" "11:00" "12:00" "1" "1" "1" "311" "1" "03" ""  "1020202" "1020202002"
      "017" "11:00" "12:00" ""  ""  "2" "313" "1" "03" ""  "1020202" "1020202002"
      "018" "12:00" "12:30" "1" "1" "1" "842" "1" "01" ""  "1020202" "1020202002"
      "019" "12:00" "12:30" ""  ""  "2" "711" "1" "01" ""  "1020202" "1020202002"
      "020" "12:30" "13:00" "2" ""  "1" "361" "2" "03" ""  "1020202" "1020202002"
      "021" "13:00" "13:30" "1" "2" "1" "411" "1" "02" ""  "1020202" "1020202002"
      "022" "13:00" "13:30" ""  ""  "2" "312" "1" "03" ""  "1020202" "1020202002"
      "023" "13:30" "15:00" "2" ""  "1" "711" "1" "01" ""  "1020202" "1020202002"
      "024" "15:00" "16:30" "2" ""  "1" "413" "1" "02" ""  "1020202" "1020202002"
      end
      Using your instruction I realised that I was using the wrong code to reshape my data. Now I'm using a new code mentioned below.

      Code:
      reshape wide time_from time_to multiact simultact majoract activity_code placeact paidact enterprise, i(PID) j(activity_no) string
      Am I using the correct code? actually, the result I wanted to get is something like for each unique PID the entire activity_no comes in a single row along with the type of activity such as whether it is major_act, minor_act, simult_act etc.

      ThankYou.

      Comment


      • #4
        Yes, that code gives you those results.

        I will reiterate that this arrangement of the data may well prove burdensome as you work with it further in Stata and you will wish you had not done this. Fortunately, if that happens, it is easy enough to restore the original layout by just typing -reshape long-.

        Comment


        • #5
          You are right sir. Actually, I need to merge 1:1 PID with another data set with the same PID which is in wide form. So, I thought to use reshape wide command. I hope I'm in the right direction.
          Last edited by Iqbal Naved; 06 Jul 2021, 15:27.

          Comment


          • #6
            Well, an alternative approach is to -reshape- the other data set to long layout and then -merge 1:1 PID activity_no-. That will leave you with a long data set containing the information from both of those data sets. And, as I can't say often enough, long data layout usually works much better than wide in Stata.

            Comment


            • #7
              Thank you, sir. I shall use -reshape long- for my other data set. Thanks again for your valuable suggestion.

              Comment

              Working...
              X