Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • reshape long to wide format

    I am stuck at this point. I would like to transform input data (long format) to wide format (output needed).



    Code:
    ****Input data****
    clear
    input id str5 medicine
    1 "med1"
    1 "med2"
    1 "med5"
    1 "med6"
    1 "med8"
    2 "med1"
    2 "med1"
    2 "med3"
    2 "med4"
    2 "med10"
    end
    
    list
    Code:
    *****Output needed****
    clear
    input id med1 med2 med3 med4 med5 med6 med8 med10
    1 1 1 0 0 1 1 1 0
    2 2 0 1 1 0 0 0 1
    end
    
    list

    I tried to find a unique identifier within a patient's id using _n or _N. It does not work. I do know what I am missing here.

    Code:
    bysort id medicine: gen subindex=_n
    bysort id medicine: gen submax=_N
    Any help in this regards highly appreciated.

    Thank you
    Last edited by Muhammad Faisal; 15 Apr 2019, 04:06.

  • #2
    You need to do a bit of work before and after you reshape, so far as I can see. In particular, there is no sense in which reshape knows about counting observations and those medicines not prescribed are all implicit.

    Code:
    ****Input data****
    clear
    input id str5 medicine
    1 "med1"
    1 "med2"
    1 "med5"
    1 "med6"
    1 "med8"
    2 "med1"
    2 "med1"
    2 "med3"
    2 "med4"
    2 "med10"
    end
    
    gen work = real(substr(medicine, 4, .))
    drop medicine
    contract id work, freq(med) zero 
    reshape wide med , i(id) j(work)
    
    list
    
         +-------------------------------------------------------------+
         | id   med1   med2   med3   med4   med5   med6   med8   med10 |
         |-------------------------------------------------------------|
      1. |  1      1      1      0      0      1      1      1       0 |
      2. |  2      2      0      1      1      0      0      0       1 |
         +-------------------------------------------------------------+
    Last edited by Nick Cox; 15 Apr 2019, 04:25.

    Comment


    • #3
      Thank you Nick! This is really useful.

      One more question is about medicine names. As in my original data, there are medicine names (which are String). Is there any way to use medicine names as variable names (instead of med1,med2,...).

      Comment


      • #4
        Yes; you can do that. Just give a data example.

        Comment


        • #5
          Something like this.

          Code:
          ****Input data****
          clear
          input id str20 medicine
          1 "Vicodin"
          1 "Simvastatin"
          1 "Lisinorpril"
          1 "Levothroxine"
          1 "Azithromycin"
          2 "Vicodin"
          2 "Vicodin"
          2 "Simvastatin"
          2 "Lipitor"
          2 "Amlodipine"
          end


          Code:
           *****Output needed****
          clear
          input id Vicodin Simvastin Lisinorpril Azithromycin Lipitor Amlodipine
          1 1 1 1 1 0 0
          2 2 1 0 0 1 1
          end  
          list

          Last edited by Muhammad Faisal; 15 Apr 2019, 06:43.

          Comment


          • #6
            Code:
            ****Input data****
            clear
            input id str20 medicine
            1 "Vicodin"
            1 "Simvastatin"
            1 "Lisinorpril"
            1 "Levothroxine"
            1 "Azithromycin"
            2 "Vicodin"
            2 "Vicodin"
            2 "Simvastatin"
            2 "Lipitor"
            2 "Amlodipine"
            end
            
            gen work = strtoname(medicine)
            contract id  work, freq(med) zero 
            reshape wide med , i(id) j(work) string 
            rename med* * 
            list

            Comment


            • #7
              Thank you very much!

              Comment

              Working...
              X