  • Reshape - variable names currently as values

    I have a problem with my data structure and would like to reshape the data into wide format.
    Currently, the name of each question is stored as a value of the variable "question", and the associated response is stored on the same line in the variable "answers". Line 1 holds question 1 for person 1, line 15 holds question 1 for person 2, and so on.

    Is it possible to transform this into a wide data structure (i.e. question

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str68 question str52 answers double id float date
    "Anbefalet tidspunkt for påbegyndelse af genoptræning:"              "2019-02-14 00:00:00.0000000"                          1 21588
    "Borger indforstået med henvendelse"                                  "Nej"                                                  1 21588
    "Er udskrivningsdato angivet:"                                         "Ja"                                                   1 21588
    "Genoptræningsniveau:"                                                "avanceret"                                            1 21588
    "Genoptræningsplan modtaget fra?"                                     "sydvestjysk sygehus – esbjerg, grinsted og brørup" 1 21588
    "Henvendelseskilde"                                                    "Pårørende"                                          1 21588
    "Henvisning/henvendelse modtaget"                                      "2019-02-08 00:00:00.0000000"                          1 21588
    "Henvist til:"                                                         "§140 genoptræning"                                  1 21588
    "Leverandør"                                                          "Geriatrisk team"                                      1 21588
    "Overordnet diagnose §86/140"                                         "o - knæ"                                             1 21588
    "Supplerende oplysninger"                                              "suppl.oplysnin"                                       1 21588
    "Tilbudt førstegangsbesøg / Tidspunkt for opstart af genoptræning:" "2019-02-15 00:00:00.0000000"                          1 21588
    "Uddyb kilde med navn, telefonnummer mm"                               "kildetekst"                                           1 21588
    "Udskrivningsdato:"                                                    "2019-01-30 00:00:00.0000000"                          1 21588
    "Anbefalet tidspunkt for påbegyndelse af genoptræning:"              "2019-02-14 00:00:00.0000000"                          2 21588
    "Borger indforstået med henvendelse"                                  "Ja"                                                   2 21589
    "Er udskrivningsdato angivet:"                                         "Nej"                                                  2 21589
    "Genoptræningsniveau:"                                                "Basal"                                                2 21589
    "Genoptræningsplan modtaget fra?"                                     "OUH"                                                  2 21589
    "Henvendelseskilde"                                                    "Læge"                                                 2 21589
    "Henvisning/henvendelse modtaget"                                      "2019-03-08 00:00:00.0000000"                          2 21589
    "Henvist til:"                                                         "§190"                                                2 21589
    "Leverandør"                                                          "Neurologisk team"                                     2 21589
    "Overordnet diagnose §86/140"                                         "H - hofte"                                            2 21589
    "Supplerende oplysninger"                                              "suppl. oplysninger person 2"                          2 21589
    "Tilbudt førstegangsbesøg / Tidspunkt for opstart af genoptræning:" "2019-02-18 00:00:00.0000000"                          1 21589
    "Uddyb kilde med navn, telefonnummer mm"                               "kildetekst person2"                                   2 21589
    "Udskrivningsdato:"                                                    "2019-02-30 00:00:00.0000000"                          2 21589
    end
    format %td date

    I would like something like this:
    ID date Anbefalet tidspunkt for påbegyndelse af genoptræning: Borger indforstået med henvendelse etc etc
    1 08feb2019 2019-02-14 00:00:00.0000000 Nej
    2 09feb2019 2019-02-14 00:00:00.0000000 Ja

  • #2
    The code is simpler if you don't need to preserve the questions in full, but I think this works:

    Code:
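    * Build a legal name from each question (stub "_" + 31 characters = 32-character limit)
    gen j = substr(strtoname(question), 1, 31)
    * Store the distinct full question texts (sorted) for labelling later
    qui levelsof question, loc(labels)
    drop question
    rename answers _
    reshape wide _, i(id date) j(j) string
    * Strip the leading underscore (note the space between the two patterns)
    rename _* *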
    
    * This tries to preserve the question names; it relies on the sort order
    * being unique by the abbreviated variable names
    
    unab vars: _all
    local i id date
    local vars: list vars - i
    local vars: list sort vars
    forvalues i = 1 / `:list sizeof vars' {
        gettoken lbl labels: labels
        gettoken var vars:   vars
        label var `var' `"`lbl'"'
    }
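
    If you want to check the assumption mentioned in the comment above, here is a minimal sketch. It uses two question/answer pairs copied from your example as toy data, and the two assert lines are my own suggestion rather than part of the solution. They would need to run before question is dropped.

    ```stata
    * Toy data: two rows taken from the example in #1
    clear
    input str40 question str10 answers byte id
    "Henvendelseskilde" "Læge" 1
    "Henvist til:"      "§190" 1
    end

    gen j = substr(strtoname(question), 1, 31)

    * Each abbreviated name should map to exactly one question text
    bysort j (question): assert question[1] == question[_N]

    * The abbreviated names should sort in the same order as the questions
    sort question
    assert j[_n-1] <= j if _n > 1
    ```

    If either assert fails on the real data, the labels in the loop could end up attached to the wrong variables.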


    • #3
      Thanks a lot. The first five lines solve my primary issues regarding the data structure.
      However, "rename _**" prompts the error "_ ambiguous abbreviation". Apart from that, the first section works really well.

      I am able to run the commands in the second section (from "unab vars: _all" onward), but I cannot see that they make any changes, and I do not really understand what these commands are intended to do.


      • #4
        "rename _**" is not the syntax; rather, it is "rename _* *" (notice the space). It is meant to remove the leading underscore from your variable names.

        The second section applies labels to the variables; it should be run in the same script as the first section, because it uses the local macro created there. Since variable names can be at most 32 characters long and cannot contain certain characters, some information is lost in the reshape. However, you can label each variable to indicate what its original name was; if you type "desc", you should see that the variable labels contain the original questions in full.
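
        To see both steps in isolation, here is a minimal sketch on toy data. The variable names _consent and _source are made up for illustration and are not from your dataset; only the label text is one of your questions.

        ```stata
        * Toy data (hypothetical): two variables with a leading underscore
        clear
        set obs 2
        gen id = _n
        gen str3 _consent = cond(_n == 1, "Nej", "Ja")
        gen str6 _source  = cond(_n == 1, "family", "doctor")

        * "_*" matches every variable starting with an underscore;
        * the "*" in the new name receives whatever the first "*" matched
        rename _* *

        * Attach the full question text as a variable label
        label var consent "Borger indforstået med henvendelse"
        describe
        ```

        After this, describe should list the variables without the underscore and show the full question text in the label column for consent.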


        • #5
          OK thanks a lot for the elaboration
