Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging variable labels with label values

    Hi,
    I was wondering if there is a way to merge the variable names (labels) in the first column with the label values in the last column.
    Any help would be greatly appreciated.
    Thank you,
    Nikos
    Variable Name Data Type Value Codes
    WHO_1 Numeric 1 = Mand
    2 = Kvinde
    WHO_2 Numeric 1 = By
    2 = Villa-område
    3 = Landdistrikterne
    WHO_3 Numeric none
    Crohn_1 Numeric none
    Crohn_2 Numeric none
    GQ_1a Numeric 1 = Nej - jeg tager ingen andre lægemidler
    2 = Ja
    GQ_1b String none
    GQ_2a Numeric 1 = Ingen andre lægemidler på en regelmæssig basis
    2 = Ja
    GQ_2b String none
    GQ_3a Numeric 1 = Ja
    2 = Nej
    GQ_3b_1 Numeric 1 = Yes
    2 = No
    GQ_3b_2 Numeric 1 = Yes
    2 = No
    GQ_3b_3 Numeric 1 = Yes
    2 = No
    GQ_3b_4 Numeric 1 = Yes
    2 = No

  • #2
    I'm not sure what you are looking for but I think that -numlabel- might get you at least partway there; see
    Code:
    help numlabel

    Comment


    • #3
      Do you want the value labels to appear inside the variable label? That could work for your examples but not in general, as there would not be enough space.

      Comment


      • #4
        Thank you both.

        -I would like to append the values in the third column as values of the variable label in the first column.

        -Then create a label list which I can use for the dataset with the same var names as in the first column. However, the variables in the dataset will be in columns one next to the other.
        There over 150 variables with labels and values per level and I was looking into automating the process instead of doing the job manually.

        At the end,for example, if I tabulate the first var
        tab WHO_1
        I would like to get:

        WHO_1 | Freq. Percent Cum.
        ------------+-----------------------------------
        Mand | xx xx xx
        Kvinde | xx xx xx
        ------------+-----------------------------------
        Total | xx xx

        Thank you,

        Nikos

        Comment


        • #5
          Sorry, I am still lost here. You seem to be asking for something unusual when the output in #4 will arise in a simple (and common) way.

          Code:
          clear 
          input WHO_1 
          1
          1
          1
          2
          2
          end 
          
          label def WHO_1 1 Mand 2 Kvinde
          label val WHO_1 WHO_1 
          
          tab WHO_1 
          
                WHO_1 |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                 Mand |          3       60.00       60.00
               Kvinde |          2       40.00      100.00
          ------------+-----------------------------------
                Total |          5      100.00

          Comment


          • #6
            Thank you and sorry for the confusing posting
            The 3 columns in posting #1 are in Excel and I was wondering if somehow I could "attach" on the variable label in col1 the label values from col3.
            I would need what you are showing on the last post but do over all vars in an automated way instead of one at a time
            I hope it is more clear now?
            Thank you

            Comment


            • #7
              it is clearer now, thanks. It wasn't obvious to me that #1 was an abstraction of something in a spreadsheet.

              But sorry, I can't get interested in what you have in Excel. My Excel expertise is also negligible.

              Import it into Stata and show us a data example.

              But it seems to me that you have -- based on what you show us, and we don't know about anything else -- a short job writing a do file based on what you have in #1 or a long job trying to program conversion in generality. Here "short" and "long" depend on how much you know about Stata programming.

              If the value labels are all crunched into single cells in Excel, you will have a hard task getting them out.

              Comment


              • #8
                Normally, you would need to define a value label manually for every variable, and then associate it to your variables. However, you can assign one value label to multiple variables.

                For instance you have multiple yes/no answers:
                Code:
                label define yn 1 "Yes" 2 "No"
                lab val GQ_3b_1 yn
                lab val GQ_3b_2 yn 
                lab val GQ_3b_3 yn
                Etc. etc.

                An alternative method would be to import your data from Excel if you have a version where the value labels are used in the cells instead of the numeric values. That will give you strings. Should these be strictly categorical variables with no particular order, you could then encode the variables. Look at help encode.

                Comment


                • #9
                  Thank you Nick and I was not asking about Excel although it did sound like it. I was referring to the looks of the dataset

                  Below is part of the dataset in the format it will be worked:
                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input int id byte(who_1 who_2 who_3)
                   5 2 1 24
                   8 2 1 28
                  11 2 2 66
                  13 2 2 33
                  19 2 1 53
                  end
                  Below is an example of the dataset that contains the label names and the label values imported in Stata:

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input str5 var str8 label
                  "WHO_1" "1=Mand " 
                  " " "2=Kvinde"
                  "WHO_2" "1=By"    
                  " " "2=Villa" 
                  " " "3-Land"  
                  end
                  In the label dataset I would like to attach the label values from the 2nd column to the labels in the first column, if possible. It looks messy.
                  Then create a label list/file? so that I can attach to the variables in the first dataset on the top. I hope this makes sense.

                  Thank you,

                  Nikos

                  Comment


                  • #10
                    A lot depends on the details. For example, in your initial post, you claimed to have

                    Code:
                    3 = Landdistrikterne
                    but in the example, this turns into

                    Code:
                    3-Land
                    with the equals sign changed to a dash and the spaces gone. Such details are important.

                    Here is a general approach. I assume that the pattern in label is #=text. I also use elabel (SSC) which can define value labels from datesets created by uselabel.

                    Code:
                    // your example
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input str5 var str8 label
                    "WHO_1" "1=Mand " 
                    " " "2=Kvinde"
                    "WHO_2" "1=By"    
                    " " "2=Villa" 
                    " " "3=Land"  
                    end
                    
                    // we start form here
                    list
                    
                    // plan: we will mimic a dataset created by -uselabel-
                    
                    // 1. repeat the label name ...
                    replace var = var[_n-1] if mi(strtrim(var))
                    
                    // ... rename -lname- (match -uselabel-) ...
                    rename var lname
                    
                    // ... make the names match the lowercase variable names ...
                    replace lname = strlower(lname)
                    
                    // 2. now create -value- ...
                    generate value = real(substr(label, 1, strpos(label, "=")-1))
                    
                    // ... and put after -label- (match -uselabel-)
                    order lname value label
                    
                    // 3. strip integer from label
                    replace label = substr(label, strpos(label, "=")+1, .)
                    
                    // 4. add variable -trunc- (match -uselabel-)
                    generate trunc = (strlen(label)>244)
                    
                    // this is what we have now
                    list
                    
                    // save this as temporary file
                    tempfile tmp
                    save "`tmp'"
                    
                    // now load your dataset (here, we create it from -dataex-; you would just load it with -use-)
                    
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input int id byte(who_1 who_2 who_3)
                     5 2 1 24
                     8 2 1 28
                    11 2 2 66
                    13 2 2 33
                    19 2 1 53
                    end
                    
                    // use -elabel- to create the value labels
                    * ssc install elabel
                    elabel load using "`tmp'"
                    
                    // attach the value labels to same named variables
                    label dir
                    elabel values (`r(names)') (`r(names)')
                    
                    // done
                    describe
                    list
                    This results in

                    Code:
                    . // we start form here
                    . list
                    
                         +------------------+
                         |   var      label |
                         |------------------|
                      1. | WHO_1    1=Mand  |
                      2. |         2=Kvinde |
                      3. | WHO_2       1=By |
                      4. |          2=Villa |
                      5. |           3=Land |
                         +------------------+
                    
                    . 
                    . // plan: we will mimic a dataset created by -uselabel-
                    . 
                    (output omitted)
                    . 
                    . // this is what we have now
                    . list
                    
                         +--------------------------------+
                         | lname   value    label   trunc |
                         |--------------------------------|
                      1. | who_1       1    Mand        0 |
                      2. | who_1       2   Kvinde       0 |
                      3. | who_2       1       By       0 |
                      4. | who_2       2    Villa       0 |
                      5. | who_2       3     Land       0 |
                         +--------------------------------+
                    
                    . 
                    . // save this as temporary file
                    . tempfile tmp
                    
                    . save "`tmp'"
                    
                    (output omitted)
                    
                    . // use -elabel- to create the value labels
                    . elabel load using "`tmp'"
                    
                    . 
                    . // attach the value labels to same named variables
                    . label dir
                    who_1
                    who_2
                    
                    . elabel values (`r(names)') (`r(names)')
                    
                    . 
                    . // done
                    . describe
                    
                    Contains data
                      obs:             3                          
                     vars:             4                          
                     size:            27 (99.9% of memory free)
                    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                  storage  display     value
                    variable name   type   format      label      variable label
                    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
                    id              int    %8.0g                  
                    who_1           byte   %8.0g       who_1      
                    who_2           byte   %8.0g       who_2      
                    who_3           byte   %8.0g                  
                    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
                    Sorted by:  
                         Note:  dataset has changed since last saved
                    
                    . list
                    
                         +-----------------------------+
                         | id    who_1   who_2   who_3 |
                         |-----------------------------|
                      1. | 11   Kvinde   Villa      66 |
                      2. | 13   Kvinde   Villa      33 |
                      3. | 19   Kvinde      By      53 |
                         +-----------------------------+
                    Best
                    Daniel

                    Comment


                    • #11
                      Thank you Daniel.
                      I will check your suggestion out.
                      Best wishes,
                      Nikos

                      Comment

                      Working...
                      X