  • Label data

    Hi All,

    I have a csv file which contains the following data:
    Identifiant   Education   Gender
    11112         1           0
    11113         2           1
    11114         3           1
    11115         3           1
    11116         4           0
    The file is accompanied by this codebook:
    Variable    Values   Label
    Education   1        Primary
    Education   2        Junior School
    Education   3        High School
    Education   4        College / or University
    Gender      0        Male
    Gender      1        Female
    I would like to label data in the first table using values from the second table in Stata 13. I have 2K variables in my data and I would like to write a do-file that will help me to do all of that at once.

    I will be grateful for any comments and suggestions.

    Best regards,
    Antoine

  • #2

    Welcome to the Stata Forum / Statalist,

    Regarding the do-file, you can do this with a loop over a selected sequence of variables. For example, if we know that all variables from var1 to var1000 are binary, we can specify var1-var1000 as the varlist and label all of them at once.
    Best regards,

    Marcos
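
The loop idea in #2 can be sketched as follows. This is only a minimal illustration under stated assumptions: the variables var1-var3, their values, and the value label name yesno are hypothetical and do not come from the thread.

```stata
// Hypothetical example data: three binary variables named var1-var3
clear
set obs 4
generate byte var1 = mod(_n, 2)
generate byte var2 = _n > 2
generate byte var3 = 1 - var1

// Define the value label once (the name yesno is an assumption)
label define yesno 0 "No" 1 "Yes"

// Attach it to every variable in the range with a loop
foreach v of varlist var1-var3 {
    label values `v' yesno
}

// Equivalent without a loop: -label values- accepts a varlist
label values var1-var3 yesno

describe
```

This only helps when a whole range of variables shares one set of value labels; it does not read the labels from a codebook file.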


    • #3
      Thank you, Marcos, for your reply. I am not sure I understand your suggestion. My data set contains 2K variables and I would like to label the values of each variable. Does your suggestion mean that I have to do this manually?


      • #4
        I assume that the "codebook" is just another csv-file. Here is an approach using elabel (SSC).

        Code:
        // get the label dataset
        clear
            // add options as appropriate for your csv-file
        import delimited using csv_filename2
        
        // this is how the label dataset looks like
        list
        
        // I will use -elabel- below
        *ssc install elabel
        
        // change the dataset to mimic a dataset created by -uselabel-
        rename (variable values)(lname value)
        generate trunc = 0
            // this is how it looks like
        list
        
        // now save this file (using a temporary filename)
        tempfile tmp
        save "`tmp'"
        
        // define value labels from the tempfile
        elabel load using "`tmp'"
        
        // now we have the labels defined
        label list
        
        // we will save the value label definitions
        label save using "`tmp'" , replace
        
        // now we get the data (the other csv-file)
        clear
            // add options as appropriate for your csv-file
        import delimited using csv_filename1 , case(preserve)
        
        // this is the unlabeled dataset
        list
        
        // now we bring in the labels
        do "`tmp'"
        
        // and attach the labels to same-named variables
        elabel unab lblnames : *
        elabel values (`lblnames')(`lblnames')
        
        // this is the labeled dataset
        list
        Here is the (selected) output

        Code:
        (output omitted)
        
        . // this is how the label dataset looks like
        . list
        
             +----------------------------------------------+
             |  variable   values                     label |
             |----------------------------------------------|
          1. | Education        1                   Primary |
          2. | Education        2             Junior School |
          3. | Education        3               High School |
          4. | Education        4   College / or University |
          5. |    Gender        0                      Male |
             |----------------------------------------------|
          6. |    Gender        1                    Female |
             +----------------------------------------------+
        
        (output omitted)
        
        . // change the dataset to mimic a dataset created by -uselabel-
        . rename (variable values)(lname value)
        
        . generate trunc = 0
        
        .     // this is how it looks like
        . list
        
             +-----------------------------------------------------+
             |     lname   value                     label   trunc |
             |-----------------------------------------------------|
          1. | Education       1                   Primary       0 |
          2. | Education       2             Junior School       0 |
          3. | Education       3               High School       0 |
          4. | Education       4   College / or University       0 |
          5. |    Gender       0                      Male       0 |
             |-----------------------------------------------------|
          6. |    Gender       1                    Female       0 |
             +-----------------------------------------------------+
        
        (output omitted)
        
        . // define value labels from the tempfile
        . elabel load using "`tmp'"
        
        .
        . // now we have the labels defined
        . label list
        Education:
                   1 Primary
                   2 Junior School
                   3 High School
                   4 College / or University
        Gender:
                   0 Male
                   1 Female
        
        .
        . // we will save the value label definitions
        . label save using "`tmp'" , replace
        
        . // now we get the data (the other csv-file)
        . clear
        
        .     // add options as appropriate for your csv-file
        . import delimited using csv_filename1 , case(preserve)
        (3 vars, 5 obs)
        
        .
        . // this is the unlabeled dataset
        . list
        
             +------------------------------+
             | dentif~t   Educat~n   Gender |
             |------------------------------|
          1. |    11112          1        0 |
          2. |    11113          2        1 |
          3. |    11114          3        1 |
          4. |    11115          3        1 |
          5. |    11116          4        0 |
             +------------------------------+
        
        .
        . // now we bring in the labels
        . do "`tmp'"
        
        . label define Education 1 `"Primary"', modify
        
        . label define Education 2 `"Junior School"', modify
        
        . label define Education 3 `"High School"', modify
        
        . label define Education 4 `"College / or University"', modify
        
        . label define Gender 0 `"Male"', modify
        
        . label define Gender 1 `"Female"', modify
        
        .
        end of do-file
        
        .
        . // and attach the labels to same-named variables
        . elabel unab lblnames : *
        
        . elabel values (`lblnames')(`lblnames')
        
        .
        . // this is the labeled dataset
        . list
        
             +---------------------------------------------+
             | dentif~t                 Education   Gender |
             |---------------------------------------------|
          1. |    11112                   Primary     Male |
          2. |    11113             Junior School   Female |
          3. |    11114               High School   Female |
          4. |    11115               High School   Female |
          5. |    11116   College / or University     Male |
             +---------------------------------------------+
        I hope this helps.

        Best
        Daniel
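
For readers who cannot install elabel, the same steps can probably be done with built-in commands alone. The following is a rough sketch, not Daniel's method: it types the example data in with -input- instead of importing the csv files, and the tempfile name labdefs is an assumption.

```stata
// Stand-in for the imported codebook (column names as in the thread)
clear
input str20 variable byte values str30 label
"Education" 1 "Primary"
"Education" 2 "Junior School"
"Education" 3 "High School"
"Education" 4 "College / or University"
"Gender"    0 "Male"
"Gender"    1 "Female"
end

// One -label define- per codebook row; -modify- adds to existing labels
forvalues i = 1/`=_N' {
    local name = variable[`i']
    local val  = values[`i']
    local txt  = label[`i']
    label define `name' `val' `"`txt'"', modify
}

// -clear- drops value labels too, so save the definitions first
tempfile labdefs
label save using "`labdefs'", replace

// Stand-in for the imported data file
clear
input long Identifiant byte Education byte Gender
11112 1 0
11113 2 1
11114 3 1
11115 3 1
11116 4 0
end

// Bring the label definitions back
do "`labdefs'"

// Attach each value label to the same-named variable, if one exists
label dir
foreach lbl in `r(names)' {
    capture confirm variable `lbl', exact
    if !_rc label values `lbl' `lbl'
}

list
```

Looping over observations is slower than elabel for a large codebook, but it avoids the SSC dependency.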


        • #5
          Thank you so much Daniel. Your code is PERFECT!


          • #6
            Thank you again, Daniel. Now I want to label the variables, using the following Excel file and the previously labelled data:
            Variable name   Value label   Variable label
            Education       Education     Highest level of education
            Gender          Gender        Student's gender
            I tried to use the same code as you previously suggested to label the data but it does not work.

            Again, I will be grateful for any comments and suggestions.

            Best,
            Antoine


            • #7
              Answered here.

              Best
              Daniel
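
The linked answer is not reproduced in this thread. As a rough sketch of one possible approach to the question in #6 (not necessarily the one given there), the lookup rows can be copied into local macros and then applied with -label variable-. The column names varname and varlab are hypothetical; with a real file they would depend on how -import excel- names the columns.

```stata
// Stand-in for the Excel lookup; with a real file one would use
// -import excel using ..., firstrow- and adjust the column names
clear
input str20 varname str40 varlab
"Education" "Highest level of education"
"Gender"    "Student's gender"
end

// Copy each row into local macros so they survive loading the data
local n = _N
forvalues i = 1/`n' {
    local var`i' = varname[`i']
    local lab`i' = varlab[`i']
}

// Stand-in for the (already value-labelled) data
clear
input long Identifiant byte Education byte Gender
11112 1 0
11113 2 1
end

// Apply the variable labels, skipping names not found in the data
forvalues i = 1/`n' {
    capture confirm variable `var`i'', exact
    if !_rc label variable `var`i'' `"`lab`i''"'
}

describe
```

The local macros exist only within the do-file run that defines them, so the whole block has to be executed in one go.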
