
  • Creating value labels for int-variable x from str-variable y

    I have a dataset consisting of individuals, with information on their education. One variable, 'x', shows the type of education as numbers ranging from 1 to 100. Another variable, 'y', shows the type of education as text.

    Dataexample:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str2 id float x str9 y
    "1" 1 "Carpenter"
    "2" 1 "Carpenter"
    "3" 2 "Mechanic"
    "4" 5 "Astronaut"
    end

    Is there a way where I can define value labels for x using the text value of y?

  • #2
    You can use Nick Cox's labmask for that. In Stata, type search labmask and follow the links to install it. After that you can do something like this (the list commands are only there to show that it works; they are not necessary, and in large datasets they will produce lots and lots of output):

    Code:
    . labmask x, val(y)
    
    . list
    
         +----------------------------+
         | id           x           y |
         |----------------------------|
      1. |  1   Carpenter   Carpenter |
      2. |  2   Carpenter   Carpenter |
      3. |  3    Mechanic    Mechanic |
      4. |  4   Astronaut   Astronaut |
         +----------------------------+
    
    . list, nolabel
    
         +--------------------+
         | id   x           y |
         |--------------------|
      1. |  1   1   Carpenter |
      2. |  2   1   Carpenter |
      3. |  3   2    Mechanic |
      4. |  4   5   Astronaut |
         +--------------------+
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Thanks to Maarten Buis for the mention. My preference as both author and an Editor is that the Stata Journal be regarded as the reference.

      SJ-8-2 gr0034 . . . . . . . . . . Speaking Stata: Between tables and graphs
      (help labmask, seqvar if installed) . . . . . . . . . . . . N. J. Cox
      Q2/08 SJ 8(2):269--289
      outlines techniques for producing table-like graphs



      • #4
        Too bad: my dataset sits on a secured remote desktop, so I cannot access the internet to download user-written programs from my Stata session. This definitely looks like the solution to my problem, though, so I will keep it in mind for another time.



        • #5
          It is not very long. Can you type

          Code:
          ssc type labmask.ado
          and then copy the results accessibly?



          • #6
            Assuming that the data are set up correctly, which labmask checks for you, and stripping all bells and whistles, the core process boils down to two lines:

            Code:
            mata : st_vlmodify("xlbl", st_data(., "x"), st_sdata(., "y"))
            label values x xlbl
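
            For anyone who prefers to stay in plain Stata, a rough equivalent of the two lines above might look like the following sketch. It assumes that x holds integers and that the label name xlbl is free to use; observations with missing x are skipped. Like the Mata version, it does no consistency checking, so a later observation overwrites the label set by an earlier one with the same x.

            ```stata
            * Sketch only: add each (x, y) pair to the value label xlbl.
            * Assumes x is integer-valued; missing values of x are skipped.
            forvalues i = 1/`=_N' {
                if !missing(x[`i']) {
                    label define xlbl `=x[`i']' `"`=y[`i']'"', modify
                }
            }
            label values x xlbl
            label list xlbl
            ```

            The loop runs once per observation, so on large datasets the Mata one-liner is likely to be noticeably faster.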



            • #7
              Originally posted by Nick Cox
              It is not very long. Can you type

              Code:
              ssc type labmask.ado
              and then copy the results accessibly?
              Nick, this is the result:

              Code:
              . ssc type labmask.ado
              host not found
              r(631);



              • #8
                In short, you can't do that. So Daniel klein's suggestion is likely to be more practical for you.

                FWIW, labmask was written for Stata 7 in 2002 -- before Mata was implemented. Daniel's complete message is important: there are checks in labmask. It's an interesting question what that Mata code does if there aren't one-to-one mappings.

                Code:
                bysort x (y) : assert y[1] == y[_N]
                checks for that.
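
                If the assert fails, it only tells you that a contradiction exists somewhere. A small sketch for seeing which values of x are affected follows; the variable name inconsistent is made up for illustration, and the id variable is the one from the example data in #1.

                ```stata
                * Flag every group of x in which y takes more than one distinct value.
                * Note that bysort changes the sort order of the data.
                bysort x (y) : generate byte inconsistent = y[1] != y[_N]
                list id x y if inconsistent, sepby(x)
                drop inconsistent
                ```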



                • #9
                  Originally posted by Nick Cox
                  Daniel's complete message is important: there are checks in labmask.
                  labmask also handles if and in qualifiers, can work with value labels (decode), etc.


                  Originally posted by Nick Cox
                  It's an interesting question what that Mata code does if there aren't one-to-one mappings.
                  The mapping of the same string to different integer values is not a problem. If the same integer is mapped to different strings, Mata's st_vlmodify() keeps the last occurrence; so sort order is relevant.

                  By the way, non-integers are truncated.

                  Edit: Fun fact, you can attach labels to system missing values (.) with st_vlmodify(); those labels are not displayed, though.
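
                  A quick way to see the truncation for yourself is a toy dataset with non-integer values; the numbers below are made up. If the statement above holds, label list should show labels attached to 1 and 2 rather than to 1.7 and 2.2.

                  ```stata
                  * Toy data with non-integer x (hypothetical values)
                  clear
                  input float x str9 y
                  1.7 "Carpenter"
                  2.2 "Mechanic"
                  end

                  * Same core step as in #6; no integer check is made here
                  mata : st_vlmodify("xlbl", st_data(., "x"), st_sdata(., "y"))
                  label list xlbl

                  * On real data, a guard one might run first:
                  * assert x == floor(x) if !missing(x)
                  ```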
                  Last edited by daniel klein; 12 May 2022, 03:47.



                  • #10
                    Otherwise put, st_vlmodify() does not check for inconsistencies. It doesn't claim to, but watch out.

                    Code:
                    * Example generated by -dataex-. For more info, type help dataex
                    clear
                    input str2 id float x str10 y
                    "1" 1 "Carpenter"
                    "2" 1 "Programmer"
                    "3" 2 "Mechanic"
                    "4" 5 "Astronaut"
                    end
                    
                    mata : st_vlmodify("xlbl", st_data(., "x"), st_sdata(., "y"))
                    label values x xlbl
                    
                    label li 
                    
                    list 
                    
                    
                         +------------------------------+
                         | id            x            y |
                         |------------------------------|
                      1. |  1   Programmer    Carpenter |
                      2. |  2   Programmer   Programmer |
                      3. |  3     Mechanic     Mechanic |
                      4. |  4    Astronaut    Astronaut |
                         +------------------------------+



                    • #11
                      In this respect, st_vlmodify() is no different from Stata's label define:

                      Code:
                      . label define foo 42 "foo" 42 "bar"
                      
                      . label list foo
                      foo:
                                42 bar
                      
                      .
