  • access label of value

    Dear all,
    Is there a way in Stata to access the value label of a variable's values in an if-condition?
    For example, suppose I want to count the number of observations where var1 has a value whose value label is "A".

    something like
    count if label(var1) == "A"

    Thanks and best regards,
    Florian

  • #2
    Hello Florian and welcome to the list.

    I don't know of any direct way to do it, or whether one exists, but the following code does what you want:
    Code:
    decode var1, gen(lab_var1)
    count if lab_var1 == "A"
    decode creates a new string variable holding the label text for each observation. Unlike a value label, this is an ordinary Stata variable, so you can use it in an if expression.

    Here is a complete example:

    Code:
    clear
    set obs 5
    gen num_var=_n
    
    label define lbl 1 "one" 2 "two" 3 "three" 4 "four" 5 "five"

    label values num_var lbl

    decode num_var, gen(lab_var)
    count if lab_var == "one"


    Hope this helps,
    Charlie



    • #3
      You can use the dereferencing syntax "<label text>":<value label name>, which evaluates to the numeric value that carries that label.

      If you don't know the name of the set of value labels for a particular variable (or you want to do it programmatically), then you can couple that syntax with an extended macro function (help extended_fcn) that will find it for you.

      Something like the following (at the "Begin here" comment).

      . version 14.2

      . clear *

      . set more off

      . quietly set obs 4

      . generate byte var1 = mod(_n, 2)

      . label define Values 0 A 1 B

      . label values var1 Values

      . *
      . * Begin here
      . *
      . count if var1 == "A":`: value label var1'
        2

      . exit

      end of do-file
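
      The same lookup can also be split into two steps, which may be easier to read inside a longer do-file. This is only a minimal sketch that reuses the var1 and Values names from the example above; the local macro name lblname is an illustrative choice, not anything Stata requires.

      ```stata
      clear
      quietly set obs 4
      generate byte var1 = mod(_n, 2)
      label define Values 0 A 1 B
      label values var1 Values

      * Step 1: ask Stata which value label is attached to var1
      local lblname : value label var1
      display "value label attached to var1: `lblname'"

      * Step 2: dereference the label text "A" through that value label
      count if var1 == "A":`lblname'

      * When the value label name is already known, it can be written directly
      count if var1 == "A":Values
      ```

      Both count commands refer to the same value label here, so they should report the same number of observations.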



      • #4
        Joseph, in a perfect world that would be sufficient. But commonly you are dealing with 1 "small" 2 "small" 3 "small" etc., 97 "large" 98 "large" etc., where the same label text is attached to several numeric codes. Then Florian's question makes perfect sense.
        Charlie's answer would handle this situation, but it requires creating a temporary variable to hold the whole value label text, which will not work well in older versions of Stata, since value labels can be up to 32,000 characters long.
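
        One possible way to handle duplicated label text without decoding into a string variable is to loop over the distinct values and compare each value's label text in a local macro. The following is only a sketch under assumed names: the variable size, the value label sizelbl, and the indicator is_small are all made up for illustration.

        ```stata
        clear
        input byte size
        1
        2
        3
        97
        98
        end
        label define sizelbl 1 "small" 2 "small" 3 "small" 97 "large" 98 "large"
        label values size sizelbl

        * Find the name of the value label attached to size
        local lblname : value label size

        * Flag every observation whose label text is "small",
        * checking one distinct value at a time
        generate byte is_small = 0
        quietly levelsof size, local(levels)
        foreach v of local levels {
            * Label text that `lblname' assigns to the value `v'
            local txt : label `lblname' `v'
            if `"`txt'"' == "small" {
                quietly replace is_small = 1 if size == `v'
            }
        }

        count if is_small
        ```

        This stores only a byte indicator in the data set; the label text itself lives in a local macro while the loop runs.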

        Best, Sergiy



        • #5
          Tangential comment:

          "But commonly you are dealing with 1 "small" 2 "small" 3 "small" etc, 97 "large" 98 "large" etc"
          It's really interesting how we have very different experiences with data. In my entire career, I have never encountered a value label where the same label was used for different numeric codes, except when it turned out to be a mistake (and that usually only involved one label duplicated once.)



          • #6
            Clyde, this stems from some data entry packages allowing labels to be applied to intervals, such as: 0-5 "Baby", 6-14 "Child", 15-60 "Adult", 61-125 "Senior". The resulting data set in Stata has to carry a point label for each integer in the interval (sadly, Stata doesn't allow labeling non-integers), resulting in a massive number of duplicate labels.

            I agree, creating a second variable agegroup with these 4 categories sounds more like a Stata-world solution, but it is a second variable occupying more data space (if anyone still cares), and it can fall out of sync with the original age variable if that one gets edited (that is what I do care about).
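
            For reference, the second-variable approach could look roughly like the sketch below. It assumes a variable named age holding whole years; the names agegroup and agegrp are illustrative. It does nothing about the sync concern: agegroup would have to be regenerated whenever age is edited.

            ```stata
            clear
            input int age
            3
            10
            40
            70
            end

            * Collapse the intervals into four codes and label them in one step;
            * label() names the value label that recode creates
            recode age (0/5 = 1 "Baby") (6/14 = 2 "Child") (15/60 = 3 "Adult") ///
                (61/125 = 4 "Senior"), generate(agegroup) label(agegrp)

            tabulate agegroup

            * Each label text now maps to exactly one code
            count if agegroup == "Adult":agegrp
            ```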

            Other popular choices are to provide skins:
            • Labelset A: 101 beef, 102 pork, 103 veal, ... 201 shark, 202 tuna, ...
            • Labelset B: 100-199 Meats, 200-299 Fish, ...
            With subsequent tabulations based on the "active" labels.

            "Commonly" is probably an exaggeration. I should have written "it is possible in general" or "it happens sometimes".

            Best, Sergiy



            • #7
              Thanks for your comments, very helpful. In my case, I don't have the problem of different integers sharing the same label.
              Best, Florian
