  • label values from one variable to another

    Dear All,

    I have a small problem. I have been trying to do something for a while but have not solved it. I want to assign the values of one variable as value labels of another. For instance, there is a float variable XYZ with values 0.01, 0.03, 0.05, etc., and another variable with values 1, 2, 3, etc., one for each respective value of XYZ. I want to label values 1 and 2 as 0.01 and 0.03 respectively (for example).

    I tried the command labmask, but it didn't work well, and I cannot understand why. For instance, it should use the label 0.06 but displays .0599999986588955 instead.
    The code I ran was :

    Code:
    egen unemp_cat = cut(unemp_current), at(0.03[0.01]0.18)
    egen unemp_cat1 = cut(unemp_current), at(0.03[0.01]0.18) icodes
    labmask unemp_cat1, values(unemp_cat)

    [Attached screenshot: Problem.png]

    Thank you,
    Pranav
    Last edited by Pranav Garg; 16 Aug 2017, 03:39.

  • #2
    Please review the Statalist FAQ

    https://www.statalist.org/forums/help#stata

    which is explicit that screenshots are not as helpful as you hope and that you should explain where user-written programs come from.

    labmask (from the Stata Journal) did what you asked; the problem is that that is not what you want.

    Most numbers like 0.05 that are decimal multiples of 0.01 cannot be held exactly in binary. In fact, only ..., 0.00, 0.25, 0.50, 0.75, ... can be so held. This is one of the most common problems discussed on this list -- and the explanations are numerous, starting with sources found by

    Code:
    search precision
    -- but many bitten by it are surprised.
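
    As a minimal sketch of the point about binary storage (an illustration using only official Stata functions, not code from the original reply), the following shows 0.05 as stored in a float and in a double, and checks which values survive float storage exactly:

    ```stata
    * 0.05 as held in a float, shown to 18 decimal places
    display %21.18f float(0.05)

    * 0.05 as held in a double: closer, but still not exact
    display %21.18f 0.05

    * 0.25 is a power of 2, so float storage is exact: displays 1
    display float(0.25) == 0.25

    * 0.05 has no finite binary expansion: displays 0
    display float(0.05) == 0.05
    ```

    The comparisons work because a literal such as 0.05 is evaluated in double precision, while float() rounds it to float precision first.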

    The Statalist FAQ also links to http://xyproblem.info/ which advises asking about your real problem, not about problems with solutions that didn't work. Here it seems that you want bins of width 0.01 defined by their lower limits. Here is some technique towards that end.


    Code:
    . clear
    
    . set obs 10
    number of observations (_N) was 0, now 10
    
    . set seed 2803
    
    . gen have = 0.05 + 0.1 * runiform()
    
    . list
    
         +----------+
         |     have |
         |----------|
      1. | .1424379 |
      2. | .0832634 |
      3. | .1273969 |
      4. |  .060408 |
      5. | .0838393 |
         |----------|
      6. | .0520022 |
      7. | .0679559 |
      8. | .1126451 |
      9. | .0898043 |
     10. | .0887058 |
         +----------+
    
    . gen want = string(floor(100 * have) / 100, "%03.2f")
    
    . sort have
    
    . l
    
         +-----------------+
         |     have   want |
         |-----------------|
      1. | .0520022   0.05 |
      2. |  .060408   0.06 |
      3. | .0679559   0.06 |
      4. | .0832634   0.08 |
      5. | .0838393   0.08 |
         |-----------------|
      6. | .0887058   0.08 |
      7. | .0898043   0.08 |
      8. | .1126451   0.11 |
      9. | .1273969   0.12 |
     10. | .1424379   0.14 |
         +-----------------+
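
    If a numeric variable with value labels is wanted rather than a string, one possible follow-up is the official command encode. This is a sketch under assumptions, not part of the original reply: the variable name want_cat is hypothetical, and the data are regenerated here only so that the example is self-contained.

    ```stata
    * recreate the example data from the listing above
    clear
    set obs 10
    set seed 2803
    gen have = 0.05 + 0.1 * runiform()
    gen want = string(floor(100 * have) / 100, "%03.2f")

    * want_cat is a hypothetical name: integers 1, 2, ... with the
    * strings in want attached as value labels
    encode want, generate(want_cat)

    * inspect the mapping from integer codes to labels
    label list want_cat
    tabulate want_cat
    tabulate want_cat, nolabel
    ```

    encode numbers the categories in alphabetical order of the strings. That agrees with numeric order here only because every string has the same fixed format, such as 0.05 or 0.14.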



    • #3
      Thank you. The new code worked perfectly for my problem.

      1. I completely understand what you mean by http://xyproblem.info/; it was a simple piece of logic that escaped me: I should describe my original requirement first, and then the method I tried. (My problem was indeed not with labmask.)

      2. I do not completely understand the issue with 0.01 under labmask. I shall read the further documentation found by search precision.

      Thank you,
      Pranav



      • #4
        You're asking for the numeric values of one variable to become the value labels of another variable. They won't be filtered by any display format.

        Here is your problem in a self-contained example.

        Code:
        . clear
        
        . set obs 1 
        
        . gen x = 0.05
        
        . gen y = 1
        
        . labmask y, values(x)
        
        . l
        
             +-------------------------+
             |   x                   y |
             |-------------------------|
          1. | .05   .0500000007450581 |
             +-------------------------+
        With a float variable, the closest Stata can get to 0.05 is the binary equivalent of the number displayed. You can get closer with a double but you won't solve the problem. If you want to "see" values like 0.05 you must push them through a display format which yields strings, not numeric values.
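
        A minimal sketch of both remarks, extending the example above (the names xd and xs are hypothetical, and labmask must be installed because it is user-written, as noted earlier in the thread):

        ```stata
        clear
        set obs 1
        gen x = 0.05
        gen y = 1

        * a double gets closer to 0.05 than a float, but is still not exact
        gen double xd = 0.05
        format x xd %21.18f
        list x xd

        * push x through a display format: the result is the string "0.05"
        gen xs = string(x, "%4.2f")

        * use the string, not the float, as the source of value labels
        labmask y, values(xs)
        list y xs
        ```

        Because xs is a string, the label attached to y is exactly the text produced by the display format, with no stray digits.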



        • #5
          I see that even in my data 0.05 was read as .0500000007450581.
          So the value .0500000007450581 was the binary equivalent of 0.05 according to Stata?

          Thus, if I understand correctly:
          labmask can only work well for values that Stata can read well (I don't know if what I understand makes sense)? It can work on 5 (or on 0.5?) but not on 0.05?


          Pranav
          Last edited by Pranav Garg; 16 Aug 2017, 07:53.



          • #6
            I don't see here a different question. Like any program, labmask does what you ask to the best of its ability. It's not a case of working well (or badly) if you ask something that you later realise you don't want or if you don't understand how the program works.

            The question for the user (you) is: What value labels do I want to be visible?

            labmask is mute on whether your choice is good.

            labmask pays no attention to whether values are "suitable" value labels. It has no inner sense of aesthetics or document design.

            As already pointed out, you won't get nice or rounded numeric values unless they are already defined. labmask doesn't refine your choices.

            We're back to the X-Y problem as I still lack a sense of why you want to do this.

            To understand precision, you need more rigorous distinctions between decimal representations that you see, binary representations that Stata uses, and string representations. Thus consider your statement

            0.05 was read as .0500000007450581.
            Stata did read at some point "0.05" or something similar as characters, meaning decimal digits, signalling what you want. But it can't hold such a value exactly as a stored numeric value. What it can do is hold as binary a number that when shown converted back to a decimal is (very close to) what you see as the more complicated decimal above.
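
            The three representations can be inspected side by side. The following is a sketch using only official Stata commands; the variable names five, half and x are hypothetical. It also checks the earlier question about 5, 0.5 and 0.05.

            ```stata
            clear
            set obs 1
            gen five = 5
            gen half = 0.5
            gen x = 0.05

            * 5 and 0.5 can be held exactly in binary, so these assertions pass
            assert five == 5
            assert half == 0.5

            * the float x differs from the double-precision literal 0.05 ...
            assert x != 0.05
            * ... but equals 0.05 rounded to float precision
            assert x == float(0.05)

            * binary view: hexadecimal format of the float and of the double
            display %21x x[1]
            display %21x 0.05

            * string view: what a display format yields
            display string(x[1], "%4.2f")
            ```

            A variable created by generate is a float by default, which is why x needs float() for an exact comparison.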
            Last edited by Nick Cox; 16 Aug 2017, 08:24.



            • #7
              labmask pays no attention to whether values are "suitable" value labels. It has no inner sense of aesthetics or document design.

              [...]

              What it can do is hold as binary a number that when shown converted back to a decimal is (very close to) what you see as the more complicated decimal above.
              I think this explains well the process behind what happened to me.

              Thank you for the code and your detailed comments. They are very helpful to a beginner who is still learning the logic behind the ways in which the software operates differently from what I intended, through no fault of its own.
