  • get specific digit

    Hello, I'm a new Stata user and would appreciate your help. I have a variable holding each participant's id number, which has 4, 5 or 6 digits. I want to extract only the fourth digit,
    since it records how many times the participant received a drug intervention. The other digits encode region, gender, etc., but I don't need that information. The variable's storage type is double.
    Could you help me, please?

  • #2
    Convert values to string; extract the fourth character, and convert back to number.

    Code:
     
    gen char4 = real(substr(string(id), 4, 1))

    Please follow FAQ Advice. Full real names (given name and family name) are preferred here.


    • #3
      Assuming no need for padding,

      Code:
      tostring x, gen(y)
      gen fourth = substr(y, 4, 1)


      • #4
        I tried to convert the double variable to a string variable and got the message "cannot be converted reversibly", but when I tick the option to force the conversion, ignoring information loss, it runs normally. Is that OK?
        I cannot see how information could be lost.


        • #5
          Consider the following example:

          Code:
          . clear
          
          . input double x
          
                        x
            1. 1234567890
            2. 12345678901
            3. end
          
          . format x %13.0g
          
          .
          . tostring x, gen(xstring) force
          xstring generated as str11
          xstring was forced to string; some loss of information
          
          . list
          
               +---------------------------+
               |           x       xstring |
               |---------------------------|
            1. |  1234567890    1234567890 |
            2. | 12345678901   1.23457e+10 |
               +---------------------------+
          Notice that the string version of x in the second observation is \(1.23457 \times 10^{10} = 12345700000 \neq 12345678901\). Also notice that the 4th character in the string 1.23457e+10 is 3, not 4.

          To solve this you need to specify the format() option such that your string is no longer in exponential format. The default is format(%12.0g), so you could experiment with format(%13.0g), format(%14.0g), ... until you find no strings in exponential format. Alternatively, if your id variable is an integer (which should be the case in order to avoid all kinds of precision problems), you can compute the right format. Continuing the example above:

          Code:
          . // test whether x is an integer
          . assert mod(x,1) == 0 if !missing(x)
          
          .
          . // compute the right format
          . gen temp = log10(abs(x))
          
          . sum temp, meanonly
          
          . di ceil(r(max)) + 2
          13
          
          . drop temp
          
          . // use that format
          . tostring x, gen(xtring2) format(%13.0g)
          xtring2 generated as str11
          
          . list
          
               +-----------------------------------------+
               |           x       xtring2       xstring |
               |-----------------------------------------|
            1. |  1234567890    1234567890    1234567890 |
            2. | 12345678901   12345678901   1.23457e+10 |
               +-----------------------------------------+
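
As a possible shortcut, assuming the ids are non-negative integers with at most 15 digits, the conversion and the extraction can be sketched in one step by passing a fixed (non-exponential) format to string(). The variable name fourth and the width 15 are illustrative choices here, not something taken from the example above.

```stata
clear
input double x
1234567890
12345678901
end

* Illustrative assumption: x holds non-negative integers of at most 15 digits.
assert mod(x, 1) == 0 & x >= 0 if !missing(x)

* %15.0f never switches to exponential notation for values of this size;
* strtrim() guards against any leading blanks before counting characters.
gen fourth = real(substr(strtrim(string(x, "%15.0f")), 4, 1))

* Both example values have 4 as their fourth digit.
assert fourth == 4
list
```

This avoids creating an intermediate string variable, at the cost of hard-coding a format that must be wide enough for the largest id.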
          ---------------------------------
          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz
          Germany
          http://www.maartenbuis.nl
          ---------------------------------


          • #6
            Is it really correct that your double variable does not exceed 999999? tostring won't complain with 6 or fewer digits, nor would the simpler code I suggested fail.

            Code:
            . set obs 1
            obs was 0, now 1
            
            . gen double x = 999999
            
            . tostring x, replace
            x was double now str6
