  • Ways to clean ICD9 codes

    I'm a newbie, and trying to clean up an ICD9 string variable that has some errant codes. An example is 530.809999999999999 which should be 530.81
    I've tried using recode, and this is what I get:

    . recode ICD9CODES1 (530.8099999999999 = 530.81)
    recode only allows numeric variables

    Replace isn't much better:
    . replace ICD9CODES1 = "530.81" if ICD9CODES1 = 530.8099999999999
    type mismatch
    r(109);

    I'm not sure if I first need to (or should) change the ICD9 code variable to a double and then try the recode command, or is there a better way?

  • #2
    look at "help precision"

    more important, since many icd9 codes have non-numeric characters you should have this variable as a string variable from the beginning



    • #3
      Right, ICD9 should be a string for E and V codes, among others. I'm looking at help precision, which talks a lot about the differences between float and double, but I don't see anything about strings.

      I think the icd9 commands will need properly formatted codes to give descriptions, etc.
      Last edited by Steve Epstein; 04 Nov 2015, 10:14.



      • #4
        Precision is a red herring here for string variables.



        • #5
          Perhaps you are already aware of this, but if not: It is also worth looking into -help icd9-, Stata's set of ICD-9 commands (including commands to check the validity of and clean ICD-9 codes).
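As a rough sketch of what that can look like: icd9 check with its generate() option stores a flag for each observation. The two-row toy data set and the new variable name prob are made up for illustration; only ICD9CODES1 comes from this thread.

```stata
clear
input str20 ICD9CODES1
"530.8099999999999"
"530.81"
end
* prob is 0 where icd9 check accepts the code and nonzero where it does not
icd9 check ICD9CODES1, generate(prob)
list ICD9CODES1 prob
```

The flag only locates the problem codes; correcting them is a separate step.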
          Last edited by Brendan Cox; 04 Nov 2015, 10:51.



          • #6
            Thanks, I have looked into -help icd9-, but it doesn't seem to indicate a way to correct the problem I'm having. I used the check function to find the various errors in the codes.



            • #7
              I am confused - recode, which you use in #1 above, is for numeric, not string variables; please show the result of
              Code:
              d ICD9CODES1



              • #8
                How did your ICD codes get into Stata? Did the original source contain string data with all those trailing 9's? If the original data contained valid ICD9 codes, then you should look into returning to that source rather than cleaning up whatever has become of them, as Rich suggests.

                With that said, since you tell us your ICD9CODES1 is a string variable, your replace command was almost correct: the if clause needs to compare the string variable to a string constant.
                Code:
                replace ICD9CODES1 = "530.81" if ICD9CODES1 = "530.8099999999999"



                • #9
                  Appreciate the help!

                  The codes were imported from Excel into Stata, and are in the Excel spreadsheet that way. Maybe round them in Excel first?

                  In any event, I tried the code you suggested - here's the result:

                  . replace ICD9CODES1 = "530.81" if ICD9CODES1 = "530.8099999999999"
                  type mismatch
                  r(109);



                  • #10
                    If the problem is systematic, perhaps something like
                    Code:
                    di string(round(real("530.8099999999999"), 0.00))
                    could help (using replace and the variable name in place of di and the actual string value; this example is for copying into the command line).



                    • #11
                      Not yet ...

                      . replace ICD9CODES1 string(round(real("530.8099999999999"), 0.00))
                      factor variables and time-series operators not allowed
                      r(101);



                      • #12
                        regarding William's suggestion in #8 and your implementation in #9: the second equals sign needs to be a set of 2 equals signs:
                        Code:
                        replace ICD9CODES1 = "530.81" if ICD9CODES1 == "530.8099999999999"
                        Note that I have not attempted to ensure that what is above matches what you already have; I suggest you copy-and-paste into the above from an actual listing.

                        if you have numerous such examples, you should tell us because you don't want to keep doing this
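One way to find out how many such cases there are is to count the values that are too long to be ICD-9 codes. This is only a sketch: the toy data are invented, the variable name toolong is hypothetical, and the cutoff of 6 characters assumes codes are stored with the dot (for example 530.81 or E849.0).

```stata
clear
input str20 ICD9CODES1
"530.8099999999999"
"530.81"
"V10.3"
"E849.0"
"250.00"
end
* flag values longer than the assumed maximum of 6 characters
generate byte toolong = strlen(ICD9CODES1) > 6
count if toolong
list ICD9CODES1 if toolong
```

If the count is more than a handful, a general fix is preferable to one replace per bad value.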



                        • #13
                          The problem in #11 is presumably that no equals sign is supplied for assignment.



                          • #14
                            Thanks, Rich, for catching my oversight at #8.

                            At #11, the replace command needs an equal sign after the variable name, and (based on my oversight in #8) perhaps something else that I have missed.




                            • #15
                              Unfortunately, there are more problems with

                              Code:
                              string(round(real("530.8099999999999"), 0.00))
                              Even in principle, rounding to 2 d.p. requires a second argument of 0.01, not 0.00, but that can't be relied on because of precision problems: 0.01 and most of its multiples have no exact binary representation.

                              Code:
                              display %3.2f  530.8099999999999
                              treats getting 2 d.p. as a string manipulation problem, which is guaranteed to work, although for an individual case like this it is easier to write down 530.81. In a program you should use string() with a format argument for general manipulations.
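A minimal sketch of the string() approach applied to a whole variable follows. The toy data are invented, and the rule for picking which values to change (longer than 6 characters and readable as a number) is an assumption, not something established in this thread. Note also that forcing two decimals is an assumption: a corrupted code that should have one decimal would come out with a trailing zero, so the results still need checking.

```stata
clear
input str20 ICD9CODES1
"530.8099999999999"
"530.81"
"V10.3"
end
* rewrite only values that are too long and that real() can read as numbers
replace ICD9CODES1 = string(real(ICD9CODES1), "%9.2f") if strlen(ICD9CODES1) > 6 & !missing(real(ICD9CODES1))
list
```

Codes such as V10.3 are left alone because real() returns missing for them.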

                              Rich's code in #12 looks fine too.
                              Last edited by Nick Cox; 04 Nov 2015, 14:45.
