  • Help with recoding

    Hi, I'm fairly new to Stata and need some help with this.

    I'm trying to categorise a string variable into a binary outcome:

    The variable is a procedure code with 80 distinct values, e.g. A112, A113, B113.
    What I intend to do is map them to a binary outcome, e.g. A112, B113 = 0 (not approved procedure); A113, C245, etc. = 1 (approved procedure).

    I have gone through the Stata help and online resources but can't seem to find a way to do so. Could anyone provide some help or direct me to the right resources?

    Many thanks

  • #2
    Well the obvious first question is how do you know which of the procedure codes are approved and which aren't?

    Other obvious question: are all of the codes a single letter followed by 3 digits?

    It would be most helpful if you showed example data, using the -dataex- command so we can be sure exactly how your data are organized. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.
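
    As a rough illustration of those steps (the variable name opcs and the choice of the first 10 observations are assumptions for the example, not something specified in this post):

    Code:
    * only needed if dataex is not already part of the installation
    ssc install dataex
    * read the instructions
    help dataex
    * produce a pasteable example of one variable, first 10 observations
    dataex opcs in 1/10

    The output of the last command can then be copied into a post between code delimiters.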


    • #3
      Thanks for your response Clyde

      Apologies, I was quite vague. To be specific: approved = patients with a urostomy, unapproved = no urostomy. I have a reference book which says which codes mean urostomy and which codes don't.

      And yes, all codes do have a single letter followed by 3 digits

      Here's some example data:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str4 opcs
      "C073"
      "C341"
      "C083"
      "C093"
      "C103"
      "C334"
      "C333"
      "C071"
      "C095"
      "C092"
      end


      • #4

        Because you describe lists of both approved and unapproved procedures, I'm a little unsure, but I'm going to start by assuming that if a code does not appear in the approved list, it must be unapproved. That makes the list of unapproved codes superfluous. If that's wrong, you'll need to explain more.

        Code:
        // Your list will be much longer.
        local approvedlist = "A113 C245"
        gen approved = strpos("`approvedlist'", opcs) > 0
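
        Note that strpos() tests whether opcs occurs anywhere inside the list string, so the approach above relies on every code having the same length and format. A minimal sketch of a variant that compares each code for exact equality instead, assuming the same placeholder list (the variable name urostomy is borrowed from later in the thread, and the two codes are not the real approved list):

        Code:
        // Placeholder list; the real one will be much longer.
        local approvedlist A113 C245
        // Start everyone at 0, then flag exact matches one code at a time.
        gen byte urostomy = 0
        foreach code of local approvedlist {
            replace urostomy = 1 if opcs == "`code'"
        }

        As with the version above, the local and the loop need to be run together in one go (for example from a do-file), since a local macro does not survive between separately run chunks of code.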


        • #5
          Thanks Mike

          I'll start from the beginning as I might have confused everyone

          I was given a dataset to work with, and the variable I'm struggling with is called opcs.
          It's a string variable with a range of values, e.g. the codes C113, C112, etc., in no numerical or alphabetical order.
          What I'm trying to do is define a new variable called urostomy where 0 = no urostomy and 1 = urostomy. I have a reference which tells me which opcs codes represent a urostomy and which codes don't.

          I have tried to do this:

          gen urostomy=opcs
          recode urostomy ("C113 C112"=0) ("C124 C125 C071"=1)

          However, this did not work, so here I am looking for some help.

          Hope this is okay, thanks.


          • #6
            jim templeton, could you explain why the variable approved in #4 is not the urostomy variable you'd like to create?


            • #7
              Hemanshu Kumar Hi, apologies all, I made a slight mistake when I used the code provided by Mike. It works perfectly fine now, thanks a lot.

              Just for my learning, may I know why the method I used initially did not work?


              • #8
                Note that the help file and manual entry for -recode- are quite clear: it applies to numeric variables only (i.e., it cannot be used for string variables); e.g., "recode changes the values of numeric variables according to the rules specified."
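
                To illustrate with the attempt in #5: -gen urostomy=opcs- just makes a string copy of opcs, so -recode- has no numeric values to work on. A minimal sketch of a string-side equivalent, assuming the codes from #5 are only placeholders for the real reference list:

                Code:
                * start with missing, then assign 0/1 by exact string match
                gen byte urostomy = .
                replace urostomy = 0 if inlist(opcs, "C113", "C112")
                replace urostomy = 1 if inlist(opcs, "C124", "C125", "C071")

                Codes that appear in neither list stay missing in this sketch. -inlist()- accepts only a limited number of string arguments (10 in total, according to its help file; worth checking -help inlist- for your version), so a long list needs several such lines or a loop.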


                • #9
                  Rich Goldstein, could you direct me to resources for learning how to recode string variables?


                  • #10
                    What do you mean by "recode string variables"?


                    • #11
                      Like trying to do what I was struggling with:

                      Defining a new variable based on the observations from a string variable and getting the outcome to be binary


                      • #12
                        Originally posted by jim templeton
                        Like trying to do what I was struggling with:

                        Defining a new variable based on the observations from a string variable and getting the outcome to be binary
                        Since the command -recode- is only for numeric data, you cannot force it to take string data. Apart from that command, there are many ways to achieve the goal. Post #4 already showed one. Just in case, here is a Stata code example:

                        Code:
                        clear
                        input str5 y
                        "A123"
                        "A456"
                        "A789"
                        end
                        
                        local approvedlist = "A123 A789"
                        gen approved = strpos("`approvedlist'", y) > 0
                        
                        list
                        Results:

                        Code:
                             +-----------------+
                             |    y   approved |
                             |-----------------|
                          1. | A123          1 |
                          2. | A456          0 |
                          3. | A789          1 |
                             +-----------------+
                        Isn't that "Defining a new variable based on the observations from a string variable and getting the outcome to be binary"?


                        • #13
                          Thanks all
