
  • How to recode data with multiple groups

    Hi Everyone,

    I am fairly new to Stata and I am looking at cancer data. I have histology groupings for lung cancer. The codes in each group are all over the place, so it is hard to use ranges, but I included them where I could. The code works perfectly, but it just seems long and messy. I am looking for a way to make this code more efficient. I apologize if this is an easy fix, and thanks in advance for any help!


    recode HistologicTypeICDO3 (8051/8052 =0) (8070/8076=0) (8078 =0) (8083/8084 =0) (8090 =0) (8094 =0) (8120 =0) (8123 =0) (8002 =1) (8041/8045 =1) (8015 =2) (8050 =2) (8140/8141 =2) (8143/8145 =2) (8147 =2) (8190 =2) (8201 =2) (8211 =2) (8250/8255 =2) (8260 =2) (8290 =2) (8310 =2) (8320 =2) (8323 =2) (8333 =2) (8401 =2) (8440 =2) (8470/8471 =2) (8480/8481 =2) (8490 =2) (8503 =2) (8507 =2) (8550 =2) (8570/8572 =2) (8574 =2) (8576 =2) (8012/8014 =3) (8021 =3) (8034 =3) (8082 =3) (8003/8004 =4) (8022 =4) (8030/8033 =4) (8035 =4) (8200 =4) (8240/8241 =4) (8243/8246 =4) (8249 =4) (8430 =4) (8525 =4) (8560 =4) (8562 =4) (8575 =4) (8000/8001 =5) (8010 =5) (8005 =5) (8011 =5) (8020 =5) (8046 =5) (8095 =5) (8124 =5) (8130 =5) (8146 =5) (8160 =5) (8170 =5) (8230/8231 =5) (8247 =5) (8263 =5) (8312 =5) (8340/8341 =5) (8350 =5) (8370 =5) (8441 =5) (8460 =5) (8500/8501 =5) (8510 =5) (8524 =5) (8530 =5) (8551 =5) (8580/9999 =5) ,gen (Histologygroup)
    label define Histologygroup 0 "Squamous Cell Carcinoma" 1 "Small Cell Carcinoma" 2 "Adenocarcinoma" 3 "Large Cell Carcinoma" 4 "Other Specified Carcinoma" 5 "Unspecified Carcinoma"
    label values Histologygroup Histologygroup

    Calandra

  • #2
    The permissible syntax and layout for -recode- are more versatile than you might think: a single rule can list several values and ranges that all map to the same new value. Note also that physical layout, particularly indentation and line continuations (///), is essential for clear code. One traditional "rule" is to keep lines short enough to read without scrolling to the right, which your code would likely require. None of this, though, affects whether your code is "efficient" for the machine; it only makes it more communicative for the human.
    I'd write what you have something like this:

    Code:
    recode HistologicTypeICDO3 ///
       (8051/8052 8070/8076 8078 8083/8084 8090 8094 8120 8123 = 0) ///
       (8002 8041/8045 =1) ///
       (8015 8050 8140/8141 8143/8145 8147 8190 8201 8211 ///
       8250/8255 8260 8290 8310 8320 8323 8333 = 2) ///
       ...  (all your other stuff), ///
       gen(Histologygroup)
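    Note also that when you are recoding and generating a new variable, you can label the new values "on the fly" by putting the label text, in double quotes, after the new value in each rule. That way you do not need separate -label define- and -label values- commands.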
    Code:
    recode HistologicTypeICDO3  ///
       (8051/8052 8070/8076 8078 8083/8084 8090 8094 8120 8123 = 0 "Squamous Cell Carcinoma") ///
     ...
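
    To see grouped rules and on-the-fly labels working together, here is a minimal sketch you can run on its own. The seven input values are made up for illustration (one code per group, plus one extra squamous code), and each rule lists only a few of the codes from your full scheme, so treat it as a pattern to extend rather than a finished recode.

    ```stata
    * Toy data: a few ICD-O-3 histology codes (illustrative values only,
    * not your real dataset)
    clear
    input int HistologicTypeICDO3
    8052
    8070
    8041
    8140
    8012
    8240
    8000
    end

    * Several values and ranges can share one rule; the quoted text after
    * the new value becomes that value's label. Only a subset of the codes
    * for groups 2 to 5 is shown here.
    recode HistologicTypeICDO3 ///
        (8051/8052 8070/8076 8078 8083/8084 8090 8094 8120 8123 = 0 "Squamous Cell Carcinoma") ///
        (8002 8041/8045 = 1 "Small Cell Carcinoma") ///
        (8140/8141 = 2 "Adenocarcinoma") ///
        (8012/8014 = 3 "Large Cell Carcinoma") ///
        (8240/8241 = 4 "Other Specified Carcinoma") ///
        (8000/8001 = 5 "Unspecified Carcinoma"), ///
        gen(Histologygroup)

    * Cross-tabulate old codes against new groups to check the mapping
    tabulate HistologicTypeICDO3 Histologygroup
    ```

    The cross-tabulation is worth keeping in your real do-file too: any value not covered by a rule is copied unchanged into the new variable, so a stray four-digit code in the Histologygroup column points to a code you forgot to list.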



    • #3
      Thank you, Mike! That helps so much, especially the part about scrolling to the right; I was thinking there had to be a better way! I am exploring all of the Stata resources and YouTube videos to become more familiar with the tool, so I will learn these types of things.

      Thanks again for your help!
