Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Combining several string variables if it's the value is not empty

    Dear all,

    I have 4 string variables= medicine1, medicine2, medicine3, and medicine4. Each of them contains a name of a medicine (string). I would like to create a new variable called medicine_all which combines the name of the medicines from the 4 variables. I used:
    egen medicine_all = concat (medicine1-medicine4), decode p(",")

    In general, it gives the result that I want. But it also gives me a value of ",,,," when all the medicines are empty. Is there any way to ask Stata to return me "blank" or "missing" if all the medicines variables are missing?

    Thank you in advance.

    Vini

  • #2

    Code:
    gen medicine_all = medicine1
    
    forval j = 2/4 {
        replace medicine_all = cond(medicine_all == "", medicine_`j', "," + medicine_`j') if medicine_`j' != ""
    }

    Comment


    • #3
      Hi Nick,

      Thanks a lot for your response.

      Small question,
      The actual names of the variables are
      med_i_first
      med_ii_first
      med_iii_first, up to 12
      med_i_nonncd_first
      med_ii_nonncd_first
      med_iii_nonncd_first, up to 10

      I tried to adapt to syntax into:
      forval j = 2/12 {
      replace medicine_all = cond(medicine_all == "", medicine_`j'_first, "," + medicine_`j'_first) if medicine_`j'_first != ""
      }

      And it didn't work. Could you please tell me how should I adapt the syntax into the variable name?

      Comment


      • #4
        The problem is you have roman numerals where arabic are expected. For example the code is looking for med_2_first, not med_i_first.

        Comment


        • #5
          Assuming that the infixes run i ii iii iv v vi vii viii ix x xi xii you could do this:

          Code:
          tokenize "i ii iii iv v vi vii viii ix x xi xii"
          gen medicine_all = med_`1'_first  
          forval j = 2/12 {    
               replace medicine_all = cond(medicine_all == "", med_``j''_first, "," + med_``j''_first) if med_``j''_first != ""
          }
          Alternatively,

          Code:
          gen medicine_all = med_i_first  
          foreach j in ii iii iv v vi vii viii ix x xi xii {
               replace medicine_all = cond(medicine_all == "", med_`j'_first, "," + med_`j'_first) if med_`j'_first != ""
          }


          Last edited by Nick Cox; 27 Aug 2018, 17:48.

          Comment


          • #6
            Thank you Dave and Nick.

            So, I tried this code:
            gen medicine_all = med_i_first foreach j in ii iii iv v vi vii viii ix x xi xii { replace medicine_all = cond(medicine_all == "", med_`j'_first, "," + med_`j'_first) if med_`j'_first != "" } It worked, but it returned all blank.

            So, I rename the variables (changing i,ii,etc into 1,2, etc), and ran this code:

            gen medicine_adm = med_1_first
            forval j = 2/12 {
            replace medicine_adm = cond(medicine_adm == "", med_`j'_first, "," + med_`j'_first) if med_`j'_first != ""
            }

            It worked, but again, it returned all blank.

            I changed the type of the variables into string (as some of the empty variables are not stored as string originally). Could this be the cause?

            Also, how can I add the med_1_nonncd_first, (as this is a different name from "med_1_first")?

            Thanks for your help..

            Comment


            • #7
              I can't add usefully to this without seeing a data example. Please re-read the FAQ Advice (and also use CODE delimiters).

              Comment


              • #8
                Hi Nick, so sorry for the confusion.. I'm trying to attach the .dta file, but I haven't found the way to do it.. I used the paper clip icon, but it said "This is not a valid image file. medicine.dta". I search in the help, they mentioned about sending a google drive link or change the extension of the file in to .jpg. I chose the second option. And it gave me "Image upload failed because the extension of this file did not match the content. medication list.jpg". Could you please tell me the right way to attach a .dta file?

                Thank you in advance.

                Comment


                • #9
                  https://www.statalist.org/forums/help#stata applies (and explains why we don't want to see .dta attachments).

                  Comment


                  • #10
                    Thanks, Nick!!

                    Here you go..
                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input str10 patientid str30(med_1_first med_2_first med_3_first med_4_first med_5_first) str1(med_6_first med_7_first) str27 med_1_nonncd_first str1 med_2_nonncd_first
                    "GAT146/18" "ENALAPRIL maleate"       "ACETYLSALICYLIC acid (aspirin)" "ATORVASTATINE "                 ""                               ""                  "." "." ""                            "."
                    "GAT147/18" "ENALAPRIL maleate"       ""                               ""                               ""                               ""                  "." "." ""                            "."
                    "GAT148/18" "GLIBENCLAMIDE"           "HYDROCHLOROTHIAZIDE"            ""                               ""                               ""                  "." "." "AMOXICILLIN/CLAVULANIC acid" "."
                    "GAT149/18" "SALBUTAMOL sulfate"      "BECLOMETASONE dipropionate"     ""                               ""                               ""                  "." "." ""                            "."
                    "GAT150/18" "METFORMIN hydrochloride" "GLIBENCLAMIDE"                  "ENALAPRIL maleate"              ""                               ""                  "." "." ""                            "."
                    "GAT151/18" "HYDROCHLOROTHIAZIDE"     ""                               ""                               ""                               ""                  "." "." ""                            "."
                    "GAT152/18" "METFORMIN hydrochloride" "GLIBENCLAMIDE"                  "ATORVASTATINE "                 "ACETYLSALICYLIC acid (aspirin)" "ENALAPRIL maleate" "." "." ""                            "."
                    "GAT153/18" "SALBUTAMOL sulfate"      "BECLOMETASONE dipropionate"     "METFORMIN hydrochloride"        "GLIBENCLAMIDE"                  ""                  "." "." ""                            "."
                    "GAT154/18" "METFORMIN hydrochloride" "ATORVASTATINE "                 "ENALAPRIL maleate"              "GLICLAZIDE "                    ""                  "." "." ""                            "."
                    "GAT155/18" "GLIBENCLAMIDE"           "METFORMIN hydrochloride"        "ATORVASTATINE "                 ""                               ""                  "." "." ""                            "."
                    "GAT156/18" "HYDROCHLOROTHIAZIDE"     ""                               ""                               ""                               ""                  "." "." ""                            "."
                    "GAT157/18" "HYDROCHLOROTHIAZIDE"     ""                               ""                               ""                               ""                  "." "." ""                            "."
                    "GAT158/18" "ENALAPRIL maleate"       "METFORMIN hydrochloride"        "ATORVASTATINE "                 ""                               ""                  "." "." ""                            "."
                    "GAT159/18" "CARBAMAZEPINE_CNCD_CNCD" "SODIUM VALPROATE"               ""                               ""                               ""                  "." "." ""                            "."
                    "GAT160/18" "METFORMIN hydrochloride" "GLIBENCLAMIDE"                  "ENALAPRIL maleate"              ""                               ""                  "." "." ""                            "."
                    "GAT162/18" "ENALAPRIL maleate"       "ACETYLSALICYLIC acid (aspirin)" "ATORVASTATINE "                 ""                               ""                  "." "." ""                            "."
                    "GAT163/18" "HYDROCHLOROTHIAZIDE"     "ATORVASTATINE "                 "ACETYLSALICYLIC acid (aspirin)" "ATENOLOL"                       ""                  "." "." ""                            "."
                    "GAT164/18" "METFORMIN hydrochloride" "GLIBENCLAMIDE"                  "ATORVASTATINE "                 "ACETYLSALICYLIC acid (aspirin)" ""                  "." "." ""                            "."
                    "GAT165/18" "HYDROCHLOROTHIAZIDE"     "AMLODIPINE"                     "ATENOLOL"                       ""                               ""                  "." "." ""                            "."
                    "GAT166/18" "PHENOBARBITAL sodium"    "CARBAMAZEPINE_CNCD_CNCD"        "SODIUM VALPROATE"               ""                               ""                  "." "." ""                            "."
                    end
                    Hope it works..

                    Comment


                    • #11
                      Thanks for the example. With an example in front of me, I see a silly bug of mine in #2 and #5. Sorry about that. Also, you have here instances of "." which should presumably be ignored.


                      Code:
                      gen med_all = med_1_first
                      
                      forval j = 2/7 {
                          replace med_all = cond(med_all == "", med_`j'_first, med_all + "," + med_`j'_first) if !inlist(med_`j'_first, "", ".")
                      }
                      Last edited by Nick Cox; 28 Aug 2018, 06:49.

                      Comment


                      • #12
                        Hi Nick,

                        It works very well!! And I also learn where the mistake was.. Thanks a lot for this!

                        Now, I still need to also add the medicines under the "med_j_nonncd_first".

                        I tried:
                        Code:
                        gen med_adm = med_1_first 
                        forval j = 2/12 { 
                            replace med_adm = cond(med_adm == "", med_`j'_first + med_`j'_nonncd_first, med_adm + "," + med_`j'_first + med_`j'_nonncd_first) if !inlist(med_`j'_first, "", ".") | !inlist(med_`j'_nonncd_first,"",".")
                        }
                        But it gave me error "med_6_nonncd_first not found".

                        Could you please help me how to add the values from med_j_nonncd_first into the combined one?

                        Thanks!

                        Comment


                        • #13
                          So, don't try to loop over variables you don't have. To get better advice we need to know what variables you do have.

                          Code:
                          ds med*first
                          Also, what order do you want them added? I have to guess to suggest

                          Code:
                          gen med_adm = med_1_first 
                          forval j = 2/12 { 
                              replace med_adm = cond(med_adm == "", med_`j'_first, med_adm + "," + med_`j'_first) if !inlist(med_`j'_first, "", ".") 
                              capture replace med_adm = cond(med_adm == "", med_`j'_nonncd_first, med_adm + "," + med_`j'_nonncd_first) if !inlist(med_`j'_nonncd_first, "", ".") 
                          }

                          Comment


                          • #14
                            So, don't try to loop over variables you don't have. To get better advice we need to know what variables you do have.
                            Ah, OK.

                            So, I have med_j_first from 1 to 12 and med_j_nonncd_first from 1 to 5. I tried to include all when I sent the dataex earlier, but it gives me "input statement exceeds line size limit. Try specifying fewer variables".

                            Anyway, for each patient, we need to check the values from med_j_first and then med_j_nonncd_first. A patient shouldn't have a med_1_nonncd_first value without having med_1_first value. So, I tweak the code into:
                            Code:
                            gen med_adm = med_1_first 
                            forval j = 2/12 { 
                                replace med_adm = cond(med_adm == "", med_`j'_first, med_adm + "," + med_`j'_first) if !inlist(med_`j'_first, "", ".") 
                                capture replace med_adm = cond(med_adm != "", med_adm + "," + med_`j'_nonncd_first,) if !inlist(med_`j'_nonncd_first, "", ".") 
                            }
                            But it still doesn't take the value from the med_j_nonncd_first. I'm thinking, could it be because we put the j as 2/12, while the medicines are under "med_1_nonncd_first"?

                            I tried by changing it into:
                            Code:
                            gen med_adm = med_1_first 
                            forval j = 1/12 { 
                                replace med_adm = cond(med_adm == "", med_`j'_first, med_adm + "," + med_`j'_first) if !inlist(med_`j'_first, "", ".") 
                                capture replace med_adm = cond(med_adm != "", med_adm + "," + med_`j'_nonncd_first,) if !inlist(med_`j'_nonncd_first, "", ".") 
                            }
                            But it duplicated the values of "med_j_first", and still not picking up the values from "med_j_nonncd_first"

                            Ummm, I hope I am not confusing you even more, Nick..

                            Comment


                            • #15
                              Indeed. My code misses out med_1_nonncd_first, but your fix to the capture replace line isn't legal or correct. Here is another go.

                              Code:
                              gen med_adm = "" 
                              forval j = 1/12 { 
                                  replace med_adm = cond(med_adm == "", med_`j'_first, med_adm + "," + med_`j'_first) if !inlist(med_`j'_first, "", ".") 
                                  capture replace med_adm = cond(med_adm == "", med_`j'_nonncd_first, med_adm + "," + med_`j'_nonncd_first) if !inlist(med_`j'_nonncd_first, "", ".") 
                              }

                              Comment

                              Working...
                              X