  • Creating a string variable based on values of different variables

    Dear statalisters,
    Let's say I have 5 different variables that can take values ranging from 0 to 10, where 0 means the drug was not taken.
    Var1 Var2 Var3 Var4 Var5
    1 0 3 1 0
    1 1 1 1 0
    0 0 1 1 5
    0 1 1 1 5
    0 0 1 2 4
    0 0 1 3 3
    2 1 2 4 2
    3 1 0 5 1

    Based on the values of these variables, I would like to create a string variable that summarises the information in each row.
    For instance, for the first row this variable will take the value "1*Var1 + 3*Var3 + 1*Var4". For the second row it will be "1*Var1 + 1*Var2 + 1*Var3 + 1*Var4".
    You can think of these variables as drugs and corresponding numeric values as doses that each patient has taken.

    Could you please help me with this?





  • #2
    Erdem:
    why not simply use -rename-?
    Code:
    . set obs 1
    number of observations (_N) was 0, now 1
    
    . g Var_1=1
    
    . g Var_2=1
    
    .  foreach var of varlist Var_* {
      2. rename `var' Dosage_`var'
      3.  }
    
    . list
    
         +---------------------+
         | Dosage~1   Dosage~2 |
         |---------------------|
      1. |        1          1 |
         +---------------------+
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Actually, what I want to do is generate a string that contains all the information in a single variable, because I will use this information later in sequence analysis. So let's assume the first row holds the drugs and doses that a patient has taken in the first month; the second and third rows would then represent the drugs and doses given to that patient in the subsequent months. For illustration purposes, I need to summarize this information in a string variable. Sorry for not being clear earlier; I hope it is clearer now.



      • #4
        Erdem:
        you may want to try:
        Code:
        . g summary_string="1*Var1 + 3*Var3 + 1*Var4" in 1
        
        . list
        
             +--------------------------+
             |           summary_string |
             |--------------------------|
          1. | 1*Var1 + 3*Var3 + 1*Var4 |
             +--------------------------+
        
        .
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          I'm not sure why you want to do this, but the code below does what I think you want:

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input byte(var1 var2 var3 var4 var5)
          1 0 3 1 0
          1 1 1 1 0
          0 0 1 1 5
          0 1 1 1 5
          0 0 1 2 4
          0 0 1 3 3
          2 1 2 4 2
          3 1 0 5 1
          end
          
          gen summary = ""
          forval i = 1/5 {
              * add the " + " separator only when summary already holds an earlier term
              replace summary = summary + cond(summary != "", " + ", "") + string(var`i') + "*var`i'" if var`i' != 0
          }
          
          list, clean
          
                 var1   var2   var3   var4   var5                                      summary  
            1.      1      0      3      1      0                     1*var1 + 3*var3 + 1*var4  
            2.      1      1      1      1      0            1*var1 + 1*var2 + 1*var3 + 1*var4  
            3.      0      0      1      1      5                     1*var3 + 1*var4 + 5*var5  
            4.      0      1      1      1      5            1*var2 + 1*var3 + 1*var4 + 5*var5  
            5.      0      0      1      2      4                     1*var3 + 2*var4 + 4*var5  
            6.      0      0      1      3      3                     1*var3 + 3*var4 + 3*var5  
            7.      2      1      2      4      2   2*var1 + 1*var2 + 2*var3 + 4*var4 + 2*var5  
            8.      3      1      0      5      1            3*var1 + 1*var2 + 5*var4 + 1*var5



          • #6
            Chris beat me to the post, but here is a slightly tighter loop that also doesn't depend on the actual variable names, on the thinking that Var1 through Var5 may have been simplified from more complicated names. It only depends on the variables being stored one after the other in the data, and even if that is not the case, "Var1-Var5" can be replaced with a list of individual variable names.
            Code:
            generate summary = ""
            foreach var of varlist Var1-Var5 {
                quietly replace summary = summary + " + " if `var'!=0 & summary!=""
                quietly replace summary = summary + strofreal(`var') + "*`var'" if `var'!=0
            }
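
            The point about listing individual variable names can be sketched as follows. This is only a minimal illustration: the variable names (aspirin, statin, insulin) and the three rows of data are hypothetical, not taken from the original poster's dataset. The terms appear in the order the variables are named in the loop, not in their storage order.

            ```stata
            * Hypothetical data: names and doses are made up for illustration
            clear
            input byte(aspirin statin insulin)
            1 0 3
            0 2 0
            0 0 0
            end

            generate summary = ""
            * explicit variable list, here deliberately not in storage order
            foreach var of varlist aspirin insulin statin {
                quietly replace summary = summary + " + " if `var' != 0 & summary != ""
                quietly replace summary = summary + strofreal(`var') + "*`var'" if `var' != 0
            }

            * checks on the result; a row with no drugs keeps an empty string
            assert summary == "1*aspirin + 3*insulin" in 1
            assert summary == "2*statin" in 2
            assert summary == "" in 3
            list, clean
            ```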



            • #7
              Dear Chris and William, thank you very much for your replies; both of your solutions achieve what I want. Chris, while this summary is not relevant for the analysis itself, I need to provide the full list of drugs used for treating the patients in the appendix.
              Thanks a lot once more!
