  • Subtracting by rows

    Dear all,

    The problem might be very simple for many of you, but not for me! My data look like the listing below. I need to create a new variable holding the change in 'kimsob' between the "Baseline" and "Follow up" levels of 'pre'. Essentially it will be a difference score [follow up 50 - baseline 43 = 7]:

    Code:
          id         pre     treat   kimsob  
     40.    1    Baseline   Control       43  
     41.    1    Baseline   Control       43  
     42.    1    Baseline   Control       43  
     43.    1    Baseline   Control       43  
     44.    1    Baseline   Control       43  
     45.    1    Baseline   Control       43  
     46.    1    Baseline   Control       43  
     47.    1    Baseline   Control       43  
     48.    1    Baseline   Control       43  
     49.    1   Follow up   Control       50  
     50.    1   Follow up   Control       50  
     51.    1   Follow up   Control       50  
     52.    1   Follow up   Control       50  
     53.    1   Follow up   Control       50  
     54.    1   Follow up   Control       50

    I always appreciate your help. Many thanks.

    All the best,
    Last edited by Roman Mostazir; 02 Mar 2015, 20:54.
    Roman

  • #2
    Hi Roman,

    I may not be completely understanding what you are looking for. I have a couple clarifying questions before I make any suggestions.

    In your sample data it looks like the same individual has multiple measurements at baseline and at follow up. Is this how I should understand it? If yes, are these scores for different variables or is it the same variable measured multiple times during baseline and during follow up?

    Patrick


    • #3
      Hi Patrick, thanks for replying. Yes, you are right. The same individuals are measured repeatedly on a main outcome variable, which is not shown here; it is measured 10 times a day, 6 days a week, at both baseline and follow up. But 'kimsob' is measured only once at baseline and once at follow up. I could create a change score via the tagging route (perhaps; I haven't tried yet), but I wonder whether there is a better solution that subtracts the baseline 'kimsob' from the follow-up 'kimsob' within each id. I hope that clarifies it.
      Roman


      • #4
        Code:
        clear
        input id   pre     treat   kimsob  
         1   2010   1       43  
         1   2010   1       43  
         1   2010   1       43  
         1   2010   1       43  
         1   2010   1       43  
         1   2010   1       43  
         1   2010   1       43  
         1   2010   1       43  
         1   2010   1       43  
         1   2015   1       50  
         1   2015   1       50  
         1   2015   1       50  
         1   2015   1       50  
         1   2015   1       50  
         1   2015   1       50
         2   2010   1       83  
         2   2010   1       83  
         2   2010   1       83  
         2   2010   1       83  
         2   2010   1       83  
         2   2010   1       83  
         2   2010   1       83  
         2   2010   1       83  
         2   2010   1       83  
         2   2015   1       60  
         2   2015   1       60  
         2   2015   1       60  
         2   2015   1       60  
         2   2015   1       60  
         2   2015   1       60
         end
         
         list
         collapse (mean) kimsob, by(id pre treat)
         list
         sort id pre
         by id: generate dif=kimsob[2]-kimsob[1]
         list
         by id: keep if _n==1
         drop pre kimsob
         list
        Obviously, the list statements are only there to illustrate what is going on after each step.

        For these two patients, this produces:
        Code:
             +------------------+
             | id   treat   dif |
             |------------------|
          1. |  1       1     7 |
          2. |  2       1   -23 |
             +------------------+
        This weakly assumes that each patient id is either treatment or control, not both. It is not difficult to modify for that case as well if needed.

        Best regards, Sergiy Radyakin


        • #5
          Thanks for your reply, Sergiy. Unfortunately, the solution I am looking for must not lose any rows, because each row contains information about the main outcome. The collapse command replaces my working dataset in memory and also reduces the number of rows, so it is not an ideal solution.
          Roman


          • #6
            Why not this?

            Code:
             
            bysort id (pre) : gen change = kimsob[_N] - kimsob[1]
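
            A minimal sketch of how this one-liner behaves, on made-up toy data. It assumes 'pre' is numeric (or a labelled numeric variable) that sorts baseline before follow up, and that 'kimsob' is constant within each id and period:

            Code:
            clear
            input id pre kimsob
            1 0 43
            1 0 43
            1 1 50
            1 1 50
            2 0 83
            2 1 60
            end
            * within each id, sorted by pre: last value minus first value
            bysort id (pre): gen change = kimsob[_N] - kimsob[1]
            list, sepby(id)

            Every row is kept, and each row of an id carries the same change score (7 for id 1, -23 for id 2 here).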


            • #7
              or this:
              Code:
              bysort id: gen change=cond(kimsob != kimsob[_n-1], kimsob-kimsob[_n-1], .)
              With this you create a variable that is missing for all records where kimsob has not changed and holds the difference where it has changed.
              If you prefer it to contain 0 instead of missing when kimsob did not change, specify "0" instead of "." in the cond() function.
              See help cond()

              You can also type:
              Code:
              gen change = 0   // (or . if you prefer)
              bysort id: replace change = kimsob-kimsob[_n-1] if kimsob != kimsob[_n-1]
              Greetings, Klaudia
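
              A small sketch of what this approach returns, on made-up toy data and using the variable name kimsob from the original listing. It assumes the rows of each id are ordered with baseline before follow up, which is why 'pre' (assumed numeric) is added as a sort key here. The last step, which spreads the value to all rows with egen, is an optional extra and not part of the suggestion above:

              Code:
              clear
              input id pre kimsob
              1 0 43
              1 0 43
              1 1 50
              1 1 50
              end
              bysort id (pre): gen change = cond(kimsob != kimsob[_n-1], kimsob - kimsob[_n-1], .)
              * change is missing on the first row of an id (no previous row),
              * 7 on the first follow-up row, and missing on all other rows
              egen change_all = max(change), by(id)   // copy the single non-missing value to every row of the id
              list

              Note that the first row of each id ends up missing rather than 0 in either variant, because kimsob[_n-1] does not exist there.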


              • #8
                Dear Nick and Klaudia,

                Many thanks to both of you for your input. Both worked perfectly and are exactly what I was looking for. My gratitude.

                All the best,
                Roman


                • #9
                  I have a similar question: I want to subtract across rows (vertically), but my observations vary and there are very many of them, maybe 4555000.


                  country   time   quarter   var1           var2
                  US        2000   1         sales taxes    560
                  US        2000   1         custom taxes   600
                  US        2000   1         VAT            600



                  I want to generate a new variable that subtracts custom taxes from sales taxes (sales taxes - custom taxes). There are several other observations in between, but I want to condition on these two only. Can anyone help with this, please? Thank you!


                  • #10
                    Use dataex to provide a data example (FAQ Advice #12). It matters whether "var1" is a string variable or a numeric variable with value labels.


                    • #11
                      Andrew Musau gives excellent advice. It looks to me as if you need a reshape wide and then you are able to subtract variables, not observations (rows in your terminology).
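
                      A minimal sketch of that idea, under assumptions that only a dataex example could confirm: that "var1" is a string variable, and that country, time, quarter and var1 together identify each row uniquely. The toy data and the name "diff" are made up for illustration:

                      Code:
                      clear
                      input str2 country time quarter str20 var1 var2
                      "US" 2000 1 "sales taxes" 560
                      "US" 2000 1 "custom taxes" 600
                      "US" 2000 1 "VAT" 600
                      end
                      * turn the category labels into legal name suffixes, e.g. "sales taxes" -> "sales_taxes"
                      replace var1 = strtoname(var1)
                      * one row per country-time-quarter, one var2* column per category
                      reshape wide var2, i(country time quarter) j(var1) string
                      gen diff = var2sales_taxes - var2custom_taxes
                      list

                      If "var1" is instead numeric with value labels, the string option would be dropped and the new variables would be named after the numeric codes.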
