Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • convert 'n' rows of a variable to 'n' columns

    Hi,

    I would appreciate if I could get help with creating new variables from rows of an existing variable. I am doing a survival analysis (recurrent events). I have nearly 45000 subjects who have intervals varying from 1 to 17. For each interval, I need to get the cumulative dose of the medication. The number of individual medication varies from 1 to 7. As I need one row for each interval per subject, I need to long format for each interval to wide format, meaning converting 'n' rows of variable drug code to 'n' columns of drug code (code1, code2, ... coden). I am unsuccessful in creating loop for each interval per subject id.

    I am giving an example of dataset.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id interval code cdose)
    1 1 10  100
    1 1 15  150
    1 1 50  200
    1 2 10 1000
    1 2 15 1000
    2 1 10  500
    2 1 30  250
    2 2 25  245
    2 2 30  250
    2 3 25  250
    2 3 30  500
    3 1 10  150
    3 1 20  200
    3 2 10  150
    4 1 15  150
    4 2 25  300
    4 2 30  300
    end

    As the intervals subject varies from subject to subject and also the number of medications used varies from interval to interval, I have been struggling to get single row for each interval.

    I very much appreciate help.

    With warm regards,
    Sateesh

  • #2
    Can you show me the exact code for loop you tried? What error code was returned?

    Also: what result do you want? So for example, for id 1 and interval 1, do you want the result to be something like "100+150+200", and for id 1 interval 2, are you looking for "1000+1000"?

    Comment


    • #3
      The Stata command for this general sort of operation is reshape. In other contexts like SQL, transpose is a synonym. I am not familiar with the stjoin and stsplit commands, but I think the former may also do some of what you want.
      Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

      When presenting code or results, please use the code delimiters format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

      Comment


      • #4
        Dear Jared,
        I was hoping the result like (as the maximum number of medications in my study are 7)

        id interval code1 cdose1 code2 cdose2 code3 cdose3 code4 cdose4 code5 cdose5 code6 cdose6 code7 cdose7

        Thanks

        Comment


        • #5
          Dear Welwen,
          Many thanks.
          Sateesh

          Comment


          • #6
            Code:
            bys id interval (code): gen j = _n
            reshape wide cdose code,i(id interval) j(j)

            Comment


            • #7
              Many thanks

              Comment

              Working...
              X