  • Splitting observations or creating new observations in Stata

    Hello again!
    I am undertaking an analysis of morbidity in infants. The infants were followed monthly for 1 year. Each observation corresponds to one visit. At each visit data on up to two episodes of illness were collected. I need to split these observations so that I have one observation per episode of illness per infant. Can anyone advise how best to do this? This only applies to about 31 observations, so numbers are very small.

    Thanks and best wishes,
    Maureen.

  • #2
    You don't tell us anything about the variables in the data set, but under reasonable assumptions, the -reshape- command will probably solve your problem. Without knowing more about the variables, I can't give you more specifics, but the on-line help file for -reshape- is pretty clear.


    • #3
      Thanks for the pointer. I will take a look at reshape and get back to you if I need more help.


      • #4
        Oh and just to clarify, it is 31 observations out of a total of over 270k, so not all observations need to be reshaped...


        • #5
          Well, you might start by splitting your data into two sets: one with just the 31 observations, and the other with the rest (with the second-illness variables dropped, as they should all be missing anyway). Then -reshape- just the 31 and -append- the result to the rest. -reshape- can be slow in large data sets, so isolating the small number of observations that need it may save you a noticeable amount of execution time.
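
          The split, -reshape-, and -append- steps described above could be sketched roughly as follows. This is only an illustration on toy data: the variable names (id, visit, illness1, illness2, dur1, dur2, episode) are assumptions, since the thread never describes the actual variables, and the real data would replace the -input- block.

          ```stata
          * Toy data; every variable name here is a hypothetical stand-in
          clear
          input id visit illness1 dur1 illness2 dur2
          1 1 3 5 . .
          1 2 2 4 7 2
          2 1 1 3 . .
          2 2 4 6 5 1
          end

          * Flag the visits that recorded a second episode
          gen byte has2 = !missing(illness2)

          * Set aside the single-episode visits, already in the long layout
          preserve
          keep if !has2
          drop illness2 dur2 has2
          rename (illness1 dur1) (illness dur)
          gen byte episode = 1
          tempfile rest
          save `rest'
          restore

          * Reshape only the flagged visits: one row per episode
          keep if has2
          drop has2
          reshape long illness dur, i(id visit) j(episode)

          * Recombine the two pieces and restore the sort order
          append using `rest'
          sort id visit episode
          ```

          Renaming illness1 and dur1 in the larger piece before saving keeps the variable names aligned with what -reshape long- produces, so -append- stacks the two pieces into the same columns.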


          • #6
            It sounds like you might want -expand-. Let's suppose you have two illness variables, illness1 and illness2, and your 31 observations are just those that have both variables not missing:

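            Code:
            * flag visits that recorded two episodes
            gen byte has2 = !mi(illness1) & !mi(illness2)
            * duplicate the flagged visits; new == 1 marks the added copy
            expand 2 if has2, gen(new)
            * the added copy carries the second episode
            replace illness1 = illness2 if new
            drop illness2 new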
            Hope this helps,
            Jeph


            • #7
              Jeph's approach would be better than mine if the information about the illnesses consists of only one or a handful of variables, as his code illustrates. His approach gets complicated if there are many variables about each illness, and then the -reshape- approach would be simpler.
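
              To make the trade-off concrete, here is a rough sketch of how the -expand- approach might look with two variables per episode. The stubs illness and dur are hypothetical; every further per-episode variable would need its own entry in the loop, which is where the bookkeeping grows compared with a single -reshape long- call.

              ```stata
              * Toy data; variable names are hypothetical stand-ins
              clear
              input id visit illness1 dur1 illness2 dur2
              1 1 3 5 . .
              1 2 2 4 7 2
              end

              * Flag visits with two episodes and duplicate them
              gen byte has2 = !missing(illness1) & !missing(illness2)
              expand 2 if has2, gen(new)

              * Move each second-episode variable into the added copy
              foreach stub in illness dur {
                  replace `stub'1 = `stub'2 if new
                  drop `stub'2
              }
              drop has2 new
              ```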


              • #8
                OK, I will give these a go and get back to you if there are any problems.
                Thanks!


                • #9
                  Maureen said that data on "up to two episodes" of illness were collected, so I took this literally. However, Clyde is correct that if you have many more than two, my approach gets very cumbersome.


                  • #10
                    Hi,
                    Sorry for not responding sooner; I was moving house!!
                    Anyway, Jeph, I just wanted to let you know that the code worked beautifully.

                    Thanks ever so much for your help,
                    Maureen.
