
  • How to get means for cumulative lag periods in a case-crossover design?

    Hello,

    I'm having trouble generating a mean variable for cumulative lag periods (2-3 days, 2-4 days, 2-5 days, etc.). The data example below shows case days (lag days 2 and 3) and control days, along with the corresponding SO2 and apparent temp values. I'd like to get the mean SO2 and apparent temp over lag days 2 and 3, so that each subject has one mean value for the case days (lag days 2 and 3) and three mean values for the control days (the days corresponding to lag days 2 and 3).

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float id double ndate byte case float(so2_pred_ppb apptemp)
    13 18282 0  24.16576  -13.62427
    13 18283 0  21.32349  -14.36305
    13 18289 0  22.08568  -9.079194
    13 18290 0  19.47187 -12.661038
    13 18296 1 22.098206 -14.340417
    13 18297 1  23.73793 -13.093917
    13 18303 0 17.472273 -13.639982
    13 18304 0  32.25117 -14.520927
    14 18284 0  27.81144 -14.220161
    14 18285 0 28.787304  -14.06131
    14 18291 0  20.79549 -10.927979
    14 18292 0 25.071177 -12.635675
    14 18298 1  9.025553  -12.27794
    14 18299 1  15.59949 -12.816295
    14 18305 0  33.91532  -14.02643
    14 18306 0 19.734413 -14.194155
    end
    format %td ndate
    I've searched around a lot but haven't really found a solution to it. Maybe my search wording was not good enough. I'd really appreciate it if someone could help me with this.

    TIA,
    Temuulen
    Last edited by Temuulen Enebish; 03 May 2018, 04:54.

  • #2
    Your post has been up for almost 6 hours now and hasn't gotten a response. While I can speak only for myself, I'm not responding because I don't understand what you want. You have used -dataex- to show example data, for which thanks. But I don't know what you mean by things like "mean values for lag days 2 and 3." If you could explain that, or, better still, give an example of how you would calculate it by hand from the example data you show, I think you will get help quickly.


    • #3
      Originally posted by Clyde Schechter
      Your post has been up for almost 6 hours now and hasn't gotten a response. While I can speak only for myself, I'm not responding because I don't understand what you want. You have used -dataex- to show example data, for which thanks. But I don't know what you mean by things like "mean values for lag days 2 and 3." If you could explain that, or, better still, give an example of how you would calculate it by hand from the example data you show, I think you will get help quickly.
      Hi Clyde, thanks very much for letting me know.

      In a case-crossover design, each subject serves as his or her own control, and exposures on event days (case days) are compared with exposures on referent days (control days) to look at transient effects. In the example I provided, the event date is the birth date, and I'm using lag days 2 and 3, which are 2 and 3 days before the birth date, respectively. I used the asymmetric bidirectional method to choose my control days, which are 14 days and 7 days before and 7 days after the event. Ideally, my output would look like the example below. I'm using the date for lag day 3 as the new date variable for the mean so2 and apparent temp. Each mean is calculated over the two consecutive days, separately for so2 and apparent temp.

      Code:
      clear
      input float id double ndate byte case float(mean_so2_pred_ppb mean_apptemp)
      13 18282 0  22.74  -13.99
      13 18289 0  20.78  -10.87
      13 18296 1  22.92  -13.72
      13 18303 0  24.86  -14.08
      end
      format %td ndate
      I really hope this makes it somewhat clearer. My apologies for the incomplete question.


      • #4
        I'm sorry, but this is no clearer to me. In the results you post in #3, let's look at the first observation. Where do the 22.74 and -13.99 come from? How do they relate to the data you show in #1? This observation in #3 is dated 18282 (20jan2010), id 13. There is an observation with this id and date in #1 as well, but it contains different numbers for so2 and apptemp. I gather that you intend mean_so2_pred_ppb and mean_apptemp to be means of some values in the #1 data set, perhaps the averages of these measures 2 and 3 days before the value of ndate: but there are no observations with such dates! So where do these numbers come from?


        • #5
          Originally posted by Clyde Schechter
          I'm sorry, but this is no clearer to me. In the results you post in #3, let's look at the first observation. Where do the 22.74 and -13.99 come from? How do they relate to the data you show in #1? This observation in #3 is dated 18282 (20jan2010), id 13. There is an observation with this id and date in #1 as well, but it contains different numbers for so2 and apptemp. I gather that you intend mean_so2_pred_ppb and mean_apptemp to be means of some values in the #1 data set, perhaps the averages of these measures 2 and 3 days before the value of ndate: but there are no observations with such dates! So where do these numbers come from?
          I want to get mean values of SO2 and temp for lag days 2 and 3 of both case and control periods. The number 22.74 is the mean of 24.16576 and 21.32349 (first 2 observations for so2 in my example). This is the mean value of SO2 for my first set of control period for subject #13 (each subject has 1 cumulative case period and 3 cumulative control periods), in this case 18282 and 18283. In the same vein, -13.99 is the mean apparent temp of -13.62427 and -14.36305.
          In the output, my use of date for lag day 3 might have been confusing. That could just as well be a string variable indicating it's for a cumulative period of 2-3 lag days. The date will not be used for analysis after this step.
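
          The hand calculation described above can be checked directly in Stata. This is only a quick arithmetic sketch using the first two observations for id 13 from #1; the display formats are my own choice.

          ```stata
          * Hand check of the first control period for id 13 (values copied from #1)
          display %6.2f (24.16576 + 21.32349)/2       // mean SO2: 22.74
          display %6.2f (-13.62427 + -14.36305)/2     // mean apparent temp: -13.99
          ```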


          • #6
            I'm still not sure I get this. I think that your use of the term "lag days 2 and 3" is confusing me. To me that means, for an observation of any given date, the observations dated 2 and 3 days earlier than the current observation. But you don't have any observations like that anywhere in your example data.

            It appears, instead, that you are simply grouping your data into pairs of days and then reducing to a single observation that averages the results for the data within pairs. If that's what you want, this will do it:

            Code:
            //    MARK PAIRS OF DATES
            by id (ndate), sort: gen pair = ceil(_n/2)
            
            //    VERIFY CONSISTENCY OF CASE STATUS WITHIN PAIRS
            by id pair (case), sort: assert case[1] == case[_N]
            
            //    REDUCE TO ONE OBSERVATION PER PAIR
            //    (min) FOR ndate: ORDER WITHIN A PAIR IS NOT GUARANTEED AFTER SORTING ON case
            collapse (min) ndate (first) case (mean) so2_pred_ppb apptemp, by(id pair)
            rename (so2_pred_ppb apptemp) mean_=
            The above code produces the results you show in #3 when applied to the example data shown in #1.

            I hope this is what you are looking for.
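
            For the longer cumulative periods mentioned in #1 (2-4 days, 2-5 days, etc.), the fixed pairing with ceil(_n/2) would no longer fit. One possible variation, offered only as a sketch, is to start a new block whenever the dates within an id are not consecutive. The variable name block is hypothetical, and the data are the id 13 rows copied from #1. It assumes the case and control windows never touch each other; if they did, blocks would merge and the assert would fail.

            ```stata
            clear
            input float id double ndate byte case float(so2_pred_ppb apptemp)
            13 18282 0  24.16576  -13.62427
            13 18283 0  21.32349  -14.36305
            13 18289 0  22.08568  -9.079194
            13 18290 0  19.47187 -12.661038
            13 18296 1 22.098206 -14.340417
            13 18297 1  23.73793 -13.093917
            13 18303 0 17.472273 -13.639982
            13 18304 0  32.25117 -14.520927
            end
            format %td ndate

            // Start a new block at each id's first date or after a gap of more than one day
            by id (ndate), sort: gen block = sum(_n == 1 | (ndate - ndate[_n-1] > 1))

            // Case status should not change within a block
            by id block (case), sort: assert case[1] == case[_N]

            // One observation per block: earliest date, case flag, and means
            collapse (min) ndate (first) case (mean) so2_pred_ppb apptemp, by(id block)
            rename (so2_pred_ppb apptemp) mean_=
            list, sepby(id)
            ```

            With two-day periods this gives the same grouping as the pair variable above; with three or more consecutive days per period it should group them without changing the code.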


            • #7
              Thanks very much! Your code worked perfectly. I'm sorry for not being very clear. I didn't show my original current observations (birthdates) in my example. I had derived these lag days 2 and 3 from my original dataset in order to assign exposure estimates for these dates. Then I was stuck on how to get mean values for the cumulative lag period. I should have included my original dataset and described how I got the above example dataset. I'll make sure to include those in my next posts.

              Thanks again,
              Temuulen
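
              For later readers, the derivation step described above might look roughly like the sketch below. It is not the code actually used. The variable names birthdate, shift and lag are hypothetical, and the two birthdates are back-calculated from the case days in #1 on the assumption that lag day 2 is two days before birth. The control offsets (-14, -7 and +7 days) follow the description in #3.

              ```stata
              clear
              input float id double birthdate
              13 18299
              14 18301
              end

              // One copy per period: three control periods plus the case period
              expand 4
              by id, sort: gen shift = cond(_n == 1, -14, cond(_n == 2, -7, cond(_n == 3, 0, 7)))
              gen byte case = shift == 0

              // Two lag days (2 and 3) within each period
              expand 2
              by id shift, sort: gen lag = 1 + _n

              // Calendar date of each lag day, moved by the control offset
              gen double ndate = birthdate - lag + shift
              format %td ndate
              sort id ndate
              list id ndate case lag, sepby(id)
              ```

              Under those assumptions this should reproduce the id, ndate and case columns of the example in #1, ready for the exposure estimates to be merged on.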
