  • Missing values between 2 time points

    Dear all,

    I hope everything is going well with you.
    I am working on medication prescription data and I am having trouble recoding missing values that fall between the first date (first medication supply date) and the last date (last medication supply date). The data look like this:

    input byte(id aspirin2006 clopidogrel2006 heparin2006 aspirin2007 clopidogrel2007 heparin2007 aspirin2008 clopidogrel2008 heparin2008 aspirin2009 clopidogrel2009 heparin2009 aspirin2010 clopidogrel2010 heparin2010) str12 first_date int last_date
    1 . . . 1 0 0 . . . . . . 0 0 1 2007 2010
    2 0 0 1 . . . 0 1 0 . . . 0 1 0 2006 2010
    3 . . . 0 1 0 . . . . . . 1 0 0 2007 2009
    end

    I've tried the following command:

    foreach d of varlist aspirin2008-heparin2010 {
    replace `d'=0 if first_date==2007 & last_date==2010 & `d'==.
    }

    But this does not work, and I think something is wrong: it also replaces missing values before the first date.
    I would really appreciate it if you could help me solve the problem.


    Thank you in advance
    Sincerely
    Oyunchimeg

  • #2
    Buyada:
    your code works for me after -destring-ing -first_date-:
    Code:
    . destring first_date, replace
    first_date: all characters numeric; replaced as int
    
    . foreach d of varlist aspirin2008-heparin2010 {
      2.
    .  replace `d'=0 if first_date==2007 & last_date==2010 & `d'==.
      3.
    .  }
    (1 real change made)
    (1 real change made)
    (1 real change made)
    (1 real change made)
    (1 real change made)
    (1 real change made)
    (0 real changes made)
    (0 real changes made)
    (0 real changes made)
    
    .
    As an aside, you would probably be better off reshaping your data from -wide- to -long- format (see -help reshape-).
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      Dear Mr. Lazzaro,

      Thank you so much for your answer.
      Originally the data were in -long- format, but I reshaped them to -wide- format to generate a medication list by year. The command runs, but the problem is that it also replaces missing values before the first date.
      For example, case 1 works:
      foreach d of varlist aspirin2006-heparin2010 {
          replace `d' = 0 if first_date == 2008 & last_date == 2010 & `d' == .
      }

      But after I ran the command again with a different first date, all missing values from 2007 onward were replaced (even though the first person's first date was 2008, not 2007). I want to change only the missing values between the first and last dates; missing values before the first date should not be changed.

      foreach d of varlist aspirin2006-heparin2010 {
          replace `d' = 0 if first_date == 2007 & last_date == 2010 & `d' == .
      }
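
      The same behaviour can be reproduced on the three-row example from the first post. This is only a minimal sketch, assuming first_date and last_date are read in as numeric variables:

```stata
clear
input byte(id aspirin2006 clopidogrel2006 heparin2006 aspirin2007 clopidogrel2007 heparin2007 aspirin2008 clopidogrel2008 heparin2008 aspirin2009 clopidogrel2009 heparin2009 aspirin2010 clopidogrel2010 heparin2010) int(first_date last_date)
1 . . . 1 0 0 . . . . . . 0 0 1 2007 2010
2 0 0 1 . . . 0 1 0 . . . 0 1 0 2006 2010
3 . . . 0 1 0 . . . . . . 1 0 0 2007 2009
end

* The -if- condition picks the person, but the loop still visits
* every year's variables, including those before first_date.
foreach d of varlist aspirin2006-heparin2010 {
    replace `d' = 0 if first_date == 2007 & last_date == 2010 & `d' == .
}

* For id 1 the 2006 values are now 0 although first_date is 2007.
list id aspirin2006 clopidogrel2006 heparin2006 first_date last_date
```

      The condition selects rows by first_date and last_date, but nothing ties the year in each variable name to those dates, so values before the first date are overwritten too.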


      I would really appreciate it if you could find time to help me solve the problem.

      Thank you,
      Sincerely
      Oyunchimeg



      • #4
        Well, you may want the data in wide layout for ultimate display, but you won't be able to do this task (or at least not easily) that way. So you need to reshape it long, and fill in the missing values, and then, if you really need it wide, reshape the result back to wide:

        Code:
        clear
        input byte(id aspirin2006 clopidogrel2006 heparin2006 aspirin2007 clopidogrel2007 heparin2007 aspirin2008 clopidogrel2008 heparin2008 aspirin2009 clopidogrel2009 heparin2009 aspirin2010 clopidogrel2010 heparin2010) str12 first_date int last_date
        1 . . . 1 0 0 . . . . . . 0 0 1 2007 2010
        2 0 0 1 . . . 0 1 0 . . . 0 1 0 2006 2010
        3 . . . 0 1 0 . . . . . . 1 0 0 2007 2009
        end
        
        destring first_date last_date, replace
        
        reshape long aspirin clopidogrel heparin, i(id) j(year)
        
        foreach v of varlist aspirin clopidogrel heparin {
            replace `v' = 0 if missing(`v') & inrange(year, first_date, last_date)
        }
        
        //    SKIP THE NEXT COMMAND UNLESS YOU REALLY NEED THE
        //    RESULTS IN WIDE LAYOUT
        reshape wide
        It can't be said too often that most Stata commands work best (or only) with data in long layout. Wide layout has limited uses. So the general rule should be to set up your data in long layout and do your analyses that way. If, at the end, for a few oddball commands, or for graphing or making data displays for human eyes to read, you need the data wide, you -reshape wide- at the end. But you generally should avoid the wide layout until data management and analysis are complete.
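
        For comparison, a wide-layout version is possible, though clumsier. The following is only a sketch, assuming the variables are named <drug><year> for the years 2006 to 2010 and that first_date and last_date are numeric; the year range and the drug list are hard-coded for illustration and would need adjusting for real data:

```stata
clear
input byte(id aspirin2006 clopidogrel2006 heparin2006 aspirin2007 clopidogrel2007 heparin2007 aspirin2008 clopidogrel2008 heparin2008 aspirin2009 clopidogrel2009 heparin2009 aspirin2010 clopidogrel2010 heparin2010) int(first_date last_date)
1 . . . 1 0 0 . . . . . . 0 0 1 2007 2010
2 0 0 1 . . . 0 1 0 . . . 0 1 0 2006 2010
3 . . . 0 1 0 . . . . . . 1 0 0 2007 2009
end

* Assumption: the dates are read as numeric here, so no -destring- is needed.
forvalues y = 2006/2010 {
    foreach drug in aspirin clopidogrel heparin {
        * Fill a year's column only for people whose window covers that year
        replace `drug'`y' = 0 if missing(`drug'`y') & inrange(`y', first_date, last_date)
    }
}

list id aspirin2006 aspirin2008 first_date last_date
```

        Every additional drug or year means editing the loops by hand, which is one more reason to prefer the long layout.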



        • #5
          I've reshaped from -wide- to -long- format and it worked.

          Thank you so much, Professor Schechter.

          Sincerely,
          Oyunchimeg
