Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Identify consecutive weeks around missing survey date values

    Building on a prior question (https://www.statalist.org/forums/for...rows-for-dates), I am now trying to figure out a somewhat convoluted task. I have panel survey data over 30 days and am trying to figure out how to deal with cases where respondents missed a day or two of the survey.

    My end goal is to recover as many complete weeks as possible, where week is defined simply as a 7-day period (not tied to any specific days of the week).

    So, what I want is to create a new variable called 'week' that denotes complete weeks, i.e., all consecutive 7 day stretches where there are no missing surveys.

    Here's some data for two respondents, one of whom gets interrupted by a missing survey in the first week, another in the third week:

    Code:
    clear
    input int pptid str9 survey_date float surveynum float num_survresponses str9 first_survey_date str9 last_survey_date float span_days float surv_missing
    100 "24feb2019" 1 29 "24feb2019" "22mar2019" 30 0
    100 "25feb2019" 2 29 "24feb2019" "22mar2019" 30 0
    100 "26feb2019" 3 29 "24feb2019" "22mar2019" 30 0
    100 "27feb2019" 4 29 "24feb2019" "22mar2019" 30 0
    100 "28feb2019" 5 29 "24feb2019" "22mar2019" 30 0
    100 "28feb2019" 6 29 "24feb2019" "22mar2019" 30 0
    100 "28feb2019" 7 29 "24feb2019" "22mar2019" 30 0
    100 "28feb2019" 8 29 "24feb2019" "22mar2019" 30 0
    100 "01mar2019" 9 29 "24feb2019" "22mar2019" 30 0
    100 "02mar2019" 10 29 "24feb2019" "22mar2019" 30 0
    100 "03mar2019" 11 29 "24feb2019" "22mar2019" 30 0
    100 "04mar2019" 12 29 "24feb2019" "22mar2019" 30 0
    100 "05mar2019" 13 29 "24feb2019" "22mar2019" 30 0
    100 "06mar2019" 14 29 "24feb2019" "22mar2019" 30 0
    100 "07mar2019" 15 29 "24feb2019" "22mar2019" 30 0
    100 "08mar2019" 16 29 "24feb2019" "22mar2019" 30 0
    100 "09mar2019" 17 29 "24feb2019" "22mar2019" 30 0
    100 "10mar2019" 18 29 "24feb2019" "22mar2019" 30 0
    100 "11mar2019" 19 29 "24feb2019" "22mar2019" 30 0
    100 "12mar2019" 20 29 "24feb2019" "22mar2019" 30 0
    100 "13mar2019" . 29 "24feb2019" "22mar2019" 30 1
    100 "14mar2019" 21 29 "24feb2019" "22mar2019" 30 0
    100 "15mar2019" 22 29 "24feb2019" "22mar2019" 30 0
    100 "16mar2019" 23 29 "24feb2019" "22mar2019" 30 0
    100 "17mar2019" 24 29 "24feb2019" "22mar2019" 30 0
    100 "18mar2019" 25 29 "24feb2019" "22mar2019" 30 0
    100 "19mar2019" 26 29 "24feb2019" "22mar2019" 30 0
    100 "20mar2019" 27 29 "24feb2019" "22mar2019" 30 0
    100 "21mar2019" 28 29 "24feb2019" "22mar2019" 30 0
    100 "22mar2019" 29 29 "24feb2019" "22mar2019" 30 0
    200 "29apr2019" 1 29 "29apr2019" "27may2019" 30 0
    200 "30apr2019" 2 29 "29apr2019" "27may2019" 30 0
    200 "01may2019" 3 29 "29apr2019" "27may2019" 30 0
    200 "02may2019" 4 29 "29apr2019" "27may2019" 30 0
    200 "03may2019" 5 29 "29apr2019" "27may2019" 30 0
    200 "04may2019" 6 29 "29apr2019" "27may2019" 30 0
    200 "05may2019" . 29 "29apr2019" "27may2019" 30 1
    200 "06may2019" 7 29 "29apr2019" "27may2019" 30 0
    200 "07may2019" 8 29 "29apr2019" "27may2019" 30 0
    200 "08may2019" 9 29 "29apr2019" "27may2019" 30 0
    200 "09may2019" 10 29 "29apr2019" "27may2019" 30 0
    200 "10may2019" 11 29 "29apr2019" "27may2019" 30 0
    200 "11may2019" 12 29 "29apr2019" "27may2019" 30 0
    200 "12may2019" 13 29 "29apr2019" "27may2019" 30 0
    200 "13may2019" 14 29 "29apr2019" "27may2019" 30 0
    200 "14may2019" 15 29 "29apr2019" "27may2019" 30 0
    200 "15may2019" 16 29 "29apr2019" "27may2019" 30 0
    200 "16may2019" 17 29 "29apr2019" "27may2019" 30 0
    200 "17may2019" 18 29 "29apr2019" "27may2019" 30 0
    200 "18may2019" 19 29 "29apr2019" "27may2019" 30 0
    200 "19may2019" 20 29 "29apr2019" "27may2019" 30 0
    200 "20may2019" 21 29 "29apr2019" "27may2019" 30 0
    200 "21may2019" 22 29 "29apr2019" "27may2019" 30 0
    200 "22may2019" 23 29 "29apr2019" "27may2019" 30 0
    200 "23may2019" 24 29 "29apr2019" "27may2019" 30 0
    200 "24may2019" 25 29 "29apr2019" "27may2019" 30 0
    200 "25may2019" 26 29 "29apr2019" "27may2019" 30 0
    200 "26may2019" 27 29 "29apr2019" "27may2019" 30 0
    200 "27may2019" 29 29 "29apr2019" "27may2019" 30 0
    end

    And to help clarify what I'm looking for, here's what that new 'week' variable would ideally look like for id=100:

    Code:
    clear
    input int pptid str9 survey_date float surveynum float num_survresponses str9 first_survey_date str9 last_survey_date float span_days float surv_missing float week
    100 "24feb2019" 1 29 "24feb2019" "22mar2019" 30 0 1
    100 "25feb2019" 2 29 "24feb2019" "22mar2019" 30 0 1
    100 "26feb2019" 3 29 "24feb2019" "22mar2019" 30 0 1
    100 "27feb2019" 4 29 "24feb2019" "22mar2019" 30 0 1
    100 "28feb2019" 5 29 "24feb2019" "22mar2019" 30 0 1
    100 "28feb2019" 6 29 "24feb2019" "22mar2019" 30 0 1
    100 "28feb2019" 7 29 "24feb2019" "22mar2019" 30 0 1
    100 "28feb2019" 8 29 "24feb2019" "22mar2019" 30 0 2
    100 "01mar2019" 9 29 "24feb2019" "22mar2019" 30 0 2
    100 "02mar2019" 10 29 "24feb2019" "22mar2019" 30 0 2
    100 "03mar2019" 11 29 "24feb2019" "22mar2019" 30 0 2
    100 "04mar2019" 12 29 "24feb2019" "22mar2019" 30 0 2
    100 "05mar2019" 13 29 "24feb2019" "22mar2019" 30 0 2
    100 "06mar2019" 14 29 "24feb2019" "22mar2019" 30 0 2
    100 "07mar2019" 15 29 "24feb2019" "22mar2019" 30 0 .
    100 "08mar2019" 16 29 "24feb2019" "22mar2019" 30 0 .
    100 "09mar2019" 17 29 "24feb2019" "22mar2019" 30 0 .
    100 "10mar2019" 18 29 "24feb2019" "22mar2019" 30 0 .
    100 "11mar2019" 19 29 "24feb2019" "22mar2019" 30 0 .
    100 "12mar2019" 20 29 "24feb2019" "22mar2019" 30 0 .
    100 "13mar2019" . 29 "24feb2019" "22mar2019" 30 1 .
    100 "14mar2019" 21 29 "24feb2019" "22mar2019" 30 0 3
    100 "15mar2019" 22 29 "24feb2019" "22mar2019" 30 0 3
    100 "16mar2019" 23 29 "24feb2019" "22mar2019" 30 0 3
    100 "17mar2019" 24 29 "24feb2019" "22mar2019" 30 0 3
    100 "18mar2019" 25 29 "24feb2019" "22mar2019" 30 0 3
    100 "19mar2019" 26 29 "24feb2019" "22mar2019" 30 0 3
    100 "20mar2019" 27 29 "24feb2019" "22mar2019" 30 0 .
    100 "21mar2019" 28 29 "24feb2019" "22mar2019" 30 0 .
    100 "22mar2019" 29 29 "24feb2019" "22mar2019" 30 0 .
    end

  • #2
    Your example confuses me and does not seem to fit with your description in words. Looking at id 100, the first seven observations are not on consecutive dates: the last three of them are all on 28 February. Yet according to your example of what you want the variable to look like, you show those seven observations as a "week," even though it spans only 5 days, and you start a new week with the next observation (observation 8) even though it, too, is date 28 February.

    Moving ahead to the 14 March observation for id 100, we see that there is a non-missing survey every day from then through 20 March (7 days). But you cut that "week 3" short at the 6th day, 19 March. Why?

    Can you please either fix the example data to match what you said, or clarify what you said so that the strange choices made in the example data are explained? Thanks.

    Comment

    Working...
    X