  • Dropping observations if missing certain dates

    I have incomplete data and would like to drop any observations for which I do not have data in a particular variable during a given time period. My best guess is to generate a tag with bysort and then drop everything without that tag. For example,

    Code:
    clear all
    ssc install dataex
    input long id float year float month float day float var1
    1 2017 7 1 125
    1 2017 8 1 200
    1 2017 9 1 108
    1 2018 4 1 20
    1 2018 5 1 50
    1 2018 7 1 73
    1 2018 8 1 18
    1 2018 9 1 20
    2 2018 7 1 32
    2 2018 8 1 29
    2 2018 9 1 18
    2 2018 4 1 103
    2 2018 5 1 24
    2 2018 7 1 .
    2 2018 8 1 .
    2 2018 9 1 .
    3 2017 7 1 20
    3 2017 8 1 .
    3 2017 9 1 .
    3 2018 7 1 73
    3 2018 8 1 18
    3 2018 9 1 20
    end
    If I only want to keep observations that have no missing data in var1 for July, August and September (months 7, 8, 9) of 2018, the final output ought to be

    Code:
    clear all
    ssc install dataex
    input long id float year float month float day float var1
    1 2017 7 1 125
    1 2017 8 1 200
    1 2017 9 1 108
    1 2018 4 1 20
    1 2018 5 1 50
    1 2018 7 1 73
    1 2018 8 1 18
    1 2018 9 1 20
    3 2017 7 1 20
    3 2017 8 1 .
    3 2017 9 1 .
    3 2018 7 1 73
    3 2018 8 1 18
    3 2018 9 1 20
    end
    Thank you very much for your help.

  • #2
    I don't think bysort is necessary. I think you're looking for something like this:
    Code:
    gen tag = missing(var1) & inrange(month,7,9) & year == 2018
    drop if tag
    Or just use -drop if- with the right conditions. Generating a tag first lets you check whether it will really drop the observations you want.
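
    Note that this tag marks only the rows where var1 itself is missing, while the desired output in the question removes id 2 entirely. If the goal is to drop every observation of any id that has a missing var1 in July to September 2018, a minimal sketch might look like the following. The variable names bad and anybad are hypothetical helpers, not part of the original data.

    ```stata
    * Flag rows where var1 is missing in July-September 2018
    gen byte bad = missing(var1) & inrange(month, 7, 9) & year == 2018

    * Spread the flag to every row of the same id (1 if any row is flagged)
    bysort id: egen byte anybad = max(bad)

    * Inspect what would be dropped before actually dropping it
    list id year month var1 if anybad

    drop if anybad
    drop bad anybad
    ```

    On the example data this should leave only ids 1 and 3, including the 2017 rows of id 3 where var1 is missing. It only catches months that are present with a missing value; an id whose July to September 2018 rows are absent altogether would not be flagged.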
