  • Drop variables in STATA if next year data is missing

    Hello, I need some help.

    I have a large unbalanced panel data set, and I need to drop a Person ID (PID) if its data for the next year are missing. See the sample:

    PID Year Explanatory variable
    1 2009 12
    1 2010 12
    1 2011 7
    1 2012 8
    1 2013 14
    2 2009 12
    3 2009 45
    3 2010 56
    3 2011 34
    3 2012 23
    4 2009 212
    5 2009 12
    5 2010 34

    At time 1, which is 2009, I observe each person's data; its impact is on the dependent variable in the following year (time 2). So if a PID is absent in the next year, its observations are useless to me, for example PIDs 2 and 4 above. How can I drop them? I cannot do it manually because the database is huge.

    Last edited by Subhan Shahid; 10 Apr 2020, 02:16.

  • #2
    There are no missing values here, just gaps or absent data. Further, your problem is about dropping observations, not variables.

    I am still unclear about your intended result. Suppose you drop an observation for 2012 because there is no 2013 value. Then you need to drop any observation for 2011 because there is no 2012 value.

    There is code to select complete panels only and code to identify runs of contiguous observations https://www.stata.com/support/faqs/d...-observations/ but I can't tell what you want.
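
    As a minimal sketch of what "no next-year data" could mean at the observation level, the following flags each observation that is followed by one for the next calendar year within the same PID. It re-enters the sample data from #1; the variable names x and has_next are made up for illustration.

    ```stata
    clear
    input PID Year x
    1 2009 12
    1 2010 12
    1 2011 7
    1 2012 8
    1 2013 14
    2 2009 12
    3 2009 45
    3 2010 56
    3 2011 34
    3 2012 23
    4 2009 212
    5 2009 12
    5 2010 34
    end

    * within each PID, sorted by Year, check whether the next
    * observation is exactly one year later (hypothetical flag name)
    bysort PID (Year): gen byte has_next = (Year[_n+1] == Year + 1)

    list PID Year x has_next, sepby(PID)
    ```

    The last observation of every panel necessarily gets has_next == 0, which is why dropping on such a flag and then recomputing it would cascade backwards through a panel.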




    • #3
      Originally posted by Nick Cox
      There are no missing values here, just gaps or absent data. Further, your problem is about dropping observations, not variables.

      I am still unclear about your intended result. Suppose you drop an observation for 2012 because there is no 2013 value. Then you need to drop any observation for 2011 because there is no 2012 value.

      There is code to select complete panels only and code to identify runs of contiguous observations https://www.stata.com/support/faqs/d...-observations/ but I can't tell what you want.

      Thank you. To give some more information: yes, I am talking about deleting observations, specifically those of PIDs whose data are available only for the year 2009. I need to keep a PID only if it has at least two years of data, and 2009 is always the base year; it cannot be 2010 or 2011. To be precise, in 2009 I capture individuals' personality characteristics, and in the following years (2010-2013) their intentions to enter entrepreneurship. I hope you can guide me on how to delete the observations of any PID that has only one year of data.



      • #4
        Code:
        bysort PID : drop if _N == 1
        drops single-observation panels.
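
        Applied to the sample data from #1, this can be checked as follows. The second step is only an assumption about the requirement in #3 that 2009 is always the base year: it additionally keeps only panels that contain a 2009 observation. The names x and has2009 are illustrative.

        ```stata
        clear
        input PID Year x
        1 2009 12
        1 2010 12
        1 2011 7
        1 2012 8
        1 2013 14
        2 2009 12
        3 2009 45
        3 2010 56
        3 2011 34
        3 2012 23
        4 2009 212
        5 2009 12
        5 2010 34
        end

        * drop panels that consist of a single observation (PIDs 2 and 4 here)
        bysort PID : drop if _N == 1

        * assumed extra condition: keep only panels that include the base year 2009
        egen byte has2009 = max(Year == 2009), by(PID)
        keep if has2009

        tabulate PID
        ```

        In the sample, PIDs 1, 3 and 5 remain, with 11 observations in total. Every panel there starts in 2009, so the second step drops nothing further; it matters only if some panels start later.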
