
  • Deleting a portion of panel observations after a condition is met

    Hi there

    I am working on a panel data set with multiple observations on each individual, identified by a unique ID variable (called "pid"). I also have a variable denoting labour force status (called "labforcestat"), which can be equal to 1, 2, or 3 for each individual at each wave. An individual's status can change between waves. Furthermore, observations on a person may not be continuous (there may be missing waves).

    I want to delete the current observation and all subsequent observations on an individual from the point that their labour force status is 2 for a second time. However, I want to keep the observations on that person prior to this occurring. For example, in the data set below, I would want to keep all observations on person 1, but delete the observations on person 2 that occurred in waves 7 and 8.

    If it is relevant/helpful, I have managed to create a variable called newspell which equals 1 when an individual's labour force status changes (or the first time they are observed) and is missing otherwise.

    I am completely stuck, so any help you can offer would be wonderful. If I have not explained clearly enough, I apologise, and please feel free to ask for more clarification.
    pid       wave  labforcestat
    person 1  1     1
    person 1  2     2
    person 1  3     2
    person 1  4     3
    person 2  1     1
    person 2  2     2
    person 2  5     1
    person 2  7     2
    person 2  8     3

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 pid byte(wave labforcestat)
    "person 1" 1 1
    "person 1" 2 2
    "person 1" 3 2
    "person 1" 4 3
    "person 2" 1 1
    "person 2" 2 2
    "person 2" 5 1
    "person 2" 7 2
    "person 2" 8 3
    end
    
    by pid (wave), sort: gen count_of_2s = sum(labforcestat == 2)
    drop if count_of_2s >= 2
    In the future, please post example data by using the -dataex- command, as I have done here. You can install it by running -ssc install dataex- and then run -help dataex- to read the simple instructions for using it. HTML tables can be difficult to import into Stata if somebody needs to work with your data in order to develop and test an answer to your question. In addition, HTML tables leave important information out: is something that looks like a string actually a string, or is it a value-labeled numeric variable? What is the underlying encoding of a labeled variable? By using -dataex- you enable those who want to help you to create a completely faithful replica of your Stata data example with a simple copy/paste operation.


    • #3
      Thank you very much for your help, and I apologize for not posting in the correct format - I will be sure to use the dataex command in future!


      • #4
        See also http://www.stata.com/support/faqs/da...t-occurrences/


        • #5
          Hello Jay,

          I fear I didn't get it right, for you said

          "I want to delete the current observation and all subsequent observations on an individual from the point that their labour force status is 2 for a second time. However, I want to keep observations on that person, prior to this occurring. For example, in the data set below, I would want to keep all observations on person 1, but delete the observations on person 2 that occurred in waves 7 and 8."

          However, person 1, according to your example, also has labour force status = 2 twice (in waves 2 and 3).

          Best regards,

          Marcos
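
One reading that reconciles the question with its example (all of person 1 kept, waves 7 and 8 of person 2 dropped) is that "a second time" means a second spell of status 2, not a second observation with status 2. A minimal sketch under that assumption follows. The variable name spells_of_2 is hypothetical, newspell is rebuilt here as a 0/1 indicator rather than the 1/missing coding described in the first post, and consecutive observed waves are treated as adjacent even where waves are missing in between.

```stata
* Sketch only: assumes "second time" means the start of a second spell of status 2
clear
input str8 pid byte(wave labforcestat)
"person 1" 1 1
"person 1" 2 2
"person 1" 3 2
"person 1" 4 3
"person 2" 1 1
"person 2" 2 2
"person 2" 5 1
"person 2" 7 2
"person 2" 8 3
end

* Flag the first observation of each run of identical status within a person
* (0/1 coding; this differs from the 1/missing newspell in the question)
by pid (wave), sort: gen byte newspell = (_n == 1) | (labforcestat != labforcestat[_n-1])

* Running count, within person, of spells that begin in status 2
by pid (wave): gen spells_of_2 = sum((newspell == 1) & (labforcestat == 2))

* Drop everything from the start of the second status-2 spell onward
drop if spells_of_2 >= 2

list, sepby(pid)
```

On the example data this keeps all four observations on person 1 and drops waves 7 and 8 for person 2. If a gap in waves should itself end a spell, the newspell condition would also need to compare wave with wave[_n-1] + 1.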
