  • Panel data: attrition and unbalanced data

    Dear statalisters,

    I am working on data from a labor market panel survey with three rounds: 1998, 2006 and 2012.
    I am trying to measure the change in commute time from 2006 to 2012.
    So I dropped the year 1998. As for attrition, I kept the individuals who were interviewed in all three rounds or in two rounds. Because of the attrition from 1998 to 2012, the data became unbalanced.
    When I worked only with 2006 and 2012, dropping the individuals from 1998 who continued in 2006 and 2012, I got a strongly balanced panel.

    My first question is:
    Is it okay to drop individuals from 1998 and be limited only to those interviewed in 2006 and continued to 2012?

    Second question:
    I will do the empirical analysis only on wage workers. Do I need to drop the non-wage workers before xtset? When I drop them before xtset for the sample of 2006 and 2012 only, I also get an unbalanced panel.
    Would it be okay to add if wageworker==1 to the xt commands instead of dropping them before xtset?


    Best,
    Maye
    Last edited by Maye Ehab; 16 Mar 2017, 02:00.

  • #2
    Maye:
    welcome to the list.
    In general, dropping data to obtain a balanced panel is not advisable for two main reasons, at least:
    - by dropping data, you end up with a subsample that may be quite different from the original sample;
    - Stata can handle both balanced and unbalanced panels with no problems.
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Thank you Carlo for your reply.

      Originally posted by Carlo Lazzaro
      Maye:
      welcome to the list.
      In general, dropping data to obtain a balanced panel is not advisable for two main reasons, at least:
      - by dropping data, you end up with a subsample that may be quite different from the original sample;
      - Stata can handle both balanced and unbalanced panels with no problems.
      If I understand correctly, you advise me not to drop the 1998 individuals and to select wage workers from the beginning, before xtset.
      Afterwards, I would run my models on the unbalanced data, on all the individuals who continue in the three rounds.
      So, I will have a sample of:
      - individuals interviewed in 2006 and 2012 but not in 1998;
      - individuals interviewed in 1998, 2006 and 2012.
      I am still excluding those interviewed in only one of the three years, e.g. only in 2006...

      I will be running an xttobit...
      Is there a special command that deals with unbalanced data?

      Thank you again.

      Maye

      • #4
        Maye:
        there's such a special command, because Stata handles unbalanced panels as well.
        Dropping observations might be painless provided that they are uninformative.
        Kind regards,
        Carlo
        (Stata 19.0)
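
        One rough way to gauge whether the dropped observations are informative is to compare the 2006 characteristics of individuals who do and do not reappear in 2012. The following is only a minimal sketch: the variable names id, year and commute are hypothetical, and it is a descriptive check rather than a formal attrition test.

        ```stata
        * Hypothetical variable names: id (person identifier), year, commute
        * Flag individuals who are observed in the 2012 round
        bysort id: egen byte in2012 = max(year == 2012)

        * Compare 2006 commute time between those who stay and those who leave
        ttest commute if year == 2006, by(in2012)
        ```

        A clear difference between the two groups would suggest that restricting the sample to continuers changes its composition.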

        • #5
          Just clarifying Carlo's response: there should be a "not" between "there's" and "such", i.e., Stata's panel commands handle both balanced and unbalanced data with no special syntax needed.
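
          As a minimal sketch of what that looks like in practice, assuming hypothetical variable names (id, year, commute, age, female, wageworker) and assuming commute time is censored from below at zero, the sample can be restricted with an if qualifier instead of dropping observations before xtset:

          ```stata
          * Hypothetical variable names; adjust to the actual dataset
          * Declare the panel structure on the full data (unbalanced is fine)
          xtset id year

          * Inspect the participation patterns across rounds
          xtdescribe

          * Restrict the estimation sample with if rather than drop
          * ll(0) assumes left-censoring at zero, which may not suit the data
          xttobit commute age i.female if wageworker == 1 & inlist(year, 2006, 2012), ll(0)
          ```

          The if qualifier limits only the estimation sample, so the remaining observations stay in memory for later checks.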

          • #6
            Rich is correct.
            Thanks for amending it.
            Kind regards,
            Carlo
            (Stata 19.0)

            • #7
              Many thanks...
