  • Panel Data with missing waves

    Dear All,
    I'm analyzing panel data using xtologit. The maximum number of waves is 3, but some respondents have only one wave.
    The question is: what is the best choice for analysis?
    1- Drop respondents with only one wave.
    2- Use tsfill to impute the missing waves.
    Thank you very much

  • #2
    Using -tsfill- will not resolve the problem, because -tsfill- will just put missing values into all of the variables (except the time variable) in the additional observations it creates.
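
    To make that concrete, here is a minimal sketch using made-up toy data (the variable names id, wave, and y are hypothetical and not taken from the original poster's dataset). Note that the -full- option is needed here: without it, -tsfill- only fills gaps between each panel's own first and last time points, so a respondent seen in a single wave would get no new observations at all.

    ```stata
    * Toy data: respondent 1 is observed in all three waves,
    * respondent 2 only in wave 1 (hypothetical values)
    clear
    input id wave y
    1 1 2
    1 2 3
    1 3 3
    2 1 1
    end

    xtset id wave
    tsfill, full            // adds rows for id 2 in waves 2 and 3
    list, sepby(id)         // y is missing (.) in the two added rows
    count if missing(y)     // counts the rows that -tsfill- created
    ```

    The added rows carry only the panel and time identifiers, so nothing has been imputed; an estimation command would simply drop them again.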

    The problem of missing data is a vexatious one. There are no good solutions. There are bad ones and worse ones, and one tries to find the least bad solution that is feasible in your own circumstances. Perhaps the most important thing is to try to understand how the missing data came to be missing. In the very lucky circumstance where the missingness is totally exogenous, you can just ignore it altogether and work with the data as it is. If, however, the missingness arises in ways that make it plausibly related to the actual unobserved values of the missing responses, then such an analysis, or an analysis restricted to complete cases only (your option 1), leads to biased results. In some circumstances with longitudinal data, there may be accepted patterns of evolution of the variables over time that make it possible to fill them in, such as interpolation. But more commonly, variables are not so well behaved as that. If the missing data can be reasonably considered to be independent of the actual unobserved values of the missing responses when you condition on other observed variables in the model, then multiple imputation can reduce that bias. (Unfortunately, however, Stata's multiple imputation command does not work with -xtologit-, so you might have to modify your model in some way to make this all work.)

    I recommend you read https://statisticalhorizons.com/wp-c...aterials-1.pdf to get an overview of the logic of handling missing data and some examples of how different approaches can be applied in Stata.


    • #3
      I note in addition to Clyde Schechter that interpolation from one wave to two or three can only result in repeating the one value you have for each person.

      Imputation is a different matter, but if 2/3 of the data for some people are imputed, are you much better off?

      Unfortunately, commentary here can only underline what is already clear, that your dataset sounds problematic.


      • #4
        Thank you for your reply,
        I can't apply interpolation or multiple imputation because my variables are categorical. Most of the missing values are due to missing waves for many respondents. Some respondents were interviewed only once, and this is not attrition.
        That's why I'm thinking about dropping their records, but I don't know whether that would be acceptable.
        I would appreciate any advice about my case.
        Thank you


        • #5
          Sarah:
          I'm not clear about your problem; it looks more like an unbalanced panel situation.
          That said, unless you have sound methodological reasons to delete the observations from respondents with only one wave, I'd leave things as they are, to avoid tampering with your dataset.
          Kind regards,
          Carlo
          (Stata 19.0)
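
          One way to see how unbalanced a panel is before deciding anything is sketched below. This is only an illustration on made-up toy data; the names id, wave, y, nwaves, and tag are hypothetical and would need to be replaced with the variables in the real dataset.

          ```stata
          * Toy unbalanced panel (hypothetical values):
          * id 1 has three waves, id 2 has one, id 3 has two
          clear
          input id wave y
          1 1 2
          1 2 3
          1 3 3
          2 1 1
          3 1 2
          3 3 1
          end

          xtset id wave
          xtdescribe                        // participation patterns across waves
          bysort id: generate nwaves = _N   // number of waves observed per respondent
          egen byte tag = tag(id)           // flags one row per respondent
          tabulate nwaves if tag            // respondents with 1, 2, or 3 waves
          ```

          A variable like nwaves also makes it easy to refit the model with an -if nwaves > 1- qualifier and compare the estimates with those from the full sample, should dropping one-wave respondents still be under consideration.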


          • #6
            Thank you Carlo
