
  • Repeat values in panel data

    Hi all,

    I'm working with panel data. There's one variable which contains time-invariant data. The values for different respondents were not obtained in the same wave, though. Some people answered in wave 1, some in wave 7, some in wave 3, etc.

    What I'm trying to do now is the following:
    1. Write the existing value into all other waves for the same respondent. E.g., if there is a value (let's say 2) in wave 3 for respondent r1, copy the value 2 into all other waves for respondent r1 as well.
    If there is a value (7) in wave 5 for respondent r2, copy the value 7 into all other waves for respondent r2, too.

    In addition to that I'm not sure whether I should tackle this issue before or after reshaping into wide format.

    Anyone got an idea how you could solve this problem?

  • #2
    As you do not show example data, I will write code for what I imagine your data to look like. If my attempts at telepathy have failed, as they often do, we will both have wasted our time. In the future, always show example data when asking for help with code, and always use -dataex- to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Code:
    by respondent_id (one_variable), sort: replace one_variable = one_variable[1]
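    Note: This code only works if one_variable is a numeric variable, because numeric missing values sort last and the observed value therefore comes first within each respondent. If it is a string, replace 1 by _N, since empty strings sort first.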
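
    For readers who want to try the line above on something concrete, here is a minimal sketch on made-up data. The variable wave, the string copy one_string, and all values are assumptions for illustration only, not the poster's actual data.

    ```stata
    * Toy panel (hypothetical): each respondent answered in one wave only
    clear
    input str2 respondent_id byte wave float one_variable
    "r1" 1 .
    "r1" 2 .
    "r1" 3 2
    "r2" 4 .
    "r2" 5 7
    "r2" 6 .
    end

    * String copy of the same information, to illustrate the _N variant
    gen str1 one_string = string(one_variable) if !missing(one_variable)

    * Numeric: missing sorts last, so the observed value is observation 1
    by respondent_id (one_variable), sort: replace one_variable = one_variable[1]

    * String: "" sorts first, so the observed value is observation _N
    by respondent_id (one_string), sort: replace one_string = one_string[_N]

    * Restore a readable order and inspect the result
    sort respondent_id wave
    list, sepby(respondent_id)
    ```

    After this runs, every row for r1 should carry 2 and every row for r2 should carry 7, in both the numeric and the string variable.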

    Added note: If a given respondent has responses to one_variable in more than one wave and those responses disagree, then the question you raise is ill-posed. The code shown here will give the numerically smallest value as the result. That may be inappropriate, so you should give some thought to this possibility, or verify that it does not occur in your data.
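
    One way to verify that this does not occur is to compare the lowest and highest observed value per respondent before running the replace. The sketch below assumes a numeric one_variable; the variable wave, the helper names lowest and highest, and the toy values are hypothetical.

    ```stata
    * Toy panel (hypothetical): r2 gave two different answers
    clear
    input str2 respondent_id byte wave float one_variable
    "r1" 1 .
    "r1" 3 2
    "r2" 5 7
    "r2" 6 9
    end

    * Lowest and highest nonmissing value per respondent (egen ignores missings)
    egen double lowest  = min(one_variable), by(respondent_id)
    egen double highest = max(one_variable), by(respondent_id)

    * Show the respondents whose answers disagree across waves
    list respondent_id wave one_variable if lowest != highest, sepby(respondent_id)

    * In a do-file, stop as soon as any disagreement exists;
    * with the toy data above this assert fails on purpose because of r2
    assert lowest == highest
    ```

    If the assert passes in the real data, filling from observation 1 is unambiguous. If it fails, the listing shows which respondents need a decision before any value is copied.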

    In addition to that I'm not sure whether I should tackle this issue before or after reshaping into wide format.
    You definitely should do this before reshaping. That said, why are you reshaping to wide layout? That is usually a mistake in Stata. The vast majority of data management and analysis works best, or only, with long data. Unless you are certain that you will be doing some of the few things that Stata handles best in wide layout, you should leave your panel data long.
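
    If wide layout does turn out to be necessary, one practical reason to fill first is that -reshape wide- expects every variable not named as a stub to be constant within the i() variable, and stops with an error otherwise. A minimal sketch, in which wave, the time-varying variable y, and the values are all hypothetical:

    ```stata
    * Toy panel (hypothetical): y varies by wave, one_variable should not
    clear
    input str2 respondent_id byte wave float(one_variable y)
    "r1" 1 . 10
    "r1" 2 2 11
    "r2" 1 7 20
    "r2" 2 . 21
    end

    * Fill the time-invariant value while the data are still long
    by respondent_id (one_variable), sort: replace one_variable = one_variable[1]

    * one_variable is now constant within respondent_id, so only y is a stub
    reshape wide y, i(respondent_id) j(wave)
    list
    ```

    Skipping the fill step in this sketch would leave one_variable varying within respondent_id, and the reshape would refuse to proceed.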
    Last edited by Clyde Schechter; 18 Jul 2022, 10:40.


    • #3
      Thank you for your answer, Clyde. Works like a charm. Don't give up on telepathy just yet. I will keep your tips on -dataex- in mind for future posts, too.
      Added note: If a given respondent has responses to one_variable in more than one wave and those responses disagree, then the question you raise is ill-posed. The code shown here will give the numerically smallest value as the result. That may be inappropriate, so you should give some thought to this possibility, or verify that it does not occur in your data.
      That is indeed the case. Funnily enough, though, I would have had to recode it so that only the smallest value remained anyway. So the curse turned out to be a blessing in my case.
