  • Generate longitudinal weights using subsequent observations

    Hi everyone,

    I hope you're all well. I need to generate weights for longitudinal analysis. For that, I need to take the cross-sectional weight of a given individual in the first year I observe them in my observation window and multiply it by the inverse probability of that individual staying in the sample in every subsequent year, until the last year this individual is observed. To illustrate, see below (inspired by my real data but changed for anonymity purposes).

    [Screenshot of example data: Screenshot 2023-01-19 at 12.04.47.png]

    Take the example of the person with id==2. The variable I want to create is lw. For that, I need to multiply lworiginal (of value 12268.77 in this example) by every single value of the variable "hh_stayprob" from 2002 to 2018 (in this specific case, please ignore the discrepancy between 2009 and 2017; let us just imagine that it is all continuous from 2001 to the end of the observation window and that 2017 is 2010 and 2018 is 2011). In that specific case of course an option that works is:

    gen lw=.
    by pid: replace lw=lworiginal*hh_stayprob[_n+1]*hh_stayprob[_n+2]*hh_stayprob[_n+3]*hh_stayprob[_n+4]*hh_stayprob[_n+5]*hh_stayprob[_n+6]*hh_stayprob[_n+7]*hh_stayprob[_n+8]*hh_stayprob[_n+9]*hh_stayprob[_n+10]


    The issue is that this works for this specific case, but individuals in my dataset have different lengths of stay. Some stay only two or three years in the sample; others stay longer, up to 30+ years. The command above therefore applies only to an individual observed for 11 years, as in this example, and does not work in other cases. What I am after, and have been struggling with, is a generic command that works regardless of how long an individual is observed, so that I can have:

    by pid: gen lw=lworiginal*hh_stayprob[_n+1]*....*hh_stayprob[_N]

    Would anyone please be able to help? I thought I would be able to resolve that easily. I would be super grateful for any advice!

    Thank you in advance,

    Guillaume Paugam

  • #2
    I'm confused by your presentation here. Looking at id 2, it seems that the value of lworiginal is specified in only the first observation. It is unclear to me whether your intent is to calculate a time-varying lw, with a different value in each observation for id 2 (a running product calculated over a window with a moving beginning), or a single result that applies to all of the observations of id 2 and is simply the result of an iterated product over all of id 2's observations. The approaches to these differ.
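
    To make the distinction concrete, here is a minimal sketch of both readings. It is not a definitive answer to the question: the toy data are invented for illustration, the variable names cumprod, lw_total, negyear and fwdprod are hypothetical, and it assumes that lworiginal is filled in only in each pid's first year and that hh_stayprob is multiplied in directly, as in the code in #1.

```stata
* Toy data, invented for illustration (not the poster's data)
clear
input pid year lworiginal hh_stayprob
1 2001 100 .
1 2002 .   2
1 2003 .   1.5
2 2001 50  .
2 2002 .   1.25
end

* Reading 1: a single value per pid, equal to the first-year weight
* multiplied by hh_stayprob in every later year (running product, then last value)
bysort pid (year): gen double cumprod = lworiginal if _n == 1
by pid: replace cumprod = cumprod[_n-1] * hh_stayprob if _n > 1
by pid: gen double lw_total = cumprod[_N]

* Reading 2: a time-varying value, the product of hh_stayprob over the
* strictly later years of each observation (window with a moving beginning)
gen negyear = -year
bysort pid (negyear): gen double fwdprod = 1 if _n == 1
by pid: replace fwdprod = fwdprod[_n-1] * hh_stayprob[_n-1] if _n > 1
sort pid year
drop negyear
```

    In the toy data, lw_total is 300 for pid 1 and 62.5 for pid 2 in every observation, while fwdprod is 3, 1.5 and 1 for pid 1 in 2001, 2002 and 2003. Both versions rely on -replace- working through the observations in order, so each row can use the value just computed for the previous row, and neither depends on how many years a pid is observed.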

    Even if your question were clear, however, the example data shown is not usable because it is a screenshot. Please read the Forum FAQ for excellent advice on the most effective ways to provide information about data, code, and results on this Forum. Screenshots are especially discouraged because 1) they are not reliably readable on everybody's setup (though this one happens to be readable on mine) and 2) there is no way to import data from a screenshot into Stata to develop and test code with it. The helpful way to show example data is with the -dataex- command. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
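
    As a rough illustration of what that looks like, the sketch below builds a small invented dataset and runs -dataex- on it. The values are made up; only the variable names are taken from #1. With real data, the -input- step is unnecessary and -dataex- is run directly on the data in memory.

```stata
* Toy data, invented for illustration (stand-in for the real dataset)
clear
input pid year lworiginal hh_stayprob
2 2001 100 .
2 2002 .   2
2 2003 .   1.5
end

* Print a copy-and-paste block that recreates these variables exactly
dataex pid year lworiginal hh_stayprob
```

    The block that -dataex- prints in the Results window can be pasted straight into a post, and anyone who runs it gets the same variables with the same storage types.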

    If you post back with a clarification of what is wanted, and a data example using -dataex-, I am confident you will get a timely and helpful response.
    Last edited by Clyde Schechter; 19 Jan 2023, 11:00.
