Hello everybody,
I would really appreciate any help with some coding. I am trying to create a decay measure of a variable: each value should shrink to 1/n of its original size, where n counts the years since the value was observed, starting at n = 1 in the year of observation (so 1, 1/2, 1/3, ...). For example, suppose this is my dataset:
id | year | variable_to_decay
2 | 2006 | .
2 | 2007 | 1
2 | 2008 | .
2 | 2009 | .
2 | 2010 | 1
2 | 2011 | .
2 | 2012 | .
2 | 2013 | .
2 | 2014 | .
2 | 2015 | 2
2 | 2016 | 1
2 | 2017 | .
2 | 2018 | .
3 | 1997 | .
3 | 1998 | 1
3 | 1999 | .
3 | 2000 | .
3 | 2001 | .
3 | 2002 | .
3 | 2003 | .
3 | 2004 | .
3 | 2005 | .
3 | 2006 | .
3 | 2007 | .
3 | 2008 | .
3 | 2009 | .
3 | 2010 | 2
3 | 2011 | .
Then I want to create a new variable in which each earlier value is decayed to 1/2, 1/3, 1/4, 1/5, ... of its original size as the years pass, and the decayed values are added to the value of the current observation (0 if missing). So I want the decayed_variable values in my example to look like the following (explanations of the calculations in parentheses):
id | year | variable_to_decay | decayed_variable
2 | 2006 | . | 0
2 | 2007 | 1 | 1
2 | 2008 | . | 0.5
2 | 2009 | . | 0.33
2 | 2010 | 1 | 1.25 (1 + 0.25 since the "1" from 2007 is decayed to 0.25)
2 | 2011 | . | 0.7 (1/2 + 1/5 since the 1 from 2010 is now 1/2 and the 1 from 2007 is now 1/5)
2 | 2012 | . | 0.5 (1/3 + 1/6)
2 | 2013 | . | 0.393 (1/4 + 1/7)
2 | 2014 | . | 0.325 (1/5 + 1/8)
2 | 2015 | 2 | 2.278 (2 + 1/6 + 1/9)
2 | 2016 | 1 | 2.243 (1 + 1 + 1/7 + 1/10)
2 | 2017 | . | 1.216 (1/2 + 1/2 + 1/8 + 1/11)
2 | 2018 | . | 0.861 (1/3 + 1/3 + 1/9 + 1/12)
3 | 1997 | . | 0
3 | 1998 | 1 | 1
3 | 1999 | . | 0.5
3 | 2000 | . | 0.33
3 | 2001 | . | 0.25
3 | 2002 | . | 0.2
3 | 2003 | . | 0.167
3 | 2004 | . | 0.143
3 | 2005 | . | 0.125
3 | 2006 | . | 0.111
3 | 2007 | . | 0.1
3 | 2008 | . | 0.091
3 | 2009 | . | 0.083
3 | 2010 | 2 | 2.077 (2 + 1/13)
3 | 2011 | . | 1.071 (1 + 1/14)
Apologies for the long post. I'd be very grateful if anyone knows how to do this and can provide some advice. My initial thought was to create multiple variables and then sum them, but I got stuck because the non-missing, non-zero values occur at irregular points and because some groups (id) have more observations than others.
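To make the intended calculation concrete, here is a rough sketch in Python (not the software I am working in, just a way to pin down the logic). It assumes that a value v observed k years earlier contributes v/(k+1) to the current year, and that missing values count as 0. The names decayed_sums and rows are made up for this illustration. Under that assumption the 2 from 2015 would contribute 2/3 in 2017 rather than the 1/2 written in my table, so the exact rule for values other than 1 may need adjusting.

```python
from collections import defaultdict

def decayed_sums(rows):
    """rows: iterable of (id, year, value); value is None when missing.

    Returns a dict mapping (id, year) to the decayed sum, assuming a value v
    observed k years earlier contributes v / (k + 1).
    """
    by_id = defaultdict(list)
    for gid, year, value in rows:
        by_id[gid].append((year, value))

    result = {}
    for gid, obs in by_id.items():
        obs.sort()    # process each id in year order
        events = []   # (year, value) for non-missing, non-zero values so far
        for year, value in obs:
            if value:  # skips None and 0
                events.append((year, value))
            result[(gid, year)] = sum(v / (year - y + 1) for y, v in events)
    return result

# id 2 from the example above, 2006 through 2016
rows = [
    (2, 2006, None), (2, 2007, 1), (2, 2008, None), (2, 2009, None),
    (2, 2010, 1), (2, 2011, None), (2, 2012, None), (2, 2013, None),
    (2, 2014, None), (2, 2015, 2), (2, 2016, 1),
]
result = decayed_sums(rows)
for (gid, year), total in sorted(result.items()):
    print(gid, year, round(total, 3))
```

Keeping a list of the earlier non-missing values per id avoids creating one helper variable per event, which is where I got stuck, and it does not matter how many observations each id has.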