Adding rows with criteria

Luca Gagliardone

Join Date: Oct 2016
Posts: 54

Adding rows with criteria

13 Jan 2017, 01:48

Dear all,

I am working on a panel dataset, in which the years (time dimension) available are 2001 2003 2005 2007 2009 and so on, every time skipping one year.
To make an example of the dataset:

ID	Year	VarA	VarB
1	2001	10	5
1	2003	20	15
1	2005	30	25
2	2001	15	10
2	2003	25	20
2	2005	35	30
3	2001	60	65
3	2003	70	75
3	2005	80	85

I would like to insert for each ID a row for the missing years (2002 and 2004 in the example above) and for some selected variables (VarA and VarB in example) fill the gaps with the mean of the previous value with the following value. To give an example of the final result:

ID	Year	VarA	VarB
1	2001	10	5
1	2002	15	10
1	2003	20	15
1	2004	25	20
1	2005	30	25
2	2001	15	10
2	2002	20	15
2	2003	25	20
2	2004	30	25
2	2005	35	30
3	2001	60	65
3	2002	65	70
3	2003	70	75
3	2004	75	80
3	2005	80	85

Do you know how to do this?
Thanks in advance for the help!

Luca.

Tags: add observations, add rows, loop, panel

Nick Cox

Join Date: Mar 2014
Posts: 35734

13 Jan 2017, 02:16

It can be done, but it's usually a bad idea. Just declare your data as

Code:

tsset ID Year, delta(2)

Otherwise anything you do has model figures of merit and degrees of freedom that are spurious.

I'll explain how to do it nevertheless, because sometimes the motive is wanting to merge with other data that are yearly

Note that while your data example is manageable, using dataex (SSC) is the recommended way to do it. This is documented (FAQ Advice #12).

Code:

clear 
input ID    Year    VarA    VarB
1    2001    10    5
1    2003    20    15
1    2005    30    25
2    2001    15    10
2    2003    25    20
2    2005    35    30
3    2001    60    65
3    2003    70    75
3    2005    80    85
end 
local Np1 = _N + 1
expand 2 
bysort ID (Year) : replace Year = Year + 1 if Year == Year[_n-1] 

foreach v in VarA VarB { 
    replace `v' = . if mod(Year, 2) == 0 
    bysort ID: ipolate `v' Year, gen(`v'2) 
} 

list, sepby(ID) 

     +-----------------------------------------+
     | ID   Year   VarA   VarB   VarA2   VarB2 |
     |-----------------------------------------|
  1. |  1   2001     10      5      10       5 |
  2. |  1   2002      .      .      15      10 |
  3. |  1   2003     20     15      20      15 |
  4. |  1   2004      .      .      25      20 |
  5. |  1   2005     30     25      30      25 |
  6. |  1   2006      .      .       .       . |
     |-----------------------------------------|
  7. |  2   2001     15     10      15      10 |
  8. |  2   2002      .      .      20      15 |
  9. |  2   2003     25     20      25      20 |
 10. |  2   2004      .      .      30      25 |
 11. |  2   2005     35     30      35      30 |
 12. |  2   2006      .      .       .       . |
     |-----------------------------------------|
 13. |  3   2001     60     65      60      65 |
 14. |  3   2002      .      .      65      70 |
 15. |  3   2003     70     75      70      75 |
 16. |  3   2004      .      .      75      80 |
 17. |  3   2005     80     85      80      85 |
 18. |  3   2006      .      .       .       . |
     +-----------------------------------------+

Comment

Luca Gagliardone

Join Date: Oct 2016

Posts: 54
#3

13 Jan 2017, 02:54

Originally posted by Nick Cox View Post

It can be done, but it's usually a bad idea.

It is not for estimation, it should work fine!

Thanks, the code is perfect!

Luca.
Comment

Announcement

Adding rows with criteria

Comment

Comment