Dear all,
I have a dataset that looks like this:
ticker cusip oftic cname dilfac pdi sdates yr
A1Z 04334810 ARZMF ARVIND MILLS 1 P 19dec2002 2002
A1Z 04334810 ARZMF ARVIND MILLS 1 D 18aug2005 2005
A1Z 04334810 ARZMF ARVIND LTD 1 D 15may2008 2008
A1Z 04334810 ARZMY ARVIND LTD 1 D 18jun2009 2009
AA 02224910 AA ALCOA 1 P 17mar1988 1988
AA 02224910 AA ALCOA 1 D 19feb1998 1998
AA 01381710 AA ALCOA INC. 1.006 D 14jan1999 1999
AA 01381710 AA ALCOA INC. 1.018 D 20jan2000 2000
AA 01381710 AA ALCOA INC. 1.006 D 18jan2001 2001
AA 01381710 AA ALCOA INC. 1 D 17jan2002 2002
AA 01381710 AA ALCOA INC 1 D 14mar2013 2013
ticker is the primary firm identifier. sdates (start date) is the date from which the firm is followed in the database. pdi shows whether the firm is followed on a primary (P) or diluted (D) basis.
To illustrate:
The start dates for firm A1Z show that it was followed on a P basis from 2002, on a D basis from 2005, and again on a D basis in the 2008 and 2009 entries.
The years in between are assumed to follow the most recent entry. In my example, firm A1Z had a P basis in 2002 and a D basis in 2005, so the years in between (2003 and 2004) are on a P basis, but they are not entered.
The problem is that these in-between years are not in the dataset, even though we know their basis from this assumption. The problem is also not limited to years: some firms change basis within a single year (from one month to another). The same assumption applies: every month that is not in the data follows the last month-year that is in the data, until a new month-year is reported, after which the months follow the new entry, and so on.
My question:
How can I add the missing month-years based on this assumption?
I think the solution is to create observations for the missing month-years and then apply a rule that satisfies my assumption, i.e. carry the most recent basis forward.
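To make the idea concrete, here is a rough sketch of what I have in mind, written in Python only for illustration (it is not the software I am working in). The function names parse_sdate and fill_months are made up for this post, and I keep only ticker, pdi and sdates. It assumes dates always look like 19dec2002, that the fill should stop at each firm's last reported month, and that if a firm has two entries in the same month the later one wins.

```python
from collections import defaultdict

# Month abbreviations as they appear in sdates, mapped to 1..12.
MONTHS = {m: i for i, m in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

def parse_sdate(text):
    """Turn a string such as '19dec2002' into (year, month, day)."""
    return int(text[-4:]), MONTHS[text[-7:-4].lower()], int(text[:-7])

def fill_months(rows):
    """rows: iterable of (ticker, pdi, sdates) tuples.

    Returns a list of (ticker, year, month, pdi) covering every month
    from each firm's first entry to its last entry, carrying the most
    recently reported basis forward into the missing months.
    """
    by_firm = defaultdict(list)
    for ticker, pdi, sdate in rows:
        by_firm[ticker].append((parse_sdate(sdate), pdi))

    out = []
    for ticker in sorted(by_firm):
        entries = sorted(by_firm[ticker])  # chronological within firm
        for i, ((year, mon, _day), pdi) in enumerate(entries):
            start = year * 12 + (mon - 1)  # months counted from year 0
            if i + 1 < len(entries):
                (next_year, next_mon, _), _ = entries[i + 1]
                stop = next_year * 12 + (next_mon - 1)
            else:
                stop = start + 1  # last entry: keep its own month only
            for m in range(start, stop):
                out.append((ticker, m // 12, m % 12 + 1, pdi))
    return out

rows = [
    ("A1Z", "P", "19dec2002"),
    ("A1Z", "D", "18aug2005"),
    ("A1Z", "D", "15may2008"),
    ("A1Z", "D", "18jun2009"),
]
filled = fill_months(rows)
```

With the A1Z rows above, every month from December 2002 through July 2005 should come out as P, and August 2005 onwards as D. Whether the fill should instead run past the last entry (for example up to a fixed end date) is something I am not sure about.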
Note that my full data include such within-year (monthly) changes, although the example above does not.
I attached a large set of my data to this post.
I hope someone can help.
Thanks a lot