  • Stata data manipulation: filtering, summing, and removing duplicates from quarterly observation data

    Hello dear Statalist,

    I am quite new to Stata and could really use some help with my thesis.

    I have variables called "cusip" (firm code), "mgrno" (unique numerical investor code), "rdate" (report date), "typecode" (classification code from 1 to 5 based on investor type), "shares" (number of shares held by each investor in a firm), and "shrout2" (total shares for each firm, in thousands).

    However, some investors report their holdings multiple times a year (as can be seen below), so if I sum the shares by typecode for each firm, many investors would be counted more than once because they report quarterly. Not all of them do, however, so how can I handle this? For example, I would like to get the total ownership of each investor type in every firm as of the latest date on which the ownership was reported.

    In the end, I would like to have a dataset of the ownership percentage (shares held by each investor type in each firm / total shares of the firm) for each of the 5 investor types, for each firm, for that year.

    Could someone help me?

    When I tried dropping rdate duplicates based on mgrno and cusip, I lost necessary observations.


    [Attached screenshot: Screenshot 2023-04-13 at 23.27.43.png]

  • #2
    If I would like to get the total ownership for each type of investor for every firm for the latest date the ownership has been reported for example.
    Try:

    Code:
    bysort cusip mgrno typecode (rdate): keep if _n == _N
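
    If the goal is the yearly ownership percentage per investor type described in the question, the same idea could be extended roughly as follows. This is only a sketch under assumptions not confirmed in the thread: that rdate is a numeric Stata daily date, that shares is a raw share count while shrout2 is in thousands, and that shrout2 does not vary within a firm-year. The variables year and own_pct are names introduced here for illustration. Unlike the line above, it keeps the latest report per investor within each year rather than across the whole sample.

    ```stata
    * Sketch only. Assumptions: rdate is a numeric Stata daily date,
    * shares is a raw share count, and shrout2 is in thousands of shares.
    gen year = year(rdate)

    * Keep each investor's latest report per firm and calendar year
    bysort cusip mgrno year (rdate): keep if _n == _N

    * Total shares held by each investor type in each firm-year;
    * the mean of shrout2 equals shrout2 only if it is constant within the cell
    collapse (sum) shares (mean) shrout2, by(cusip year typecode)

    * Ownership percentage of each investor type in each firm-year
    gen own_pct = 100 * shares / (shrout2 * 1000)
    ```

    If shrout2 changes between quarters, it may be better to decide explicitly which quarter's value to use as the denominator before collapsing.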
