  • Generating new variable with mean values sorted by Date and TICKER

    Hi,

    So I have information about how many people are holding each stock (there are about 8200 stocks in the file) and the information was hourly before, but I narrowed it down to just the day.

    Below is the data for just one day for one stock, with its users:

    date TICKER users_holding
    12feb2019 ZYXI 0
    12feb2019 ZYXI 1
    12feb2019 ZYXI 4
    12feb2019 ZYXI 6
    12feb2019 ZYXI 8
    12feb2019 ZYXI 9
    12feb2019 ZYXI 10
    12feb2019 ZYXI 12
    12feb2019 ZYXI 12
    12feb2019 ZYXI 12

    So this is the same for all the other 8200 stocks I have in my data and I want to find the mean for each day and TICKER so that I can remove all duplicates and have it look like the below:


    date TICKER users_holding
    12feb2019 ZYXI 7,4


    I have used this code:
    Code:
    bysort date date: egen holding=mean(users_holding)
    But all it does is calculate the mean for each date, with no regard to the TICKER.

    Can anyone help me create a mean that is computed by date and TICKER, and then show me how to remove the duplicates with respect to date and TICKER? Otherwise I will get:

    date TICKER users_holding
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4
    12feb2019 ZYXI 7,4

    Thank you very much in advance!

    Best regards

    Mathias Sorensen

  • #2
    Code:
    bysort TICKER date: egen holding=mean(users_holding)
    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
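
    The question also asks how to end up with one row per date and TICKER, which the one-line fix above does not cover. A minimal sketch of that step, assuming the variables are named date, TICKER and users_holding as in the example and that no other variables need to be kept:

    ```stata
    * Sketch only: assumes variables date, TICKER, users_holding as in the example.

    * Option 1: compute the group mean, then keep the first row of each group
    bysort TICKER date: egen holding = mean(users_holding)
    bysort TICKER date: keep if _n == 1
    drop users_holding

    * Option 2 (run INSTEAD of Option 1): replace the data with group means
    * collapse (mean) users_holding, by(TICKER date)
    ```

    Option 2 is shorter, but -collapse- discards every variable not named in the command or in by(), so Option 1 may be preferable if other columns must survive.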


    • #3
      Hi Clyde

      Thanks very much for noticing that I made a mistake and wrote date twice, code worked perfectly!

      And thank you for the tip about -dataex-. I have it installed now for future posts, and I apologize for not using it in this post.

      Mathias Sorensen
