Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Average across different date intervals

    Dear all,

    I am a new user of Stata that is facing some difficulties! I am asking for help :-)

    I have the following dataset:
    Click image for larger version

Name:	dataset.PNG
Views:	1
Size:	24.4 KB
ID:	1458103
    • Fund ID represents the identification of a particular fund (can be 3, 10, 1512, etc.).
    • TDate represents the date at which the flow of the column NetCFs occured. These dates are different across funds.
      • For each fund, I thus have a different start date (for fund ID 3, it is 12/31/1993, the date of the first flow) and end date (for fund ID 3, it is let's say 05/15/2001, the date of the last flow (not displayed)).
    • Index_return represents the return over a benchmark (e.g. the S&P500) for each "IndexDate" (from 01/01/1980 to today).
    I would like to compute, for each fund (each FundID), the average "Index_return" from its start date (for fund ID 3, it is 12/31/1993) to its end date (05/15/2001).
    In other words, for fund ID 3, I would like the average of "Index_return" over 12/31/1993, 01/01/1994, 01/02/1994 until 05/15/2001.

    I hope I was clear and I would be very pleased if you could help!

    Thank you in advance,

    Best regards,
    Max

  • #2
    Bump

    Comment


    • #3
      Code:
      by FundID, sort: egen average_index_return = mean(Index_return)
      In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      Please, in particular, avoid showing data with screenshots. While yours was readable (many are not), it is impossible to import data from a screenshot into Stata to test out code. Had your problem required a more complicated solution, it would not have been possible to try it out on your example data, which, for many of us, is a deterrent to responding to the question t all.

      When asking for help with code, always show example data. When showing example data, always use -dataex-.

      Comment


      • #4
        Originally posted by Clyde Schechter View Post
        Code:
        by FundID, sort: egen average_index_return = mean(Index_return)
        In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

        Please, in particular, avoid showing data with screenshots. While yours was readable (many are not), it is impossible to import data from a screenshot into Stata to test out code. Had your problem required a more complicated solution, it would not have been possible to try it out on your example data, which, for many of us, is a deterrent to responding to the question t all.

        When asking for help with code, always show example data. When showing example data, always use -dataex-.
        Dear Clyde,

        Thank you for the answer and sorry for the wrong process! Will do it right in the future :-).
        I am afraid your code is not exactly what I want as the average must be computed over a different timeframe for each fund, timeframe that depends on each fund start date (date of their first flow) and end date (last flow).
        Will post a sample set in the right format tomorrow (not access to Stata today).

        Thanks!
        Max

        Comment


        • #5
          Then you perhaps should explain better what you want. The code for averaging over all observations of a given firm does not involve the dates at all. You have said nothing to indicate that the time frame is anything other than all of the observations of the firm. What else could "from its start date to its end date" mean in the absence of any variables designating other dates for start and end? So the code shown in #2 will do what you asked for. Apparently, what you asked for is not what you want.

          Perhaps when reposting with a data example tomorrow, you might also show a fully worked example of what you are looking for so it will be clearer. Explaining things in words can be difficult and frustrating.

          Comment


          • #6
            Originally posted by Clyde Schechter View Post
            Then you perhaps should explain better what you want. The code for averaging over all observations of a given firm does not involve the dates at all. You have said nothing to indicate that the time frame is anything other than all of the observations of the firm. What else could "from its start date to its end date" mean in the absence of any variables designating other dates for start and end? So the code shown in #2 will do what you asked for. Apparently, what you asked for is not what you want.

            Perhaps when reposting with a data example tomorrow, you might also show a fully worked example of what you are looking for so it will be clearer. Explaining things in words can be difficult and frustrating.
            Dear Clyde,

            Please find an exemple from Stata here under:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input byte FundID int(TransactionDate Flow) byte D int Indexdate double Indexreturn
            1 1981 -100 . 1980   .02
            1 1983    0 . 1981  .019
            1 1984  100 . 1982 .0152
            1 1990  150 . 1983 .0122
            2 1982 -110 . 1984 .0097
            2 1983    0 . 1985 .0078
            2 1985  -50 . 1986 .0093
            2 1988  100 . 1987 .0112
            2 1990  150 . 1988 .0134
            2 1992   20 . 1989 .0161
            3 2000 -150 . 1990 .0194
            3 2001    0 . 1991 .0232
            3 2003    0 . 1992 .0186
            3 2005  350 . 1993 .0149
            .    .    . . 1994 .0119
            .    .    . . 1995 .0095
            .    .    . . 1996 .0076
            .    .    . . 1997 .0061
            .    .    . . 1998 .0049
            .    .    . . 1999 .0039
            .    .    . . 2000 .0031
            .    .    . . 2001 .0025
            .    .    . . 2002  .002
            .    .    . . 2003 .0016
            .    .    . . 2004 .0013
            .    .    . . 2005  .001
            end
            I have 3 funds: Fund ID 1, Fund ID 2 and Fund ID 3
            For Fund ID 1, I need to compute average of the S&P return (column Indexreturn) from 1981 to 1990; for Fund ID 2, from 1982 to 1992 and for Fund ID 3, from 2000 to 2005.
            Each fund thus has a different start and end date, so the average computed on the S&P500 returns has to differ for each fund.
            I hope this is more clear!

            Thanks again

            Max

            Comment


            • #7
              No, this is no different from before.
              Code:
              . tabstat TransactionDate, by(FundID) statistics(min max)
              
              Summary for variables: TransactionDate
              by categories of: FundID
              
              FundID | min max
              ---------+--------------------
              1 | 1981 1990
              2 | 1982 1992
              3 | 2000 2005
              ---------+--------------------
              So FundID 1 from 1981 to 1990 is precisely all the observations for FundID 1. Similarly for FundID 2, 1982 to 1992 is precisely all the observations for Fund ID 2. The same applies to FundID 3. So in calculating the average returns within these date ranges, you don't actually have to make any reference to the range of dates: you just average all the observations for the fund. That's what the code in #3 does.

              Comment


              • #8
                Originally posted by Clyde Schechter View Post
                No, this is no different from before.
                Code:
                . tabstat TransactionDate, by(FundID) statistics(min max)
                
                Summary for variables: TransactionDate
                by categories of: FundID
                
                FundID | min max
                ---------+--------------------
                1 | 1981 1990
                2 | 1982 1992
                3 | 2000 2005
                ---------+--------------------
                So FundID 1 from 1981 to 1990 is precisely all the observations for FundID 1. Similarly for FundID 2, 1982 to 1992 is precisely all the observations for Fund ID 2. The same applies to FundID 3. So in calculating the average returns within these date ranges, you don't actually have to make any reference to the range of dates: you just average all the observations for the fund. That's what the code in #3 does.
                I am afraid not being clear enough, sorry for this :-(.
                This won’t give me what I need as the dates of fund flows do no match exactly the dates of S&P returns. This is actually two datasets... How can I tell Stata to run an average that considers the min and max dates for each fund. I think I would need a loop...?

                Comment


                • #9
                  This is actually two datasets
                  If it was created by dataex it is in a single dataset. If your data is in two datasets, then show samples of each dataset using dataex twice.

                  I note that the sample data you provide in post #6 is not the same as the picture in post #1, where the dates were daily dates, in your sample you provide yearly dates. Your sample data is only useful to the extent that it reproduces all the characteristics of your real data.

                  Comment

                  Working...
                  X