  • #16
    Thank you. I don't think that it matters if Adams in 2000 would get coded ahead of Smith in 1980, because it's all about the distinction between the managers and not about the chronological order, at least I think so.

    I also have fund returns for periods in which no manager in my data managed the specific fund.
    Is it possible to drop those observations, given the following?
    The return data are keyed on a 'date' variable with values like 1jan1990.
    Each manager's management period is defined by 4 'date' variables (starting quarter, starting year, ending quarter, ending year).


    • #17
      How is manager coded when manager is not known? Hopefully however you do it is consistent! I would probably just start my code with something like

      drop if missing(manager)

      Or, if instead you have done something like call them "Unknown" do something like

      drop if manager == "Unknown"

      Again, I am assuming each record already has a value for a variable called manager. If that isn't true you will have to figure out how to get it added.

      I suppose the other thing that might screw you up is if Smith manages 1980-1985 and then takes over again 1990-1993. Or, worse yet, if two people with the same name manage in different periods. With my code these would get the same code for newfund, which may not be what you want for Smith's return engagement and definitely isn't what you want if the two Smiths are actually different people.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      Stata Version: 17.0 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://www3.nd.edu/~rwilliam
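
A small sketch of the return-engagement caveat above, using made-up toy rows (the data and the variable name spell are hypothetical, not taken from the poster's files). Adding the start of the management period to -egen group()- gives each spell its own code. It cannot tell whether two spells under the same name belong to one person or to two.

```stata
* Toy data, invented for illustration only.
clear
input fund str8 manager first_year first_qtr last_year last_qtr
1 "Smith" 1980 1 1985 4
1 "Jones" 1986 1 1989 4
1 "Smith" 1990 1 1993 4
end

* One code per fund-manager pair: both Smith spells share a code.
egen newfund = group(fund manager)

* One code per fund-manager-start combination: each spell gets its own code.
egen spell = group(fund manager first_year first_qtr)

list, sepby(fund)
```

Whether the two Smith spells should be pooled or kept apart is a substantive choice; the sketch only shows how each option could be coded.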


      • #18
        Originally posted by LydiaSmit:
        "Thank you. I don't think that it matters if Adams in 2000 would get coded ahead of Smith in 1980, because it's all about the distinction between the managers and not about the chronological order, at least I think so."
        I think that is right, because regardless of the order in which the regressions get run, the alpha values will get added to the right cases.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology

        • #19
          Luckily, I don't have different managers with the exact same name managing the same fund. I also don't have cases like "Smith manages 1980-1985 and then takes over again 1990-1993".

          I used the -joinby- command on 'fundID', the only variable the two datasets have in common, to combine the fund returns (dated by a variable in this format: 31dec1991) with the fund management periods (defined by 4 variables: first quarter, first year, last quarter, last year).

          So every return of 1 fund managed by manager A, manager B, manager C is added to manager A, manager B, manager C. So, even fund returns of dates on which no manager of my dataset managed the specific fund.
          Last edited by LydiaSmit; 23 Jul 2014, 10:33.


          • #20
            "So every return of 1 fund managed by manager A, manager B, manager C is added to manager A, manager B, manager C. So, even fund returns of dates on which no manager of my dataset managed the specific fund."
            I don't understand that paragraph, and it isn't clear to me how managers got matched with the correct dates. But assuming they did, there should be missing values wherever there wasn't a match, and those records could be dropped.
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology

            • #21
              The managers aren't matched yet based on their fund managing periods. Therefore, I need help to do that. The dataset with the returns (which contains a var in this format: 31dec1991) is matched with the dataset which contains the fund managing periods (which contains 4 vars: first quarter, first year, last quarter, last year). The merge/match was done with the -joinby- command based on variable 'fundID' because fundID is the only variable which both datasets have in common. Hopefully it's clear now. I really hope you can help me.


              • #22
                (example of the 4 vars of the 2nd dataset: first_year first_qtr last_year last_qtr -->1997 4 1999 3)

                So maybe it would help to create an extra 'quarter' variable with commands like these (with date held as a string):

                gen byte qtr = .
                replace qtr = 1 if strmatch(date, "*jan*")
                replace qtr = 1 if strmatch(date, "*feb*")
                replace qtr = 1 if strmatch(date, "*mar*")
                replace qtr = 2 if strmatch(date, "*apr*")

                After that, the quarter columns and the year columns could be compared... somehow.


                • #23
                  To match the dates, the first issue is whether your date variable in the returns data set is a true Stata date variable, or if it is a string variable that displays in the way you described. If it is a string variable (which I suspect it is, since you are proposing to use it as the first argument of -strmatch()-), you need to first convert it to a Stata date variable:

                  Code:
                  rename date str_date
                  gen long date = date(str_date, "DMY")
                  format date %td  // This line is optional: it will make the data set easier for humans to read.
                  Once date is a true Stata date variable (or if it already was one), the next step is to extract a quarterly date from it:

                  Code:
                  gen int quarter = qofd(date)
                  format quarter %tq // This line is optional: it will make the data set easier for humans to read.
                  Now go to the data set with the managers and make Stata quarterly dates from the ingredients you have:

                  Code:
                  gen int first_quarter = yq(first_year, first_qtr)  // first_qtr and last_qtr are the quarter variables named in post #22
                  gen int last_quarter = yq(last_year, last_qtr)
                  format first_quarter last_quarter %tq // Again, optional, for human readability
                  Now "merge" the data sets with -joinby- and keep only those observations where quarter lies between first_quarter and last_quarter:

                  Code:
                  use fund_returns_data_set, clear
                  joinby fundID using managers_data_set  // fundID: the common identifier as named in posts #19 and #21
                  keep if inrange(quarter, first_quarter, last_quarter)
                  Note: I haven't tested any of this code, but it should at least be a start. It may require modifications to account for missing values, out-of-range values, etc.
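
A minimal end-to-end sketch of the steps above on toy data. Every value here is invented for illustration, and the names fundID, ret, manager, and the tempfile managers are assumptions, so they would need to be replaced with the names in the real files.

```stata
* Managers data: toy rows, invented for illustration only.
clear
input fundID first_year first_qtr last_year last_qtr str12 manager
103 1997 4 1999 3 "A"
103 1999 4 2000 2 "B"
end
gen int first_quarter = yq(first_year, first_qtr)
gen int last_quarter  = yq(last_year, last_qtr)
format first_quarter last_quarter %tq
tempfile managers
save `managers'

* Returns data: toy rows, with the date held as a string.
clear
input fundID str9 str_date ret
103 "30sep1997" 0.01
103 "31dec1997" 0.02
103 "31dec1999" -0.01
103 "30sep2000" 0.03
end
gen long date = date(str_date, "DMY")
format date %td
gen int quarter = qofd(date)
format quarter %tq

* Pair every return with every manager of the same fund,
* then keep only the pairs where the return falls in the manager's period.
joinby fundID using `managers'
keep if inrange(quarter, first_quarter, last_quarter)
list fundID date quarter manager ret
```

In this toy example the 1997q3 and 2000q3 returns fall outside both management periods, so only the two returns dated inside a period survive the -keep-.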


                  • #24
                    I suspected there was more to it. Hopefully, we can get you through this problem. But, if not, or if there are more problems like this lurking in the wings, I'd strongly encourage you to try to find somebody locally who can help you. You have these really complicated and somewhat unusual database problems, and it is very hard to assist you without having the data and seeing what you have already done. Plus, somebody who knows something about the topic might be able to give you much better substantive advice.

                    OK, let's be clear about what the data look like. It sounds to me like the original records have fundid, year, month, and a bunch of x and y variables. If I understand you correctly, you have added to each of these records a bunch of manager variables. What do they look like? My guess would be that they are something like managera, firstyeara, firstquartera, lastyeara, lastquartera, and then managerb, firstyearb, firstquarterb, lastyearb, lastquarterb, and so on, repeated however many times you need to fit in the largest number of managers, e.g. managerz, firstyearz, firstquarterz, lastyearz, lastquarterz.

                    If that is true, there may be a fighting chance of getting this to work. But if it isn't true, we need to know how the data are really structured.

                    EDIT: I didn't see Clyde's post before I posted -- Hopefully he has already solved all the problems!
                    Last edited by Richard Williams; 23 Jul 2014, 12:54.
                    -------------------------------------------
                    Richard Williams, Notre Dame Dept of Sociology

                    • #25
                      Thank you for the great explanation. I'll try to run all of that code within 10 minutes. (Date was already a Stata date variable of type long; however, I was thinking about temporarily converting it to a string in order to use strmatch().)

                      I ran all of the code on the separate datasets, and it looks great!!! Thank you, Clyde! Running all the commands in my do-file took longer than expected. Richard, I'm now trying to get all the alphas and I'll let you know the result as soon as I have it.
                      Last edited by LydiaSmit; 23 Jul 2014, 13:31.


                      • #26
                        I have my fingers crossed for Clyde's solution. I was thinking that, if a fund had had 50 managers, then 50 sets of variables got tacked on the record for that firm. Instead it sounds like 50 duplicates of each record for the firm get made (each with a different manager), which is much easier than what I had in mind. The trick is then to only keep the records where manager and fund have been matched correctly, and Clyde's code seems to do that. If that isn't the way the world operates, it is the way the world should be. Hope it works.
                        -------------------------------------------
                        Richard Williams, Notre Dame Dept of Sociology

                        • #27
                          Thank you Richard, I tested the alphas of a few funds and they were correct. However, there's a new problem: one manager has observations for only 1 date, and that results in the following error: insufficient observations

                          I don't think that it's solvable, but maybe you know a trick to still get the alpha of that manager.
                          If I don't drop that observation, the foreach loop stops, so the regressions for the managers (funds) that come after it never run.

                          Another problem: Richard, you were right after all.

                          Managername   fundno  first_year  first_qtr  last_year  last_qtr
                          PEDRO VERDU   103     1997        4          1999       3
                          PETER VERDU   103     1999        4          1999       4

                          That results in two different alphas even though the 2 manager names belong to 1 manager.
                          Last edited by LydiaSmit; 23 Jul 2014, 15:52.


                          • #28
                            That manager was only there a month? Hopefully he didn't drive the fund into bankruptcy!

                            You have to have at least 4 cases if you want to estimate a regression with three independent variables, and you may want more than that. Here is a possible tweak to my code:

                            Code:
                            gen alpha = .
                            drop if missing(manager)
                            egen newfund = group(fund manager)
                            bysort newfund: gen nrecs = _N
                            drop if nrecs <= 11
                            levelsof newfund, local(fundnum)
                            foreach id of local fundnum {
                                quietly regress ret MRP SMB HML if newfund == `id'
                                quietly replace alpha = _b[_cons] if e(sample)
                            }
                            On my drop if nrecs line, I basically said the manager had to be there at least a year (12 records, assuming monthly returns) in order to be included in the analysis, but you can make that as low as 4 or make it higher if you want. I don't know the area, but if somebody is only managing a fund for a few months I don't know how useful the alpha number is. That is, you are more likely to get lucky or unlucky over a very short period of time, but over longer periods luck should even out.
                            -------------------------------------------
                            Richard Williams, Notre Dame Dept of Sociology
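
As a possible complement to the record-count filter above, the loop can be made to survive a failed regression by wrapping -regress- in -capture- and checking _rc. This is a sketch on invented toy data (group 1 has 12 records, group 2 has only 1), not the poster's data, and it reuses the variable names from the code above.

```stata
* Toy data, invented for illustration only.
clear
set seed 12345
set obs 13
gen newfund = cond(_n <= 12, 1, 2)
gen MRP = rnormal()
gen SMB = rnormal()
gen HML = rnormal()
gen ret = 0.5 + MRP + 0.3*SMB - 0.2*HML + rnormal()

gen alpha = .
levelsof newfund, local(fundnum)
foreach id of local fundnum {
    // capture traps any error from regress so the loop can continue
    capture regress ret MRP SMB HML if newfund == `id'
    if _rc == 0 {
        quietly replace alpha = _b[_cons] if e(sample)
    }
    else {
        display as text "newfund `id': regress failed with return code " _rc
    }
}
```

Groups whose regression fails simply keep a missing alpha. A minimum-records rule is probably still worth keeping, since a regression that runs on a handful of observations does not produce a meaningful alpha.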

                            • #29
                              Thank you again Richard.

                              Do you also know a solution for the other problem (which I added in my edit)? I'm afraid it is a lot more complicated.

                              Richard, you were right after all.

                              Managername   fundno  first_year  first_qtr  last_year  last_qtr
                              PEDRO VERDU   103     1997        4          1999       3
                              PETER VERDU   103     1999        4          1999       4

                              That results in two different alphas even though the 2 manager names belong to 1 manager.


                              • #30
                                You are sure Pedro and Peter are the same guy? I think Peter's record would drop out because he is only there a month, which causes Pedro to get shorted one month.

                                If this is the extent of the problem, I would be tempted to manually drop Peter and change Pedro's ending quarter. Be sure to keep good records if you do this; otherwise, 10 years from now somebody will say they got different results and you won't be able to explain why.
                                -------------------------------------------
                                Richard Williams, Notre Dame Dept of Sociology
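
One way to keep such a record is to make the manual correction in the do-file itself, applied to the managers data set before the -joinby- step. The sketch below types in the two rows shown in the thread; the lowercase variable name managername is an assumption, and the fix is only appropriate if the two names really are one person.

```stata
* The two rows from the thread, entered by hand for illustration.
clear
input str12 managername fundno first_year first_qtr last_year last_qtr
"PEDRO VERDU" 103 1997 4 1999 3
"PETER VERDU" 103 1999 4 1999 4
end

* Manual correction, assuming both names refer to the same manager:
* drop the PETER row and extend PEDRO's period to cover 1999q4.
drop if managername == "PETER VERDU" & fundno == 103
replace last_year = 1999 if managername == "PEDRO VERDU" & fundno == 103
replace last_qtr  = 4    if managername == "PEDRO VERDU" & fundno == 103
list
```

Because the change lives in the do-file with a comment, rerunning the file reproduces it and explains it.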