  • Error 134: Too many values

    Got a bit stuck with this error. I need to construct 25 portfolios rebalanced every year with all available stocks in a given year (their number is time-varying). My dataset looks like:
    Code:
    Contains data
      obs:       656,760                         
     vars:            11                         
     size:    27,583,920                         
    ----------------------------------------------------------------------------------------------------------------------------------------------------
                  storage   display    value
    variable name   type    format     label      variable label
    ----------------------------------------------------------------------------------------------------------------------------------------------------
    datem           float   %tm                  
    id              int     %9.0g                
    datey           float   %ty                  
    totvol          float   %9.0g                
    monthlyvol      float   %9.0g                
    milliq          float   %9.0g                
    montRet         float   %9.0g                
    mktret          float   %9.0g                
    avilliq         float   %9.0g                
    lnavilliq       float   %9.0g                
    L_datey         float   %9.0g                
    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Sorted by: datem  id
    And the error came out after the xtile() command:

    Code:
    egen portfolio = xtile(milliq), nq(25) by(id L_datey)
    Any idea how to solve the problem?

    Thanks

    S

  • #2
    You are not using the xtile command there, but the xtile() egen function from egenmore (SSC).

    Something is complaining inside that, possibly the now undocumented levels command. I think you need to show the author (Ulrich Kohler) a trace using

    Code:
    set trace on
    to see where it's failing.
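
    A minimal sketch of capturing such a trace in a log file, using the failing command from #1. The log name trace_xtile and the trace depth of 3 are only assumptions for illustration:

    ```stata
    * Write the trace to a plain-text log so that it can be passed on
    log using trace_xtile, text replace    // hypothetical log file name
    set tracedepth 3                       // assumed depth; raise it if the failure lies deeper
    set trace on
    * capture noisily shows the output but lets the do-file carry on after the error
    capture noisily egen portfolio = xtile(milliq), nq(25) by(id L_datey)
    set trace off
    log close
    ```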



    • #3
      Thanks Nick, I'll try to get in contact with him, as debugging the code is far beyond my capabilities. The strange thing is that the code works fine for smaller samples. Just so I know, could you suggest an alternative way to achieve the same result?



      • #4
        fastxtile (SSC) is one possibility. If you have no ties within id L_datey milliq and no missings, then

        Code:
        bysort id L_datey (milliq) : gen bin = ceil(25 * _n/_N)
        would be a direct solution.
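
        To see what that formula does, here is a small sketch on made-up data: one id, one year, 10 distinct values and 5 bins instead of 25. The toy values are assumptions for illustration only:

        ```stata
        * Toy data: a single id-year group with 10 distinct, non-missing values
        clear
        set obs 10
        gen id = 1
        gen L_datey = 2000
        gen milliq = _n / 100
        * Within each group, sort on milliq and map the rank _n to bins 1..5
        bysort id L_datey (milliq) : gen bin = ceil(5 * _n/_N)
        list milliq bin
        ```

        Each observation's bin depends only on its rank within its own id and year, which is why ties and missing values would need separate handling.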



        • #5
          Thank you very much Nick. Ulrich Kohler solved my problem which was related to levels, as you also suggested.
          fastxtile has the same problem as xtile, that is, it cannot be combined with by. Lastly, while the code
          Code:
          bysort id L_datey (milliq) : gen bin = ceil(25 * _n/_N)
          seems to work well, I found that it does not rebalance on an annual basis (L_datey); rather, stocks change portfolio from one observation to the next.

          That said, thanks again

          Stefano



          • #6
            Sorry, I have no idea what "rebalancing" means, but the code in #4 still seems an "in principle" solution given ideal data for the purpose (no ties on boundary values and no missings).



            • #7
              Sorry Nick. Just to be more precise, by "rebalancing" I mean that quantiles of milliq by id are determined in a given year to allocate a stock to a given portfolio (quantile). The following year the same stock might be in a different quantile according to its average level of milliq.



              • #8
                OK, except that there is nothing in any code discussed so far in this thread that does any averaging.

                Otherwise: my code does that.



                • #9
                  Fair enough, I meant distribution. However, either your code doesn't do what I expect, or there is another problem I'm unable to figure out. For instance, here is an extract of the dataset for only one id:

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input float(datem datey) int id float milliq
                  459 1998 1  .004929517
                  460 1998 1  .003658046
                  461 1998 1  .010683908
                  462 1998 1  .002969889
                  463 1998 1  .003535804
                  464 1998 1 .0040777973
                  465 1998 1  .004053092
                  466 1998 1  .009262852
                  467 1998 1 .0039822254
                  468 1999 1  .002738284
                  469 1999 1  .002819034
                  470 1999 1 .0036170064
                  471 1999 1  .002637806
                  472 1999 1   .00345821
                  473 1999 1 .0032503295
                  474 1999 1 .0033479454
                  475 1999 1 .0032709274
                  476 1999 1   .00340267
                  477 1999 1  .004828549
                  478 1999 1     .010728
                  479 1999 1 .0091636805
                  480 2000 1  .007132516
                  481 2000 1  .009817912
                  482 2000 1  .005135804
                  483 2000 1  .003381297
                  484 2000 1  .005677982
                  485 2000 1  .006114594
                  486 2000 1  .004812886
                  487 2000 1 .0037518756
                  488 2000 1 .0044197883
                  489 2000 1  .004496173
                  end
                  format %tm datem
                  format %ty datey
                  After running
                  Code:
                  egen portfolio = xtile(milliq), nq(25) by(id datey)
                  Or your code
                  Code:
                  bysort id datey (milliq) : gen portfolio = ceil(25 * _n/_N)
                  I get the following:

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input float(datem datey) int id float(portfolio milliq)
                  459 1998 1 19  .004929517
                  460 1998 1 11  .003658046
                  461 1998 1 23  .010683908
                  462 1998 1  5  .002969889
                  463 1998 1  9  .003535804
                  464 1998 1 17 .0040777973
                  465 1998 1 15  .004053092
                  466 1998 1 21  .009262852
                  467 1998 1 13 .0039822254
                  468 1999 1  3  .002738284
                  469 1999 1  5  .002819034
                  470 1999 1 17 .0036170064
                  471 1999 1  1  .002637806
                  472 1999 1 15   .00345821
                  473 1999 1  7 .0032503295
                  474 1999 1 11 .0033479454
                  475 1999 1  9 .0032709274
                  476 1999 1 13   .00340267
                  477 1999 1 19  .004828549
                  478 1999 1 23     .010728
                  479 1999 1 21 .0091636805
                  480 2000 1 21  .007132516
                  481 2000 1 23  .009817912
                  482 2000 1 15  .005135804
                  483 2000 1  1  .003381297
                  484 2000 1 17  .005677982
                  485 2000 1 19  .006114594
                  486 2000 1 13  .004812886
                  487 2000 1  3 .0037518756
                  488 2000 1  7 .0044197883
                  489 2000 1  9  .004496173
                  end
                  format %tm datem
                  format %ty datey
                  This is not what I am aiming for. In fact, portfolio must be constant for a given id within a given year (datey == y). For example, in 1998 the id changes portfolio every month, whereas it should stay in the same portfolio all year and then possibly change in 1999. I'm really stuck on this issue, which I'm sure is easy to solve.

                  Thanks



                  • #10
                    So, your problem appears different from what you've previously described. You need to summarize replicates within each group. I don't see a recipe for that.



                    • #11
                      I don't quite get what you mean. I found this procedure in other posts on calculating portfolios with a long dataset, and it also seems meaningful to me. The xtile() function from egenmore, like the built-in xtile command, should provide the quantile distribution for my random variable as I expect. However, the double by-group condition doesn't give me what I want. Do you mean a loop would be preferable?



                      • #12
                        You have multiple observations for each identifier and year. They won't get put in the same portfolio unless they are mapped to the same summary. This is on all fours with expecting (1 2 3 4 5) (6 7 8 9 10) to get mapped to two distinct portfolios: that will happen if you work on the minima 1 6 or the maxima 5 10 or the means 3 8, and so on.
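
                        A small sketch of that point, using those ten numbers as toy data. The variable names group, xmean, bin_raw and bin_mean are made up for the illustration:

                        ```stata
                        * Toy data: two groups holding (1 2 3 4 5) and (6 7 8 9 10)
                        clear
                        input byte group float x
                        1 1
                        1 2
                        1 3
                        1 4
                        1 5
                        2 6
                        2 7
                        2 8
                        2 9
                        2 10
                        end
                        * One summary per group, shared by every observation in that group
                        egen xmean = mean(x), by(group)
                        * Binning the raw values spreads each group over several bins
                        xtile bin_raw = x, nq(5)
                        * Binning the summary keeps each group together in a single bin
                        xtile bin_mean = xmean, nq(2)
                        tabulate group bin_raw
                        tabulate group bin_mean
                        ```

                        The same logic carries over with id and datey defining the groups and milliq as the variable being summarized.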



                        • #13
                          Thanks for your support, Nick Cox. Following what I understood, I first tried calculating the average over the year for each variable. This is also theoretically consistent (for me) as a basis for creating portfolios:
                          Code:
                          egen avmilliq = mean(milliq), by (id datey)
                          Then I thought of using xtile() looping by year as:

                          Code:
                          local datey
                          foreach y in datey {
                          egen portfolio = xtile(avmilliq) if datey == `y', nq(25) by (id)
                          }
                          Although this now keeps the portfolio constant over the year for each id, the problem is that the portfolios are not sorted in ascending order. In fact, I expected, for example, the average level of milliq for portfolio 1 to be the smallest average every year compared to all other portfolios.

                          I'm very confused



                          • #14
                            That is correct syntax but not helpful syntax. The loop is no loop and reduces to

                            Code:
                            egen portfolio = xtile(avmilliq) if datey == datey, nq(25) by(id)
                            The if qualifier is redundant there. There is no sense in which your syntax will impel Stata to look inside a variable and cycle over its distinct values.
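
                            If a loop over years were really wanted, a minimal sketch would first collect the distinct values with levelsof and then cycle over them. This assumes avmilliq from #13 already exists and that portfolio does not; it bins stocks within each year, and official xtile is used inside the loop because it cannot be combined with by:

                            ```stata
                            * Collect the distinct years in a local macro, then loop over them
                            levelsof datey, local(years)
                            gen portfolio = .
                            foreach y of local years {
                                tempvar q
                                * Quantile bins of avmilliq among that year's observations only
                                * (each stock counts once per monthly observation it has in the year)
                                xtile `q' = avmilliq if datey == `y', nq(25)
                                replace portfolio = `q' if datey == `y'
                                drop `q'
                            }
                            ```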

                            In turn I am fuzzy about your goals, but I can't see why you are not reaching for

                            Code:
                            egen portfolio = xtile(avmilliq), nq(25) by(id datey)



                            • #15
                              Just for the record, I managed to sort it out. The solution is really not elegant, but it works:
                              Code:
                              * Step 1: average annual illiquidity for each stock
                              egen avmilliq = mean(milliq), by (id datey)
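                              * Step 2: one observation per stock-year, then 25 quantile bins within each year
                              keep datem avmilliq datey id
                              collapse avmilliq, by (id datey)
                              egen portfolio = xtile(avmilliq), nq(25) by (datey)
                              save "\xtile_ireland.dta", replace
                              clear

                              * Step 3: attach the yearly portfolio to every monthly observation
                              use "\lcapm_ireland.dta"
                              merge m:1 id datey using "\xtile_ireland.dta", nogenerate update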
                              Thanks again for your help Nick
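
                              A possible variant that avoids the intermediate file, offered only as an untested sketch. It assumes the xtile() function from egenmore accepts an if qualifier, as used in #13; the variable name first is made up, and 1999 is just one of the years in the extract from #9:

                              ```stata
                              * Average annual illiquidity for each stock
                              egen avmilliq = mean(milliq), by(id datey)
                              * Mark exactly one observation per stock-year
                              egen byte first = tag(id datey)
                              * Bin the stock-year averages within each year, on the marked observations only
                              egen portfolio = xtile(avmilliq) if first, nq(25) by(datey)
                              * Missing values sort last, so portfolio[1] is the assigned bin; copy it to all months
                              bysort id datey (portfolio) : replace portfolio = portfolio[1]
                              * Check: mean avmilliq should rise with the portfolio number within a year
                              tabstat avmilliq if first & datey == 1999, by(portfolio) statistics(mean)
                              ```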

