Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Optimize foreach multiples

    Dear,
    Does anyone know if it is possible to optimize the code below. I have a lot of data and Stata has been running for two days.
    Any help is much appreciated.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(priceclose refdate) double(b_market_return R_sd volume)
       15 20802   .29066793476086383  .013988015330073225  3544055
     14.8 20803   .28276543984284264   .01219491750502958  3593659
    14.78 20804   .34550501471407213  .016951893952404908  3948176
     14.5 20807    .3569666121632186  .014193321115454396  7608092
    14.42 20808   .41033263291208183  .008289172247704618  7331540
    14.24 20809    .4071867581801756  .005880452306285766  4892042
     14.2 20810   .47975600582660327  .005141142125564504  1852394
    13.95 20811    .4804865948130846  .006039682463614134  3722272
    14.15 20814   .39432355608943237  .008581557936344957   760866
       14 20815   .34805042305693157  .012661447309070867  2335769
     14.2 20816   .19647173339067603  .012782552419999496  3512804
    14.65 20817   .20951045672745083  .014454574234970602  5375545
     14.6 20821   .17536064993683237  .020627551147663712   523505
     14.9 20822   .27468024199285185  .015254865212670707  5314402
     14.9 20823     .326094447993363  .015279074757932377  1746750
    15.35 20824   .32414281099897674  .012214255774921074  1885899
    15.45 20825   .37876834954690003    .0144741707788485  2292524
    15.21 20828    .3696024330726629   .01126381340351348  2641429
    14.89 20829    .3803754556939234   .01563032157358425  3740701
     14.9 20830    .3636342129930175  .020275975822514974  1985513
       15 20831    .4995529260898174  .020255300592497492  5064840
       15 20832    .5228324917715168  .012812688541158305  3188433
     14.8 20835    .5282910537358131  .010943169527778355  1567721
     14.8 20836   .21849080185365488  .010779988312457908  3445879
    15.02 20837    .2470317201273872  .006596140333483685  2452963
    14.75 20838   .21585794824774904  .011105348633794884  2526647
    14.45 20839   .29785596263358194   .01346103352206779  5036959
    14.48 20842   .26666250730532176  .015474810594547268  2334028
    14.39 20843   .26853212173472246   .01514563437012279  2359199
    14.29 20845    .3009824758101083   .01496600425340675  2455622
    14.25 20846     .458598816941453  .008330564912019193  1881958
    14.03 20849   .46040219898273765   .00861741341790695  2063672
    13.94 20850   .48438349193270486 .0032341649457775135  3116406
     13.8 20851    .4810116789118811 .0032872887735649666  2282779
    13.67 20852    .4841844382791233   .00410631357757996  1284404
    12.95 20853    .4609813225140847  .004011783740343337  7398656
    12.85 20856   .44062785200119414  .022084958315610418  4600558
    13.45 20857    .4189429545939568   .02244714485552819  6689002
     13.4 20858   .42565707758719556  .035910562760638094  2083157
    13.25 20859   .45798107893776363   .03582941971629399  3097053
    13.47 20860    .4360804898303578    .0359444920718639  1990205
     13.5 20863   .47225593518480435   .02262615238617526  7073517
    13.61 20864   .43957437408510136  .022866778355754874  1572580
    13.48 20865   .38561266079730666  .009895090200511152  2121924
    13.27 20866   .35474529661402987  .012021590384909013  3676864
    13.05 20867    .2475333361802396  .012598893445692993  2080454
    13.12 20870    .2756558026936728  .011544083046168132  1948521
    12.99 20871    .2465977879596939   .01240292595681369  1518766
    12.91 20872    .2867390681398278    .0079187115973458   460138
    13.07 20873    .2703304225583192  .007832551972041423  2037568
    12.74 20874   .16312888837085368  .013045883689830742  3077805
    12.89 20879   .23250438762069464  .015161504908064333   790201
     12.7 20880    .2027122831601531  .016446748863171136  4098878
    12.51 20881    .2455137892085358  .016386651044092877  3096044
    12.09 20884     .203903012870287  .017958870989684843  7934645
    12.67 20885   .25272492787782963   .01636070043962157  9292211
    12.99 20886   .12881707645885598   .03191853380059886  2841846
       13 20887 -.013394209947376914   .03411067605385401   884407
    12.98 20888   .02098943898773726   .03328388354685598  3247345
    13.02 20891 -.005830506705509646  .031077878443743386   907428
    13.21 20892   .01925243163016297   .02198950022114325  5383624
     13.3 20893  -.03108240309517214  .011954869733385066  2974842
    13.25 20894  .022206566738478636   .00643802863109005   313879
    12.99 20895  -.06832031590323122  .007367853834598363  1288304
    12.98 20898  .047244783594704395  .013774439789180616   485879
    13.01 20899   .05415162542812564   .01366754444275949  2340907
       13 20900   .01720377785807472  .011229383358588415  1725872
    13.99 20901   .08305654790054999  .009805650157792296  5796468
    14.65 20902   .06949837501688363  .036378310112247944  6941832
    14.88 20905   .08558179480112728   .03382277061661836  5517621
    14.99 20906   .11799218437223138   .03146548813384615   929284
    14.84 20907   .12779004317489584  .030952360243009472  5526250
     14.8 20908  .046259784564860185    .0337378259890945  5236011
    15.63 20909   .05421697126557225   .02203283991840497  3141709
    15.45 20912  .019950690088320358  .025702592912042323  7168180
    15.57 20913   .06793864376806422    .0277311690462081  8630483
     15.6 20914   .06725122516873792  .027730525704845906  3154084
     15.3 20915   .07599330100483266  .025936364570505546  4213237
    15.39 20916   .11488235042274716    .0287798630848395  1731907
    15.58 20919   .14896138991506042   .01151089703107813   963348
    15.43 20920     .141187674563591  .011961288611298138  3976290
    15.65 20921   .18084219321761702  .012313095251113848  6234636
    16.37 20922   .09493844082407808  .014530736719363523  1438482
    16.43 20926  -.04440737394030564   .02046961882292768  1372042
    16.52 20927  -.13787526512463116   .02052817942830347  1282754
    16.46 20928   -.0657022258280206   .02093183081515904  1900364
    16.39 20929  -.08505886641246499  .019824364346061253  2531215
    16.54 20933  -.04134260137488695   .02116934468628189  1982606
    16.49 20934  .016320933415600557  .006079998589964183  4337906
    16.25 20935  -.01424003957297813  .006290710277880814  3446965
    16.85 20936  .009625391430747608  .008602491710380687  3714977
    18.58 20937  .007101379351991996  .019580786521099413  6444929
    17.89 20941    .2416158033485712   .04478107104782573  8166194
       18 20942   .04972343915172623   .05434769975143652  2074505
    17.98 20943   .03538985597489168   .05371224230420924   889059
       18 20944  -.08797672552168076   .05197616842929916  5331210
       18 20947  -.08799829050509456  .051458006762289975  1902268
    17.99 20948   -.1780883343860402    .0201472988525531  3068821
       18 20949  -.18455008004814724 .0032341649457775135  5064161
    17.94 20950   -.1825259697624504 .0032341649457775135 10171311
    end
    format %td refdate
    
    *1) Volume_5
    levelsof refdate, local(refdate)
    tempvar temp_quintile
    gen int q_volume = .
      foreach y of local refdate {
            quietly count if refdate == `y'
            if `r(N)' >= 4 {
                xtile `temp_quintile' = volume if refdate == `y', nq(5)
                replace q_volume = `temp_quintile' if refdate == `y'
                drop `temp_quintile'
            }
            else {
                display "Insufficient data to compute quintiles for refdate `y'"
            }
        }
    
    
    *2) Volume_4
    levelsof refdate, local(refdate)
    tempvar temp_quartil
    gen int q4_volume = .
      foreach y of local refdate {
            quietly count if refdate == `y'
            if `r(N)' >= 3 {
                xtile `temp_quartil' = volume if refdate == `y', nq(4)
                replace q4_volume = `temp_quartil' if refdate == `y'
                drop `temp_quartil'
            }
            else {
                display "Insufficient data to compute quartil for refdate `y'"
            }
        }
        
    *3) b_market_return_5
    levelsof refdate, local(refdate)
    tempvar temp_quintile
    gen int q_b = .
      foreach y of local refdate {
            quietly count if refdate == `y'
            if `r(N)' >= 4 {
                xtile `temp_quintile' = b_market_return if refdate == `y', nq(5)
                replace q_b = `temp_quintile' if refdate == `y'
                drop `temp_quintile'
            }
            else {
                display "Insufficient data to compute quintiles for refdate `y'"
            }
        }
    
    *4) b_market_return_4
    levelsof refdate, local(refdate)
    tempvar temp_quartil
    gen int q4_b = .
      foreach y of local refdate {
            quietly count if refdate == `y'
            if `r(N)' >= 3 {
                xtile `temp_quartil' = b_market_return if refdate == `y', nq(4)
                replace q4_b = `temp_quartil' if refdate == `y'
                drop `temp_quartil'
            }
            else {
                display "Insufficient data to compute quartil for refdate `y'"
            }
        }
    
    *5) R_sd_5
    levelsof refdate, local(refdate)
    tempvar temp_quintile
    gen int q_R_sd = .
      foreach y of local refdate {
            quietly count if refdate == `y'
            if `r(N)' >= 4 {
                xtile `temp_quintile' = R_sd if refdate == `y', nq(5)
                replace q_R_sd = `temp_quintile' if refdate == `y'
                drop `temp_quintile'
            }
            else {
                display "Insufficient data to compute quintiles for refdate `y'"
            }
        }
    
    *6) R_sd_4
    levelsof refdate, local(refdate)
    tempvar temp_quartil
    gen int q4_R_sd = .
      foreach y of local refdate {
            quietly count if refdate == `y'
            if `r(N)' >= 3 {
                xtile `temp_quartil' = R_sd if refdate == `y', nq(4)
                replace q4_R_sd = `temp_quartil' if refdate == `y'
                drop `temp_quartil'
            }
            else {
                display "Insufficient data to compute quartil for refdate `y'"
            }
        }
    
    *7) priceclose_5
    levelsof refdate, local(refdate)
    tempvar temp_quintile
    gen int q_priceclose = .
      foreach y of local refdate {
            quietly count if refdate == `y'
            if `r(N)' >= 4 {
                xtile `temp_quintile' = priceclose if refdate == `y', nq(5)
                replace q_priceclose = `temp_quintile' if refdate == `y'
                drop `temp_quintile'
            }
            else {
                display "Insufficient data to compute quintiles for year `y'"
            }
        }
    
    *8) priceclose_4
    levelsof refdate, local(refdate)
    tempvar temp_quartil
    gen int q4_priceclose = .
      foreach y of local refdate {
            quietly count if refdate == `y'
            if `r(N)' >= 3 {
                xtile `temp_quartil' = priceclose if refdate == `y', nq(4)
                replace q4_priceclose = `temp_quartil' if refdate == `y'
                drop `temp_quartil'
            }
            else {
                display "Insufficient data to compute quartil for refdate `y'"
            }
        }

  • #2
    The slowness of your process comes from two sources. One of them is that you are computing quantiles, which requires sorting, and sorting is slow. There is basically nothing you can do about that part. The other is that you are looping over values of refdate and selecting observations with -if refdate ==...- clauses. That requires a fall pass over the entire data set every time to identify the observations to be included. That is very slow as well, but is avoidable by using the -runby- command. -runby- is written by Robert Picard and me, and is avaliable from SSC.

    Code:
    capture program drop one_refdate
    program define one_refdate
        gen messages = ""
        if _N > = 4 {
            xtile q_volume = volume, nq(5)
            xtile q_R_sd = R_sd, nq(5)
            xtile q_priceclose = priceclose, nq(5)
        }
        else {
            replace messages = "Insufficient data to compute quintiles for refdate `=refdate[1]'"
        }
        
        if _N >= 3 {
            xtile q4_b = b_market_return, nq(4)
            xtile q4_price_close = priceclose, nq(4)
        }
        else {
            replace messages = messages + " Insufficient data to compute quartiles for refdate `=refdate[1]'"
        }
        exit
    end
    
    runby one_refdate, by(refdate) status
    This will still be slow, but much less slow than what you have been working with. The -status- option will cause Stata to give you a progress report at periodic intervals so, even though this will be slow due to all the sorting required, you will at least have a sense of making progress and how far you still have to go. It makes it easier to be patient.
    Rather than putting the warnings about insufficient data onto the Results window, this code creates a new variable, messages.

    That said, I notice that you are computing quintiles when you have 4 or more observations, and quartiles when you have 3 or more. That seems odd. Are you sure that's what you meant to do? Changing those to 5 and 4, respectively strikes me as more sensible.

    In your example data, refdate uniquely identifies observations, so the results are nothing but messages showing insufficient data to calculate the quintiles/quartiles. Presumably in your real data set, at least some of the refdates have enough observations to produce quintiles or quartiles.

    Comment


    • #3
      nuuuu, it took 32 seconds.
      Clyde Schechter, thank you very much!

      Professor, taking advantage, do you know if the foreach below can be optimized? This is the last bottleneck I have.

      Greetings,

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(P refdate id P_1 R REP)
         19 21916 1 18.16     .04707982 1
         19 21917 1    19    .005858855 .
      18.99 21920 1    19     .00491426 .
      18.94 21921 1 18.99  -.0012881233 .
      18.97 21922 1 18.94    .004243572 .
       19.9 21923 1 18.97     .05089471 .
      20.95 21924 1  19.9     .05474658 .
      21.88 21927 1 20.95     .04012544 .
      21.85 21928 1 21.88   -.002994639 .
      21.73 21929 1 21.85   .0006129141 .
      21.99 21930 1 21.73    .010262501 .
      21.95 21931 1 21.99   -.012112373 .
      22.84 21934 1 21.95     .03846537 .
       22.9 21935 1 22.84     .01240793 .
      22.85 21936 1  22.9   -.009539948 .
      22.89 21937 1 22.85   -.003187001 .
      23.15 21938 1 22.89    .015846029 .
       22.3 21941 1 23.15   -.014308983 .
      22.13 21942 1  22.3   -.024309207 .
      22.12 21943 1 22.13    .006762626 .
      21.68 21944 1 22.12     -.0208939 .
      21.02 21945 1 21.68    -.01868336 .
         21 21948 1 21.02   -.006503299 .
      21.53 21949 1    21    .019670947 1
      21.25 21950 1 21.53   -.015994683 .
      21.43 21951 1 21.25     .01341905 .
      20.98 21952 1 21.43   -.012735788 .
      20.49 21955 1 20.98    -.01538253 .
       20.6 21956 1 20.49   -.015529143 .
       21.5 21957 1  20.6    .035204116 1
       21.3 21958 1  21.5  -.0019547553 .
       21.5 21959 1  21.3     .01744436 .
       21.5 21962 1  21.5   -.005544502 .
       21.5 21963 1  21.5   .0020261218 .
      21.55 21964 1  21.5   -.007274819 .
      20.75 21965 1 21.55    -.02561737 .
      20.74 21966 1 20.75    .005335628 .
      10.64 22014 1  9.79      .0969252 .
      10.02 22018 1 10.64    -.07138694 .
       10.9 22019 1 10.02     .07478163 .
      10.21 22020 1  10.9    -.04994936 .
         11 22021 1 10.21     .09015866 .
      10.91 22022 1    11    -.02296271 .
      10.76 22025 1 10.91   -.013646903 .
       11.2 22027 1 10.76     .02550047 .
      11.16 22028 1  11.2    .005586482 .
      10.29 22029 1 11.16    -.04033759 .
      10.63 22032 1 10.29    -.01980008 .
      10.33 22033 1 10.63    -.07797142 .
       10.5 22034 1 10.33   -.009036668 .
      10.49 22035 1  10.5    .035549592 .
       9.75 22039 1 10.49    -.05169155 .
       9.52 22040 1  9.75   -.029973563 .
        9.6 22041 1  9.52    .012722185 .
       9.24 22042 1   9.6    -.02744832 .
       9.25 22043 1  9.24     -.0203967 .
       8.66 22046 1  9.25     -.0526192 .
        8.4 22047 1  8.66   -.017730538 .
       8.39 22048 1   8.4 -.00015715975 .
       8.44 22049 1  8.39   -.007001305 .
        8.7 22050 1  8.44     .04755703 1
          9 22053 1   8.7   -.000590343 .
        8.7 22054 1     9    -.02882357 .
        8.6 22055 1   8.7   -.017361648 .
       8.68 22056 1   8.6   -.007732635 .
          9 22057 1  8.68     .04536519 .
        9.2 22060 1     9    -.00846527 .
       9.07 22061 1   9.2   -.012824992 .
          9 22062 1  9.07    -.02487629 .
       9.77 22063 1     9     .09198876 1
       10.2 22064 1  9.77     .04215277 .
       10.5 22067 1  10.2    .021661766 .
      10.26 22068 1  10.5    -.03875425 .
      end
      format %td refdate
      
      sort refdate
      
      levelsof refdate if REP==1, local(repdates)
      gen wanted1=.
      foreach date of local repdates{
          gen group= refdate>=`date'
          bys group (refdate): gen rep= P_1[1] if group
          bys group (refdate): egen counter= total(cond(!sum(P<=rep) , 1, .)) if group
          egen N= total(group) if group
          replace counter=. if counter==N 
          bys group (refdate): replace wanted1= counter if _n==1 & group
          drop group rep counter N
      }

      Comment


      • #4
        Again, the major slowdown here is coming from all those -if- conditions. -runby- can't help you with this one, however, since your problem does not correspond to chunks of the data identified by a single value of a variable. However, -rangerun- will do the trick. The following code produces the same results as yours in the example data.

        Code:
        capture program drop one_repdate
        program define one_repdate
            local rep = P_1[1]
            egen counter = total(cond(!sum(P <= `rep'), 1, .))
            count
            gen wanted1 = cond(counter == r(N), ., counter)
            keep in 1
            drop counter
            exit
        end
        
        summ refdate, meanonly
        gen low = cond(REP == 1, refdate, r(max)+1)
        sort refdate
        rangerun one_repdate, interval(refdate low .)
        drop low
        -rangerun- is by Robert Picard and Nick Cox, and is available from SSC. To use it, you will also need to install -rangestat-, by Robert Picard, Nick Cox, and Roberto Ferrer, also available from SSC.

        Comment


        • #5
          Actually, the above can be made a bit faster. Those -cond()- expressions, just like -if- conditions, require checking every blasted observation--both can be eliminated. And -egen- has some overhead that can be avoided by directly computing the desired total.

          Code:
          capture program drop one_repdate
          program define one_repdate
              local rep = P_1[1]
              gen counter = sum(sum(P <= `rep') == 0)
              count
              if counter[_N] < r(N) {
                  gen wanted2 = counter[_N]
              }
              keep in 1
              drop counter
              exit
          end
          
          summ refdate, meanonly
          gen low = cond(REP == 1, refdate, r(max)+1)
          sort refdate
          rangerun one_repdate, interval(refdate low .)
          Note that the -if- command used, unlike -if- conditions, does not require verifying a condition in every observation in the data set: it only has to look at a single observation.

          I don't see anything more I can squeeze out of it than that. But this should give you a decent speedup.
          Last edited by Clyde Schechter; 17 Aug 2021, 17:47.

          Comment


          • #6
            Clyde Schechter, thank you very much!
            See if you can help me.
            The code worked perfectly, but only for the first id.
            In dataex there are only the first 100 data, I made several cuts with three id’s to demonstrate that the correct return occurred only for id == 1
            On the full base I have 481 id’s. I thought to put the code in a foreach, but I can't do.
            Greetings,

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float(P refdate id P_1 R REP)
            11.03 22242 1 10.74   .016292192 1
            10.73 22243 1 11.03   -.05621599 .
            11.04 22244 1 10.73   .024714407 .
            11.19 22245 1 11.04   .012451226 .
            11.15 22246 1 11.19   -.00749675 .
            10.84 22249 1 11.15 -.0090376735 .
            11.15 22250 1 10.84  .0004549939 .
            11.61 22251 1 11.15   .035054456 1
            11.75 22252 1 11.61 -.0003610458 .
            12.29 22253 1 11.75    .03639867 .
            12.14 22256 1 12.29  -.011618525 .
            12.16 22257 1 12.14   .002286506 .
            12.07 22258 1 12.16  .0023159697 .
            11.98 22259 1 12.07  -.033293135 .
            12.18 22260 1 11.98   .012949167 .
            11.79 22263 1 12.18   -.02929007 .
            12.05 22264 1 11.79    .01057351 .
            12.01 22265 1 12.05   -.01888214 .
            11.75 22266 1 12.01    -.0234871 .
            11.43 22267 1 11.75    -.0240839 .
            11.22 22270 1 11.43   -.00585081 .
            10.92 22271 1 11.22   -.02976402 .
            11.05 22272 1 10.92  -.002891958 .
            11.01 22277 1 11.05   -.01538526 .
            11.09 22278 1 11.01   .003286043 .
             11.1 22279 1 11.09   .002383526 .
             20.3 21916 2 20.19  -.006655232 .
            20.75 21917 2  20.3    .02461132 1
            20.62 21920 2 20.75  -.004716454 .
            20.36 21921 2 20.62   -.01230695 .
             20.3 21922 2 20.36 -.0020519367 .
             20.3 21923 2  20.3  .0007748115 .
            20.28 21924 2  20.3  .0003859871 .
            20.58 21927 2 20.28   .007499131 1
            20.39 21928 2 20.58  -.010701385 .
            20.09 21929 2 20.39   -.01190456 .
            20.25 21930 2 20.09   .006830978 .
            21.56 21931 2 20.25    .05657059 1
            21.99 21934 2 21.56   .017264176 .
            22.51 21935 2 21.99    .03543316 .
            22.38 21936 2 22.51    -.0118188 .
            23.16 21937 2 22.38    .02987197 .
            23.03 21938 2 23.16 -.0006094933 .
             22.5 21941 2 23.03 -.0040085316 .
            22.69 21942 2  22.5  -.003366231 .
            22.35 21943 2 22.69  -.008987141 .
               22 21944 2 22.35  -.016693512 .
            21.71 21945 2    22 -.0032713125 .
            22.75 21948 2 21.71    .04152717 1
            22.82 21949 2 22.75  -.003014174 .
            23.75 21950 2 22.82    .03691486 .
            22.52 21951 2 23.75   -.04724066 .
            21.95 21952 2 22.52  -.014325877 .
            21.24 21955 2 21.95   -.02264012 .
            21.55 21956 2 21.24  -.010892853 .
            21.68 21957 2 21.55  -.005114632 .
             19.2 21916 3 18.67   .009695493 .
            18.93 21917 3  19.2  -.007411063 .
            19.02 21920 3 18.93   .011884474 .
            19.06 21921 3 19.02  .0038471846 .
            18.97 21922 3 19.06 -.0013196794 .
            18.82 21923 3 18.97  -.005271001 .
            18.89 21924 3 18.82   .007543945 .
            19.21 21927 3 18.89  .0013500303 .
             19.1 21928 3 19.21  -.008383229 .
            18.76 21929 3  19.1  -.006980318 .
             18.7 21930 3 18.76   -.00595116 .
             18.7 21931 3  18.7  -.016800277 .
             18.8 21934 3  18.7  .0020174598 .
            18.74 21935 3  18.8   .013208578 1
            19.05 21936 3 18.74   .005596512 .
            18.65 21937 3 19.05  -.029830156 .
            18.99 21938 3 18.65   .025880156 .
            18.55 21941 3 18.99  -.001291873 .
             18.8 21942 3 18.55  .0017064027 .
             18.3 21943 3  18.8   -.02059714 .
            18.34 21944 3  18.3  .0012300746 .
            17.85 21945 3 18.34   -.01609341 .
            17.59 21948 3 17.85  -.020501453 .
            17.65 21949 3 17.59  -.002673582 .
            17.25 21950 3 17.65   -.02607061 .
            16.87 21951 3 17.25  -.016957073 .
            16.92 21952 3 16.87   .012338043 1
            16.73 21955 3 16.92 -.0037921066 .
             16.8 21956 3 16.73  -.013329813 .
            16.65 21957 3  16.8  -.016150158 .
            16.27 21958 3 16.65  -.017824914 .
            16.41 21959 3 16.27     .0153726 .
             16.6 21962 3 16.41   .007609781 .
            16.18 21963 3  16.6   -.02430495 .
            16.13 21964 3 16.18  -.010042328 .
            16.17 21965 3 16.13    .01083171 .
            16.25 21966 3 16.17   .008426486 .
            15.82 21971 3 16.25   -.02690983 .
             14.5 21972 3 15.82   -.04422693 .
            14.54 21973 3  14.5  -.005056805 .
            14.72 21976 3 14.54 -.0035387166 .
            14.74 21977 3 14.72   .008287676 .
            15.31 21978 3 14.74   .027207334 .
            15.23 21979 3 15.31    .02911236 .
            end
            format %td refdate
            
            capture program drop one_repdate
            program define one_repdate
                local rep = P_1[1]
                egen counter = total(cond(!sum(P <= `rep'), 1, .))
                count
                gen wanted3 = cond(counter == r(N), ., counter)
                keep in 1
                drop counter
                exit
            end
            
            summ refdate, meanonly
            gen low = cond(REP == 1, refdate, r(max)+1)
            sort refdate
            rangerun one_repdate, interval(refdate low .)
            drop low
            
            sort id refdate

            Comment


            • #7
              In your original statement of the problem, you said nothing that would even suggest that id has anything to do with it. Even your original code makes no mention of id. My code was written on the assumption that the id variable is irrelevant and that everything is to be calculated on the entire sample: only refdate would determine which observations are used where. Apparently you want to do it for each id separately.

              Fortunately, the fix is very simple. The code in #5 requires only a tiny modification in the last line:
              Code:
              rangerun one_repdate, interval(refdate low .) by(id)

              Comment


              • #8
                Professor, thank you very much!

                Comment

                Working...
                X