Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Looping Over the numeric value of a variable and use it for the subsequent loop.

    Hi, thanks for your help in advance.
    I have the following dataset. And I come up with the following code:

    Code:
    egen t=max(dateday_count)
    forvalues i=1/762{
    gen k_`i'=.
    gen ref_`i'=.
    }
    
    levelsof dateday_count, local(levels) 
    foreach i of local levels {
            di(`i')
            quietly{
            forvalue x=1/`i'{
            bys permno (dateday): replace k_`x'=adjvol[_n-`x']*exp(total_log_rev_turnover-total_log_rev_turnover[_n-`x'+1]) if dateday_count==`i'
            by permno: replace ref_`x' = adjprc[_n-`x']*k_`x' if dateday_count==`i'
            }
    }
    }

    In words, what I am trying to do is, looping over the value of variable dateday_count, and for each row, I want to do
    Code:
    forvalue x=1/"dateday_count"{
    bys permno (dateday): replace k_`x'=adjvol[_n-`x']*exp(total_log_rev_turnover-total_log_rev_turnover[_n-`x'+1]) 
    by permno: replace ref_`x' = adjprc[_n-`x']*k_`x'
    }
    More specifically, for row 12 for permno 10001, I want to do
    Code:
    forvalue x=1/12{
    bys permno (dateday): replace k_`x'=adjvol[_n-`x']*exp(total_log_rev_turnover-total_log_rev_turnover[_n-`x'+1]) 
    by permno: replace ref_`x' = adjprc[_n-`x']*k_`x'
    }
    I am moderately confident that my code is doing the job. However, I am wondering, is there any faster way for me to loop over the numeric value of dateday_count straightaway, and choose it as the looping variable. So I do not have to use "if dateaday_cout==`i'" command.



    In the dataex, I only attached the sample from single permno, you need to remove the "bys permno" part of my code.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(permno dateday) float(adjvol adjprc total_log_rev_turnover) double dateday_count
    10001 10595     0 2.1041667          0   1
    10001 10596  7200 2.0833333          0   2
    10001 10597  6150 2.1666667          0   3
    10001 10598  2100 2.0416667 -1.2086598   4
    10001 10601   600 2.1666667 -1.4323044   5
    10001 10602 17250 2.1666667 -1.4323044   6
    10001 10603  1500 2.0833333 -2.1274576   7
    10001 10604   330 2.0833333  -2.244239   8
    10001 10605  1500 2.1666667  -2.939392   9
    10001 10608   300 2.1666667 -3.0449755  10
    10001 10609  3000 2.0833333 -3.0449755  11
    10001 10610  1500 2.1666667  -3.740129  12
    10001 10611     0     2.125  -3.740129  13
    10001 10612  1500 2.0833333  -4.435282  14
    10001 10615   600 2.1666667  -4.658927  15
    10001 10616  2982 2.0833333  -10.17839  16
    10001 10617   390 2.0833333 -10.317952  17
    10001 10618  3300 2.1666667 -10.317952  18
    10001 10619     0     2.125 -10.317952  19
    10001 10622 17700     2.125 -10.317952  20
    10001 10623  7575 2.1666667 -10.317952  21
    10001 10624   600 2.1666667  -.2236447  22
    10001 10625  6900 2.1666667  -.2236447  23
    10001 10626 13500 2.1666667  -.2236447  24
    10001 10629 17100 2.1666667  -.2236447  25
    10001 10630  5400 2.0833333  -.2236447  26
    10001 10631 16200     2.125  -.2236447  27
    10001 10632  6000     2.125  -.2236447  28
    10001 10633     0     2.125  -.2236447  29
    10001 10636  3600 2.0833333  -.2236447  30
    10001 10637  3210 2.1666667  -.2236447  31
    10001 10638  2100 2.1666667 -1.4323044  32
    10001 10639 10080 2.1666667 -1.4323044  33
    10001 10640 12816     2.125 -1.4323044  34
    10001 10644  8400 2.1666667 -1.4323044  35
    10001 10645 12000 2.1666667 -1.4323044  36
    10001 10646     0 2.2083333 -1.4323044  37
    10001 10647  3000 2.1666667 -1.4323044  38
    10001 10650   900      2.25 -1.7898387  39
    10001 10651  3300      2.25 -1.7898387  40
    10001 10652  3900      2.25          0  41
    10001 10653  1200     2.125  -.5121625  42
    10001 10654  1350      2.25 -1.1116405  43
    10001 10657  3000      2.25 -1.1116405  44
    10001 10658 11400      2.25 -1.1116405  45
    10001 10659   900      2.25 -1.4691747  46
    10001 10660  7800      2.25 -1.4691747  47
    10001 10661   450 2.2916667 -1.6320474  48
    10001 10664  3600 2.1666667 -1.6320474  49
    10001 10665  2550 2.1666667 -3.5405884  50
    10001 10666   300 2.2916667 -3.6461716  51
    10001 10667  4800 2.2916667 -3.6461716  52
    10001 10668  1200 2.2083333 -4.1583343  53
    10001 10671  4800 2.2083333 -4.1583343  54
    10001 10672 13500 2.2083333 -4.1583343  55
    10001 10673  5160 2.3333333 -4.1583343  56
    10001 10674  2700      2.25   -6.47912  57
    10001 10678  1500 2.3333333  -7.174273  58
    10001 10679   600 2.3333333  -7.397918  59
    10001 10680     0 2.2916667  -7.397918  60
    10001 10681     0 2.2916667  -7.397918  61
    10001 10682  1500      2.25  -8.093071  62
    10001 10685     0 2.2916667          0  63
    10001 10686  4950 2.3333333          0  64
    10001 10687   900 2.3333333  -.3575342  65
    10001 10688  5400 2.3333333  -.3575342  66
    10001 10689  1050      2.25  -.7893968  67
    10001 10692  4650 2.3333333  -.7893968  68
    10001 10693  5400      2.25  -.7893968  69
    10001 10694  1500      2.25   -1.48455  70
    10001 10695  3900      2.25   -1.48455  71
    10001 10696 10800  2.416667   -1.48455  72
    10001 10699  9000  2.416667   -1.48455  73
    10001 10700  5940 2.2916667   -1.48455  74
    10001 10701  1980 2.2916667 -2.5672574  75
    10001 10702  1500  2.416667 -3.2624106  76
    10001 10703  1500  2.416667  -3.957564  77
    10001 10706  2100  2.416667  -5.166224  78
    10001 10707  5400       2.5  -5.166224  79
    10001 10708  2400       2.5   -6.78371  80
    10001 10709  1500       2.5  -7.478863  81
    10001 10710 23250  2.416667  -7.478863  82
    10001 10713 14910     2.375          0  83
    10001 10714  7800  2.416667          0  84
    10001 10715 84111       2.5          0  85
    10001 10716 19350  2.416667          0  86
    10001 10717     0  2.458333          0  87
    10001 10720  8550       2.5          0  88
    10001 10721  3900  2.416667          0  89
    10001 10722 17499     2.375          0  90
    10001 10723   900     2.375  -.3575342  91
    10001 10724  3000  2.416667  -.3575342  92
    10001 10727   900       2.5  -.7150685  93
    10001 10728  2700  2.458333  -3.035854  94
    10001 10729     0  2.395833  -3.035854  95
    10001 10730  3900 2.3333333  -3.035854  96
    10001 10731   600  2.416667  -3.259499  97
    10001 10734     0     2.375  -3.259499  98
    10001 10735  9600 2.3333333  -3.259499  99
    10001 10736   600  2.416667  -3.483144 100
    end
    format %d dateday

  • #2
    You can do this with -runby-:

    Code:
    egen t=max(dateday_count)
    forvalues i=1/762{
        gen k_`i'=.
        gen ref_`i'=.
    }
    
    capture program drop one_dateday_count
    program define one_dateday_count  
        local i = dateday_count[1]
        forvalue x=1/`i'{
        bys permno (dateday): replace k_`x'=adjvol[_n-`x']* ///
            exp(total_log_rev_turnover-total_log_rev_turnover[_n-`x'+1]) 
        by permno: replace ref_`x' = adjprc[_n-`x']*k_`x'
        exit
    end
    
    runby one_dateday_count, by(dateday_count) status
    -runby- is written by Robert Picard and me, and is available from SSC.

    If your data set is large, you will find it runs dramatically faster than code using -levelsof- and -foreach ... if...-.

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      You can do this with -runby-:

      Code:
      egen t=max(dateday_count)
      forvalues i=1/762{
      gen k_`i'=.
      gen ref_`i'=.
      }
      
      capture program drop one_dateday_count
      program define one_dateday_count
      local i = dateday_count[1]
      forvalue x=1/`i'{
      bys permno (dateday): replace k_`x'=adjvol[_n-`x']* ///
      exp(total_log_rev_turnover-total_log_rev_turnover[_n-`x'+1])
      by permno: replace ref_`x' = adjprc[_n-`x']*k_`x'
      exit
      end
      
      runby one_dateday_count, by(dateday_count) status
      -runby- is written by Robert Picard and me, and is available from SSC.

      If your data set is large, you will find it runs dramatically faster than code using -levelsof- and -foreach ... if...-.
      Dear Clyde
      For some reasons the code is not working. I modified it by a bit, you should be able to copy paste and execute it straightway.
      Code:
       forvalues i=1/762{    
      gen k_`i'=.    
      gen ref_`i'=.
      }
      capture program drop one_dateday_count
      program define one_dateday_count      
      local i = dateday_count[1]    
      forvalue x=1/`i'{    
      replace k_`x'=adjvol[_n-`x']*exp(total_log_rev_turnover-total_log_rev_turnover[_n-`x'+1])      
      replace ref_`x' = adjprc[_n-`x']*k_`x'    
      }
      end
      runby one_dateday_count, by(dateday_count) status
      Can you help me with that? Any help is greatly appreciated,
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double dateday_count float(adjprc adjvol total_log_rev_turnover)
      253    3.375  5700          0
      254 3.333333  3900          0
      255 3.333333  3900          0
      256 3.354167     0          0
      257    3.375  2700 -2.1254954
      258    3.375   600  -2.343272
      259    3.375  8100  -2.343272
      260    3.375   900 -2.6907635
      261    3.375   300  -2.793735
      262 3.291667  6000  -2.793735
      263    3.375  3000   -6.63221
      264    3.375  3300   -6.63221
      265 3.333333  6000   -6.63221
      266 3.333333  1290  -7.178221
      267 3.333333   300  -7.281192
      268 3.291667 15000  -7.281192
      269 3.333333  5700  -7.281192
      270 3.333333   300  -7.384164
      271 3.291667 22800  -7.384164
      272 3.333333 10500  -7.384164
      273 3.291667  5475  -7.384164
      274   3.3125     0  -7.384164
      275 3.291667  3000 -3.8384755
      276 3.333333  9000 -3.8384755
      277 3.291667  4200 -3.8384755
      278 3.333333   300  -3.941447
      279   3.3125     0  -3.941447
      280   3.3125     0  -3.941447
      281 3.291667  1500  -4.613296
      282 3.291667  6000  -4.613296
      283 3.333333   750  -4.893828
      284 3.291667     0  -4.893828
      285 3.333333  6000  -4.893828
      286 3.333333  4500  -4.893828
      287     3.25  2400  -6.420668
      288 3.333333  1500  -7.092517
      289 3.333333  2100  -8.247482
      290     3.25  2736  -10.47652
      291     3.25   300  -10.57949
      292 3.333333   300 -10.682462
      293 3.291667     0 -10.682462
      294 3.333333  5571          0
      295 3.291667  5100          0
      296    3.375   900 -.34749165
      297   3.3125     0 -.34749165
      298    3.375  5460 -.34749165
      299    3.375   300  -.4504632
      300     3.25   900  -.7979549
      301     3.25  4500  -.7979549
      302    3.375  1200 -1.2945315
      303     3.25  3975 -1.2945315
      304 3.333333  1500 -1.9663807
      305   3.3125     0 -1.9663807
      306    3.375  1425  -2.591449
      307     3.25  3150  -2.591449
      308    3.375   300   -2.69442
      309   3.3125     0   -2.69442
      310     3.25  1500 -3.3662696
      311    3.375  1500  -4.038119
      312    3.375   600  -4.255895
      313    3.375   300 -4.3588667
      314   3.3125     0 -4.3588667
      315 3.291667     0 -4.3588667
      316 3.291667     0          0
      317 3.291667     0          0
      318     3.25  3750          0
      319 3.333333   900  -.3454707
      320 3.333333 19380  -.3454707
      321 3.291667     0  -.3454707
      322 3.291667     0  -.3454707
      323 3.333333  3300  -.3454707
      324 3.333333  3300  -.3454707
      325     3.25  1500 -1.0126674
      326 3.333333 12600 -1.0126674
      327 3.333333  3000  -4.651227
      328 3.291667     0  -4.651227
      329     3.25   600   -4.86782
      330 3.333333  1200   -5.36127
      331 3.291667     0   -5.36127
      332 3.291667     0   -5.36127
      333 3.291667     0   -5.36127
      334 3.333333   405  -5.502202
      335 3.291667     0  -5.502202
      336     3.25  5100          0
      337 3.291667     0          0
      338 3.333333   600  -.2165925
      339     3.25  1200  -.7100432
      340 3.291667     0  -.7100432
      341 3.291667     0  -.7100432
      342 3.291667     0  -.7100432
      343 3.333333  2100   -1.85448
      344 3.291667 40200   -1.85448
      345     3.25  6600   -1.85448
      346 3.333333   600 -2.0710726
      347     3.25  6600 -2.0710726
      348     3.25  1905  -3.034208
      349 3.333333   600 -3.2508006
      350     3.25  2100 -4.3952374
      351 3.333333  9750 -4.3952374
      352     3.25  1200  -4.888688
      end
      Last edited by Xiaoke Ye; 01 Oct 2019, 16:03.

      Comment


      • #4
        Look at the -replace ref_`x' = ...- command. The } at the end cannot be there; it has to be put on a separate line. Then it will run without difficulties.

        Comment


        • #5
          Originally posted by Clyde Schechter View Post
          Look at the -replace ref_`x' = ...- command. The } at the end cannot be there; it has to be put on a separate line. Then it will run without difficulties.
          Thanks for your reply.
          After correcting this minor mistake, the code can be run but it does not generate the same output as I got from levelsoff. In fact, it does not generate any output except "missing". I carefully examined both the code suggested by you and the code I wrote for myself, and I could not really find something wrong. This is peculiar.
          The runby command correctly identified my group identifier which is dateday_count. However, the command
          Code:
          forvalue x=1/`i'{  
          replace k_`x'=adjvol[_n-`x']*exp(total_log_rev_turnover-total_log_rev_turnover[_n-`x'+1])  
          replace ref_`x' = adjprc[_n-`x']*k_`x'  
          }
          only generated missing value.
          Is it possible that when I use runby, the "_n-`x' " is not identified?
          Last edited by Xiaoke Ye; 01 Oct 2019, 16:29.

          Comment


          • #6
            Well, I also got all missing values running this on the example data, but it made perfect sense for that. I assumed that the difficulty would not apply to the full data set. Here's the difficulty in the example date:

            Let's consider the first value of dateday_count, which is, in the example 253. The example data contains only one observation with dateday_count = 253. So when you get to the loop we start with x = 1. In the first iteration, we get
            Code:
                replace k_`x'=adjvol[_n-`x']*exp(total_log_rev_turnover-total_log_rev_turnover[_n-`x'+1])      
            // WHICH TRANSLATES TO
                replace k_1=adjvol[_n-1]*exp(total_log_rev_turnover-total_log_rev_turnover[_n-1+1])
            But since there is only a single observation with dateday_count = 253, adjvol[_n-1] refers to adjvol[0], which does not exist, and is treated as a missing value. The same logic applies to ref_`x'. And it also applies when `x' is 2, 3, etc. up to 253. And it is not hard to convince yourself that it applies equally to all values of dateday_count. This code needs a very large data set in order to produce any non-missing values at all, because it requires a large number of lagging observations with the same values of dateday_count.

            Comment


            • #7
              Looking back over the entire thread, I fear I may have sent you on a wild goose chase. My -runby- approach may be unworkable for your problem. I had originally interpreted your code as operating on separate blocks of observations that are defined by a common observation of datadate_count.

              But looking more carefully at #1, I see that is not true. When you write expressions like adjvol[_n-1], your intention is that observation _n-1 may well have a different value of datadate_count than observation _n does. If that's the case, the -runby- approach is not workable: -runby- segments the data into blocks having identical values of the -by()- variable(s) and does all calculations exclusively within one block at a time. So if this is the case, I think you need to just go back to your original code, and I apologize for having wasted your time on this.

              Comment


              • #8
                Originally posted by Clyde Schechter View Post
                Looking back over the entire thread, I fear I may have sent you on a wild goose chase. My -runby- approach may be unworkable for your problem. I had originally interpreted your code as operating on separate blocks of observations that are defined by a common observation of datadate_count.

                But looking more carefully at #1, I see that is not true. When you write expressions like adjvol[_n-1], your intention is that observation _n-1 may well have a different value of datadate_count than observation _n does. If that's the case, the -runby- approach is not workable: -runby- segments the data into blocks having identical values of the -by()- variable(s) and does all calculations exclusively within one block at a time. So if this is the case, I think you need to just go back to your original code, and I apologize for having wasted your time on this.
                No, you did not waste my time, I have learnt a great amount from you and become more familiar with the mechanism behind -runby- code. It is also good to know that there is not really alternative to my original code. And, thanks again.

                Comment

                Working...
                X