  • forval loop with a time range and a large dataset: can it be made faster?

    Dear Stata users,

    I am using Stata 15.0.

    I have a variable "Dyadic_Tie". For each observation, I want to count how many other observations have the same value of that variable and an investment date within the five years (1,825 days) up to and including the focal observation's investment date. The focal observation itself should be excluded from the count.

    To do so, I have written the loop shown below, which runs. However, my dataset is relatively large (N = 449,488), and Stata has been running the loop for almost 2 hours now (and is still not finished). Is there a faster way to calculate the same thing?

    I apologise that I have not included an example of the data using dataex, but I do not want to interrupt Stata while it is running the loop; I am afraid that Stata may crash if I do, as the loop is already very demanding for my MacBook.

    Code:
    gen count_DyadicTie = .
    qui forval i = 1/`=_N' {
        * same tie, not observation i itself, dated within the 1825 days up to observation i's date
        count if Dyadic_Tie == Dyadic_Tie[`i'] & _n != `i' & inrange(InvestmentDate, InvestmentDate[`i'] - 1825, InvestmentDate[`i'])
        replace count_DyadicTie = r(N) in `i'
    }

    Any help would be highly appreciated.
    Best,

    Nicky Joosse

  • #2
    It looks like this would do what you want:

    Code:
    gen byte one = 1 
    rangestat (sum) one, interval(InvestmentDate -1825 0) by(Dyadic_Tie) excludeself
    where rangestat is a community-contributed command from SSC (install it with ssc install rangestat).
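
    Before running this on all 449,488 observations, it may be worth checking on a handful of rows that rangestat returns the count you expect. The following is only a minimal sketch: the five toy observations and the variable names count_loop and count_rs are made up for illustration, and it assumes InvestmentDate is a daily date. It compares a brute-force loop written for the stated goal (other observations with the same tie, dated within the 1825 days up to and including the focal date) with the rangestat call above. As far as I know, rangestat names its result one_sum and leaves it missing when no other observation falls in the window, so the sketch recodes missing to zero before comparing.

```stata
* One-time install of the community-contributed command (needs internet access)
ssc install rangestat

* Toy data with hypothetical values: two ties, daily dates
clear
input Dyadic_Tie InvestmentDate
1 20000
1 20500
1 22000
2 20000
2 20100
end

* Brute-force reference: for each observation, count other observations
* with the same tie dated in [date - 1825, date]
gen count_loop = .
quietly forvalues i = 1/`=_N' {
    count if Dyadic_Tie == Dyadic_Tie[`i'] & _n != `i' ///
        & inrange(InvestmentDate, InvestmentDate[`i'] - 1825, InvestmentDate[`i'])
    replace count_loop = r(N) in `i'
}

* Same window with rangestat; the sum of a constant 1 is a count
gen byte one = 1
rangestat (sum) one, interval(InvestmentDate -1825 0) by(Dyadic_Tie) excludeself

* Treat "no other observation in the window" as a count of zero
gen count_rs = cond(missing(one_sum), 0, one_sum)

* Stops with an error if the two counts ever disagree
assert count_loop == count_rs
list
```

    If the assert passes on a sample that resembles the real data, the same two lines (gen byte one = 1 and the rangestat call) can be run on the full dataset in place of the loop.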
