Using -tuples- to generate combinations

John Mullahy

Join Date: Dec 2016

Posts: 752
#1

Using -tuples- to generate combinations

27 Nov 2019, 12:10

I have successfully used for smaller problems the -tuples- command contributed by Luchman, Klein, and Nick Cox

Code:

ssc describe tuples

I am now hoping to use the program with larger lists (e.g. n=32) but am finding that the program is freezing Stata (on my computer, at least) at around 28 list elements.

For example, this command works fine:

Code:

tuples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25, min(5) max(5)

but after laboring for a while processing this command

Code:

tuples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28, min(5) max(5)

I receive on my Mac an alert: "Your system has run out of application memory" and I need then to Force-Quit Stata.

Does anyone have any experience with setting any of the program's options at other than defaults that might help here? Or any suggestions on memory management? Or is anyone aware of other programs that would generate the same output as does -tuples-? Thanks in advance for any suggestions.
Tags: None
daniel klein

Join Date: Mar 2014

Posts: 3883
#2

27 Nov 2019, 12:35

This is from the help-file for tuples that I have on my disk; I think Joe, Nick and I have agreed on updating the help-files on SSC, but I must have forgotten to actually send it to Kit.

Methods to produce tuples

tuples default method creates an n x (2^n-1) indicator matrix to produce the tuples. This is usually the best
(fastest) method. However, if you have long lists (n>20) and specify the min() and/or max() options to select
a subset of all possible tuples, consider switching to one of the other methods, kronecker or cvp.

If you specify the min() and/or max() options, both kronecker and cvp require less memory than the default
method and sometimes produce tuples more quickly.

For long lists (n>20), the default method might require more memory than your operating system provides. In
these cases, tuples will exit with an error message and return code r(3900); You might still be able to obtain
a smaller subset of tuples by specifying min() and/or max() along with one of kronecker or cvp.

For long lists, say n=21, if the fraction of tuples selected with min() and/or max() is small (<0.01), and if
the number of elements in a tuple is also small, say min(2) max(4), kronecker tends to produce tuples more
quickly than the default method. If the fraction of tuples selected is very small (<0.005), and if the number
of elements in a tuple is (very) large, say min(18) max(19), cvp tends to produce tuples more quickly than the
default method. Use

. mata : rowsum(comb(n, (min..max)))/(2^n-1)

to determine the fraction of tuples selected. Usually, the fraction of tuples selected will be larger than
0.01 and you want to stick with the default method. If you run out of memory, however, switching the method is
your only option.

From the above, I suggest trying kronecker or cvp. If you still run out of memory, I believe that the naive method could work, theoretically; however, I also believe that it would not be feasible in terms of computation time.

Best
Daniel
Comment
John Mullahy

Join Date: Dec 2016

Posts: 752
#3

27 Nov 2019, 12:54

Thanks so much, Daniel. It looks like the kronecker option does the trick (at least for my problem). I will also play around with cvp but it looked from one try that it was taking much longer than kronecker.

Code:

. timer on 1 . qui tuples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32, min(5) max(5) > kronecker . di `ntuples' 201376 . di comb(32,5) 201376 . timer off 1 . timer list 1: 9.18 / 1 = 9.1790

Much appreciate your speedy and helpful reply.
John
Comment
John Mullahy

Join Date: Dec 2016

Posts: 752
#4

27 Nov 2019, 14:59

P.S. In case this information might be useful to others, I checked the performance of tuples under several different scenarios, using the kronecker option in all cases (n=list size).

Code:

1. n=16, min(5), max(5) 2. n=16, min(10), max(10) 3, n=32, min(5), max(5) 4. n=48, min(5), max(5)

Here is the timer report

Code:

. timer list 1: 0.14 / 1 = 0.1370 2: 1.73 / 1 = 1.7250 3: 9.10 / 1 = 9.0950 4: 156.57 / 1 = 156.5720

Larger problems—e.g. n=32 with min(10) and max(10), or n=64 with min(5) and max(5)—resulted in the program freezing and the "Your system has run out of application memory" alert.

(I should note I'm running Stata/SE v16.0 on a MacBook Pro, 2.8 GHz Intel Core i5 Processor, 8GB 1600MHz DDR3 Memory, under OS 10.12.6.)
2 likes
Comment

John Mullahy

Join Date: Dec 2016
Posts: 752

01 Dec 2019, 11:01

Daniel: For what it's worth I'm posting an inelegant (and seemingly obvious) Mata approach that relies simply on nested loops. The code below is set up to handle a single n-tuple up to n=6, although it can obviously be extended. (I tried initially to set up a single generic nested loop to handle any n but this seemed more trouble than it was worth; this could probably be done easily by someone more adept at programming than am I.)

Code:

capture mata mata drop loop*() ploop()

mata

function loop1(v) {
 real matrix ctch,tup,c,bot,top
  tup=1
  c=0
  bot=1
  top=cols(v)
  ctch=J(comb(cols(v),tup),tup,.)
 for (j1=bot;j1<=top;j1++) {
   c=c+1
   ctch[c,.]=j1
 }
return(ctch)
}

function loop2(v) {
 real matrix ctch,tup,c,bot,top
  tup=2
  c=0
  bot=1
  top=cols(v)
  ctch=J(comb(cols(v),tup),tup,.)
 for (j1=bot;j1<=(top-(tup-1));j1++) {
 for (j2=j1+1;j2<=top;j2++) {
   c=c+1
   ctch[c,.]=j1,j2
 }}
return(ctch)
}

function loop3(v) {
 real matrix ctch,tup,c,bot,top
  tup=3
  c=0
  bot=1
  top=cols(v)
  ctch=J(comb(cols(v),tup),tup,.)
 for (j1=bot;j1<=(top-(tup-1));j1++) {
 for (j2=j1+1;j2<=(top-(tup-2));j2++) {
 for (j3=j2+1;j3<=top;j3++) {
   c=c+1
   ctch[c,.]=j1,j2,j3
 }}}
return(ctch)
}

function loop4(v) {
 real matrix ctch,tup,c,bot,top
  tup=4
  c=0
  bot=1
  top=cols(v)
  ctch=J(comb(cols(v),tup),tup,.)
 for (j1=bot;j1<=(top-(tup-1));j1++) {
 for (j2=j1+1;j2<=(top-(tup-2));j2++) {
 for (j3=j2+1;j3<=(top-(tup-3));j3++) {
 for (j4=j3+1;j4<=top;j4++) {
   c=c+1
   ctch[c,.]=j1,j2,j3,j4
 }}}}
return(ctch)
}

function loop5(v) {
 real matrix ctch,tup,c,bot,top
  tup=5
  c=0
  bot=1
  top=cols(v)
  ctch=J(comb(cols(v),tup),tup,.)
 for (j1=bot;j1<=(top-(tup-1));j1++) {
 for (j2=j1+1;j2<=(top-(tup-2));j2++) {
 for (j3=j2+1;j3<=(top-(tup-3));j3++) {
 for (j4=j3+1;j4<=(top-(tup-4));j4++) {
 for (j5=j4+1;j5<=top;j5++) {
   c=c+1
   ctch[c,.]=j1,j2,j3,j4,j5
 }}}}}
return(ctch)
}

function loop6(v) {
 real matrix ctch,tup,c,bot,top
  tup=6
  c=0
  bot=1
  top=cols(v)
  ctch=J(comb(cols(v),tup),tup,.)
 for (j1=bot;j1<=(top-(tup-1));j1++) {
 for (j2=j1+1;j2<=(top-(tup-2));j2++) {
 for (j3=j2+1;j3<=(top-(tup-3));j3++) {
 for (j4=j3+1;j4<=(top-(tup-4));j4++) {
 for (j5=j4+1;j5<=(top-(tup-5));j5++) {
 for (j6=j5+1;j6<=top;j6++) {
   c=c+1
   ctch[c,.]=j1,j2,j3,j4,j5,j6
 }}}}}}
return(ctch)
}

function ploop(v,tup) {
 pointer matrix p
 p=J(6,1,NULL)
 p[1]=&loop1()
 p[2]=&loop2()
 p[3]=&loop3()
 p[4]=&loop4()
 p[5]=&loop5()
 p[6]=&loop6()
 return((*p[tup])(v))
}


end

While it took a while (I forgot to set a timer, but perhaps an hour or two) ploop((1..78),6) experienced no difficulties returning a 256851595x6 matrix of 6-tuples using the computer described in #4. (78 and 6 are parameters of particular interest in an application I'm working on.)

Code:

: z=ploop((1..78),6)

: dim(z)
               1           2
    +-------------------------+
  1 |  256851595           6  |
    +-------------------------+

It seems this approach could be generalized easily to vector arguments more complicated than simple ranges like (1..78), and as well could handle multiple different tuples (perhaps most simply by embedding ploop(.) in another loop and perhaps returning a pointer vector that accommodated the different vector lengths that would correspond to the different tuples).

Comment

Mike Lacy

Join Date: Apr 2014

Posts: 2421
#6

01 Dec 2019, 11:54

John--After messing around writing combination generators for years (generally translating FORTRAN to Stata or Mata), I discovered the -mm_subsets(n,k)- command in the -moremata- package available at SSC. It returns all the combinations in a matrix, and is very fast, per the following examples, which run in about 2 and 4 sec. on my very modest laptop:

Code:

mata: timer_on(1); comblist = mm_subsets(32,5); timer_off(1); timer(1) mata: timer_on(1); comblist = mm_subsets(48,5); timer_off(1); timer(1)

There's also an -mmsubset()- command in that package, more complicated to use, which returns the subsets one a time, rather than putting them into a matrix.
Comment
John Mullahy

Join Date: Dec 2016

Posts: 752
#7

02 Dec 2019, 06:54

MIke: Thanks very much for this. I was unaware of mm_subsets(...). I gave it a couple test drives and it looks quite handy. However, it also appears that for "large" problems the naive nested-looping approach I noted in #5 is faster. For instance

Code:

mata vtop=55 tup=6 timer_clear() timer_on(1) z=mm_subsets(vtop,tup) timer_off(1) timer_on(2) z=ploop(1..vtop,tup) timer_off(2) timer() end

resulted in

Code:

: timer() ------------------------------------------------------------------------------------------------------------------ timer report 1. 58.5 / 1 = 58.473 2. 28.4 / 1 = 28.365 ------------------------------------------------------------------------------------------------------------------

Also the (78;6) parameters I noted in #5 froze mm_subsets but ploop(...) yielded correct results.

Obviously the looping approach is inelegant but for my particular problem it seems to outperform the others I've tried (tuples in Stata and mm_subsets in Mata).
Comment
daniel klein

Join Date: Mar 2014

Posts: 3883
#8

02 Dec 2019, 08:13

Thanks, John and Mike, for valuable contributions.

Unfortunately, I do not have enough time to look into this in more detail right now. I will do so as soon as I find the time. As far as I understand, both ploop() and mm_subsets() are (much) better than tuples regarding both speed and memory usage for problems of the form

Code:

tuples list , min(#) max(#)

where # in min() and max() is the same number and/or preferably either small or very large compared to the number of elements in a list. tuples, I guess, was initially written to obtain all singletons, distinct pairs, etc. of the elements in a given list. To obtain such complete lists, we would need to loop over ploop() or mm_subsets() starting with #=1 and incrementing up to #=length(list). Also, going through a list of successive integers, as ploop() and mm_subsets() do, takes (at least a little) less time than picking items/elements from a (string) vector. Whether the two approaches would still outperform tuples in these more general settings, I cannot tell (yet); I would not mind if it were so.

Best
Daniel
Comment

Mike Lacy

Join Date: Apr 2014
Posts: 2421

02 Dec 2019, 09:29

This discussion prompted me to dig into my archives, where I found a "translation" I had written in Mata of an old Fortran combinatorial algorithm:
Gentleman, Jane F. "Algorithm AS 88: Generation of All NCR Combinations by Simulating Nested Fortran DO Loops." Journal of the Royal Statistical Society. Series C (Applied Statistics) 24, no. 3 (1975): 374-376.

To my surprise, at least for the current problem, it's faster than mm_subsets() and essentially identical in speed to what John did. My vague recollection is that when I compared it so some algorithm from Knuth, it was relatively slow. However, for what it's worth, here it is. I don't think my implementation is particularly nice, as I translated the original pretty literally without any particular consideration of Mata's features. Even most of the comments, if I recall correctly, come from the original. (Side note: I suppose this discussion should go to the Mata forum, but it's too late now ...)

Code:

// gnckm(n, k)
// Returns a Mata matrix of indices to all the combinations of N things taken k
// at a time.  Strictly speaking: It returns a matrix in which each row is a
// vector of k integers taken from 1, ..., N
//
version 15.0 // almost certainly will work on lower versions
*! Author: Michael Lacy
*! Date: 12 Dec 2018
mata:
real matrix gnckm(real scalar n, real scalar k)
//
// Algorithm translated from Algorithm AS 88  Appl. Statist. (1975) Vol.24, No. 3.  J. Gentleman.
{
  real rowvector j
  real scalar done
  real scalar i, L  // indexes
  real scalar kount
  real scalar nmk
  real matrix allcomb
//
allcomb = J(comb(n,k), k, .) // will hold the matrix
kount =  0
nmk = n - k
//  Initialize j to lower limit separately, since lower limit for
//  each index depends on lower limit for previous index.
j = J(1,k,.)
i = 1
j[1] = 1
done = 0
while (!done) {  // repeat until
  if (i != k) {
     for (L = i + 1;  L <= k; ++L) {
        j[L] = j[L - 1] + 1
     }   
  }
  //
  // Put the current combination into the matrix to be returned..
  kount = kount + 1
  allcomb[kount,.] = j //
  //
  // Back to generating combinations
  // Increment the first possible index (of loop i) among indices of
  // loops k, k-1,...,1
  i = k
  while ( (j[i] >= nmk + i) & (i > 0) ) {
     i = i-1
     if (i <=0) {
        break  // avoids evaluating j[i] when (i == 0)
     }
  }   
  if (i > 0) {
     j[i] = j[i] + 1   
  }  
  done = (i <= 0)
}    
  return(allcomb)
} // end of gnckm
end
//

Comment

John Mullahy

Join Date: Dec 2016

Posts: 752
#10

02 Dec 2019, 12:23

Thanks for your helpful insights, Daniel and Mike.
Comment

daniel klein

Join Date: Mar 2014
Posts: 3883

#11

02 Dec 2019, 16:39

I am surprised that the code in #9 appears to be much slower than mm_subsets(). It appears as if both implement essentially the same algorithm.

Anyway, here is a more generalized version of that code, which picks items from an arbitrary list

Code:

version 11.2

mata :

mata set matastrict on

transmorphic matrix cmbn(transmorphic rowvector list, real scalar k)
{
    real         rowvector idx
    real         scalar    n, c, i, j
    transmorphic matrix    Cmbn
    
    if ( (k<0) | (k>(n=cols(list))) ) _error(3300)
    
    Cmbn = J((c=comb(n, k)), k, missingof(list))
    
    if ( k ) {
        idx  = (1..k)
        while ( 42 ) {
            Cmbn[c--, ] = list[idx]
            i = k
            while (idx[i] == n-k+i) if ( !(--i) ) break
            if ( !i ) break
            idx[i] = idx[i]+1
            for (j=i+1; j<=k; ++j) idx[j] = idx[j-1]+1
        }
    }
    
    return( Cmbn )
}

end

Example

Code:

. mata : cmbn(tokens("a b c d"), 3)
       1   2   3
    +-------------+
  1 |  b   c   d  |
  2 |  a   c   d  |
  3 |  a   b   d  |
  4 |  a   b   c  |
    +-------------+

. 
end of do-file

Best
Daniel

Comment

Mike Lacy

Join Date: Apr 2014

Posts: 2421
#12

02 Dec 2019, 19:41

Daniel-- For me, my implementation of the Gentleman algorithm runs in about 1/2 the time of as mm_subsets(), i.e. it's not slower. It doesn't have any error checking etc., though.
Here's an example:

Code:

mata: n = 55; k = 5 mata: timer_clear(); timer_on(1); C = gnckm(n, k); timer_off(1) mata: timer_on(2); C = mm_subsets(n, k); timer_off(2); timer()
Comment
daniel klein

Join Date: Mar 2014

Posts: 3883
#13

02 Dec 2019, 23:35

Originally posted by Mike Lacy View Post

Daniel-- For me, my implementation of the Gentleman algorithm runs in about 1/2 the time of as mm_subsets(), i.e. it's not slower.

Mike, I am sorry I had that one mixed up. I am still a bit surprised by the difference in running time, given the same algorithm. I believe that building the results matrix in advance, using J(), is faster than building it up step by step within the loop. Erorr checking should be trivial and not affect running time a lot.

Best
Daniel
Comment
daniel klein

Join Date: Mar 2014

Posts: 3883
#14

12 Jan 2020, 23:59

Thanks to Kit Baum, an updated version of tuples is now available from SSC. We have implemented a variation of Algorithm AS 88 (Gentleman 1975) that Mike Lacy has illustrated in #9. The new method to produce tuples is available via option ncr. We have also extended the discussion of methods to produce tuples in the help file.

Thanks to John Mullahy and William Buchanan for bringing this up; thanks to Mike Lacy for pointing to Algorithm AS 88 (Gentleman 1975) and providing code.

Best
Daniel

Gentleman, J. F. 1975. Algorithm AS 88: Generation of All NCR Combinations by Simulating Nested Fortran DO Loops. Journal of the Royal Statistical Society. Series C (Applied Statistics), 24(3), pp. 374--376.
2 likes
Comment
John Mullahy

Join Date: Dec 2016

Posts: 752
#15

13 Jan 2020, 06:25

Thank you, Daniel. I look forward to giving the new version a test drive.
Comment

Announcement