I'm writing a program that uses Mata to consider large numbers of random permutations of a matrix, searching for an optimal arrangement. In testing I use "set seed" and "mata: rseed()" to get reproducible results. From one Stata session to another this works, but if I run the program twice in the same session (with a "clear all" between runs) the random numbers eventually, though not immediately, become inconsistent.
Should I need anything more than setting the seeds plus "clear all" to restore the Stata session to its initial state, at least as far as the PRNGs are concerned?
(Stata 14.2, on two separate machines).
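One way I could narrow this down (a diagnostic sketch, not tested: it assumes Stata 14's c(rngstate) macro and an argument-less Mata rseed() both report the current generator state rather than setting it) is to log both states after each run and compare:

Code:
* Diagnostic sketch: log both PRNG states after each run and compare.
* Assumes c(rngstate) and argument-less rseed() report the current state.
do test.do
display "`c(rngstate)'"   // Stata's state after run 1
mata: rseed()             // Mata's state after run 1
clear all
clear mata
do test.do
display "`c(rngstate)'"   // should match the run-1 value if fully reset
mata: rseed()             // a mismatch here shows which generator diverged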
Example results (by iteration 3, at least one of the 2000 random permutations differs):
: pop = ga(dd, 2000)
Iter 1  : Σ dist: min 977.377; mean: 1067.934; max: 1090.905
Iter 2  : Σ dist: min 950.5679; mean: 1039.75; max: 1064.551
Iter 3  : Σ dist: min 950.5679; mean: 1019.234; max: 1039.178
[...]

: pop = ga(dd, 2000)
Iter 1  : Σ dist: min 977.377; mean: 1067.934; max: 1090.905
Iter 2  : Σ dist: min 950.5679; mean: 1039.75; max: 1064.551
Iter 3  : Σ dist: min 950.5679; mean: 1019.247; max: 1039.178
[...]
Code to run the main code twice:
do test.do
clear all
clear mata
do test.do
Main code (test.do):
// Clear all and set seeds
clear all
set seed 123456
mata: rseed(123456)

// Generate random data and make a dissimilarity matrix
set obs 50
forvalues x = 1/10 {
    gen x`x' = rnormal()
}
matrix dissim dd = x1-x10, L2squared

mata
real scalar fitness(real matrix distmat, real matrix permdat)
{
    width = cols(distmat)
    perm = transposeonly(order(transposeonly(permdat), 1))
    tempdm = distmat[perm, perm]
    // Sum minor diagonal (offset 0 gives main diagonal)
    offset = 1
    f = sum(diagonal(tempdm[1..(width-offset), (1+offset)..width]))
    return(f)
}

// Genetic algorithm: look for a matrix permutation that minimises the fitness function
real matrix function ga(real matrix distmat, real scalar npop)
{
    width = rows(distmat)
    npop = 2*floor(npop/2)      // even
    nparents = floor(npop/2)    // change from 50% to intensify selection
    nsurv = nparents
    nnew = floor(npop*0.20)
    ntoconv = ceil(npop*0.1)
    // Use top 50% as core. They procreate to replace 30%, and 20% is new random
    // Progress data is on top 50%

    // Create a random population, each row a permutation
    population = rnormal(npop, width, 0, 1)
    sigmadist = J(npop, 1, .)

    // Main iteration
    iter = 0
    while (iter==0 | (max(sigmadist[1..ntoconv]) != min(sigmadist[1..ntoconv]))) {
        newpop = population
        iter++
        // Calculate fitness (lower is better)
        for (i=1; i<=npop; i++) {
            sigmadist[i] = fitness(distmat, newpop[i,.])
        }
        // Order the population per fitness (best, i.e. lowest, first)
        newpop = population[order(sigmadist,1),][1..nparents,]
        sigmadist = sigmadist[order(sigmadist,1)]
        "Iter "+strofreal(iter,"%-3.0f")+": Σ dist: min "+strofreal(min(sigmadist[1..ntoconv]))+"; mean: "+strofreal(mean(sigmadist[1..ntoconv]))+"; max: "+strofreal(max(sigmadist[1..ntoconv]))
        displayflush()
        // Keep top 50%
        population[1..nsurv,] = newpop[1..nsurv,]
        // Crossover section
        // Top 50% used for crossover to create 30%, remaining 20% new random
        for (i=1+nsurv; i<=npop-nnew; i++) {
            j = 1+floor((1-runiform(1,1)^0.1)*nparents)
            k = 1+floor((1-runiform(1,1)^0.1)*nparents)
            l = runiformint(1,1,1,width-1)
            population[i,1..l] = newpop[j,1..l]
            population[i,l+1..width] = newpop[k,l+1..width]
        }
        population[(npop-nnew+1)..npop,] = rnormal(nnew, width, 0, 1)
    }
    return(population)
}
end

mata
dd = st_matrix("dd")
pop = ga(dd, 2000)
end
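If the states already match at the start of each run, printing Mata's state inside ga()'s main loop would show the first iteration at which the two runs diverge. A minimal sketch (log_state is a hypothetical helper of mine; it assumes argument-less rseed() returns the current state as a string):

Code:
mata:
// Hypothetical helper: call at the top of each ga() iteration.
// The first iteration whose printed state differs between the two
// runs is where the random streams part company.
void log_state(real scalar iter)
{
    printf("Iter %f rseed: %s\n", iter, rseed())
}
end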