I'm writing a program which uses Mata to consider large numbers of random permutations of a matrix, to search for an optimal arrangement. In testing I'm using "set seed" and "mata: rseed()" to get consistent results. From one Stata session to another this works, but if I run the program twice in the same session (with a "clear all" between runs) the random numbers eventually (but not immediately) become inconsistent.
Should I need more than setting the seed plus "clear all" to restore the state of the Stata session to the beginning, at least in terms of PRNGs?
(Stata 14.2, on two separate machines).
Example results (by iteration 3, at least one of the 2000 random permutations differs):
Code to run the main code twice:
Main code (test.do):
Should I need more than setting the seed plus "clear all" to restore the state of the Stata session to the beginning, at least in terms of PRNGs?
(Stata 14.2, on two separate machines).
Example results (by iteration 3, at least one of the 2000 random permutations differs):
Code:
: pop = ga(dd, 2000) Iter 1 : Σ dist: min 977.377; mean: 1067.934; max: 1090.905 Iter 2 : Σ dist: min 950.5679; mean: 1039.75; max: 1064.551 Iter 3 : Σ dist: min 950.5679; mean: 1019.234; max: 1039.178 [...] : pop = ga(dd, 2000) Iter 1 : Σ dist: min 977.377; mean: 1067.934; max: 1090.905 Iter 2 : Σ dist: min 950.5679; mean: 1039.75; max: 1064.551 Iter 3 : Σ dist: min 950.5679; mean: 1019.247; max: 1039.178 [...]
Code:
do test.do clear all clear mata do test.do
Code:
// Clear all and set seeds
clear all
set seed 123456
mata: rseed(123456)
// Generate random data and make a dissimilarity matrix
set obs 50
forvalues x = 1/10 {
gen x`x'=rnormal()
}
matrix dissim dd = x1-x10, L2squared
mata
real scalar fitness (real matrix distmat, real matrix permdat) {
width = cols(distmat)
perm = transposeonly(order(transposeonly(permdat),1))
tempdm = distmat[perm,perm]
// Sum minor diagonal (offset 0 gives main diagonal)
offset = 1
f = sum(diagonal(tempdm[1..(width-offset),(1+offset)..width]))
return(f)
}
// Genetic algorithm function: look for a matrix permutation that minimises the fitness function
real matrix function ga (real matrix distmat, real scalar npop) {
width = rows(distmat)
npop = 2*floor(npop/2) // even
nparents = floor(npop/2) // change from 50% to intensify selection
nsurv = nparents
nnew = floor(npop*0.20)
ntoconv = ceil(npop*0.1)
// Use top 50% as core. They procreate to replace 30%, and 20% is new random
// Progress data is on top 50%
// Create a random population, each row a permutation
population = rnormal(npop,width,0,1)
sigmadist = J(npop,1,.)
// Main iteration
iter = 0
while (iter==0 |(max(sigmadist[1..ntoconv]) != min(sigmadist[1..ntoconv]))) {
newpop = population
iter++
// Calculate fitness (lower is better)
for (i=1; i<=npop; i++) {
sigmadist[i] = fitness(distmat, newpop[i,.])
}
// Order the new medoid sets per fitness (best (lowest) first)
newpop = population[order(sigmadist,1),][1..nparents,]
sigmadist = sigmadist[order(sigmadist,1)]
"Iter "+strofreal(iter,"%-3.0f")+": Σ dist: min "+strofreal(min(sigmadist[1..ntoconv]))+"; mean: "+strofreal(mean(sigmadist[1..ntoconv]))+"; max: "+strofreal(max(sigmadist[1..ntoconv]))
displayflush()
// Keep top 50%
population[1..nsurv,] = newpop[1..nsurv,]
// Crossover section
// Top 50% used for crossover to create 30%, remaining 20% new
for (i=1+nsurv; i<=npop-nnew; i++) {
j = 1+floor((1-runiform(1,1)^0.1)*nparents)
k = 1+floor((1-runiform(1,1)^0.1)*nparents)
l = runiformint(1,1,1,width-1)
population[i,1..l] = newpop[j,1..l]
population[i,l+1..width] = newpop[k,l+1..width]
}
population[(npop-nnew+1)..npop,] = rnormal(nnew, width, 0, 1)
}
return(population)
}
end
mata
dd = st_matrix("dd")
pop = ga(dd, 2000)
end

Comment