Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Randomness in optimize

    Dear All,

    a few years ago (in 2011) a message was posted by Matthew J. Baker to the older StataList, which attracted zero replies:
    https://www.stata.com/statalist/arch.../msg00390.html

    The message was about reproducibility of optimization results.

    I believe my addition to the code of a couple of commands solves this, but wanted to double-check if there is anything else that needs to be re-set for complete reproducing of the same results?


    Without the modification the code results in bb \ b such as:

    Code:
                      1              2
        +-------------------------------+
      1 |   .5710687637   -.2172820568  |
      2 |    .571059227   -.2172820568  |
        +-------------------------------+
                      1              2
        +-------------------------------+
      1 |   .5710754395   -.2172825336  |
      2 |    .571064949   -.2172825336  |
        +-------------------------------+
                      1              2
        +-------------------------------+
      1 |   .5710706711   -.2172822952  |
      2 |   .5710678101   -.2172832489  |
        +-------------------------------+


    With my modifications to the code (shown below) I consistently get bb \ b as:

    Code:
                      1              2
        +-------------------------------+
      1 |   .5710601807   -.2172832489  |
      2 |   .5710601807   -.2172832489  |
        +-------------------------------+


    in every run.

    1) Was set sortseed the missing bit in the original puzzle?
    2) Where specifically does the sort happen in the Matthew's code, so that it is sensitive to the set sortseed?

    Thank you, Sergiy Radyakin


    Code:
    clear all
    set more off
    set seed 8675309
    mata
    
    // fictional data
    X=invnormal(runiform(1000,1)):>.5
    Y=(-.2:+.5:*X:+invnormal(runiform(1000,1))):>0
    X=X,J(1000,1,1)
    E=invnormal(runiform(1000,1000))
    
    // simulated log likelihood-freq. simulator
    
    void crit(todo,b,y,X,E,crit,g,H)
    {
        real matrix Us
        Us=X*b':+E
        crit=y:*ln(rowsum(Us:>0)/cols(Us)):+
            (1:-y):*ln(rowsum(Us:<=0)/cols(Us))
        crit=colsum(crit)
    }    
    
    stata("set sortseed 12346")
    S=optimize_init()
    optimize_init_evaluator(S,&crit())
    optimize_init_evaluatortype(S,"d0")
    optimize_init_params(S,J(1,cols(X),0))
    optimize_init_technique(S,"nm")
    optimize_init_nmsimplexdeltas(S,J(1,cols(X),1))
    optimize_init_argument(S,1,Y)
    optimize_init_argument(S,2,X)
    optimize_init_argument(S,3,E)
    b=optimize(S)
    V=optimize_result_V(S)
    
    stata("set sortseed 12346")
    T=optimize_init()
    optimize_init_evaluator(T,&crit())
    optimize_init_evaluatortype(T,"d0")
    optimize_init_params(T,J(1,cols(X),0))
    //optimize_init_params(T,b)
    optimize_init_technique(T,"nm")
    optimize_init_nmsimplexdeltas(T,J(1,cols(X),1))
    optimize_init_argument(T,1,Y)
    optimize_init_argument(T,2,X)
    optimize_init_argument(T,3,E)
    bb=optimize(T)
    
    bb \ b
    
    end
    // I was looking only at bb \ b reported above so quietly here
    quietly {
      getmata (X*)=X
      getmata Y
      probit Y X1
    }
    
    /* end example */

  • #2
    Sergiy Radyakin --

    Well, that is very interesting indeed! I had no idea that one could set different seeds for different purposes.

    Thanks!

    Matthew J. Baker

    Comment

    Working...
    X