  • tuples vs nested loops for order of macro list

    Hello Statalisters,

    I'm attempting to run some simulations that vary the number of stratification factors within a randomisation plan, and have reached a crossroads as to which direction to take the code. I won't bore you with all the code, but essentially I'm trying to find a better way than tuples (ssc install tuples) or nested loops to alternate the first variable in the local. By way of example, this is some skeleton code that runs with tuples:

    Code:
    clear all
    set obs 10
    cd "/Users/Desktop/"
    
    postfile buffer pvaltre pvalcon str50 lhs using "tuplesimtest.dta", replace
    
    forvalues i = 1/10{
        gen v`i' = rbeta(3,1)
    }
    gen binary = mod(_n,2)
    
    ds binary, not skip(1)
    local stratlist "`r(varlist)'"                                                         
    tuples `stratlist'
    forvalues i = 1/`ntuples'{
        reg `tuple`i'' binary, r
        matrix A = r(table) 
        post buffer (A[4,1]) (A[4,2]) ("`tuple`i''")
    }                    
    postclose buffer
    
    use tuplesimtest.dta, clear
    sort lhs
    If you run it you'll see that the first variable only changes when it needs to be dropped from the macro in order to form a new combination. Otherwise, the first item in the string lhs is v1. The next snippet is even less effective, but shows another line of my thinking.

    Code:
    clear all
    set obs 10
    cd "/Users/Desktop/"
    
    local strata = ""
    postfile buffer pvaltre pvalcon str50 lhs using "nestedsimtest.dta", replace
    
    forvalues i = 1/10{
        gen v`i' = rbeta(3,1)
    }
    gen binary = mod(_n,2)
    
    ds binary, not skip(1)
    local stratlist "`r(varlist)'"                                                         
    local strata
    forval stratnum = 1/9{
        local strat: word `stratnum' of `stratlist'
        local strata `"`strata' `strat'"'
        qui egen strata=group(`strata')
        foreach var of local strata{
            qui reg `var' binary, r
            matrix A = r(table)
            post buffer (A[4,1]) (A[4,2]) ("`strata'")
        }
        qui drop strata
    }
    postclose buffer
    
    use nestedsimtest.dta, clear
    I would like to arrive at a place where I can alternate the first item within the macro: the tuples in the first example, or strata in the second. Perhaps there's a way of reordering locals on the fly, much like order varname, after(othervar) does for variables.

    Many thanks in advance!




  • #2
    It is not clear to me what exactly you want to do here. I get that you start with the list v1, v2, ..., v10. What do you mean by

    alternate the first item within the macro
    In other words, what sequence are you looking for here?

    Best
    Daniel



    • #3
      Hi Daniel,

      Thanks for your comment. In the latter of the two examples above, the full macro is v1, v2, ..., v10, and the first item is always v1. However, I would like to be able to change the first item to each of the other items in the macro. So the macro would not always start with v1; instead, the code would loop through all the variable names and reorder the macro, producing a series of macros starting with v1, then a series starting with v2, and so on up to v10.

      This might look something like this:

      Code:
      v1
      v1,v2
      v1,v2,...,v10, and then
      
      v2
      v2,v3
      v2,v3,...,v10,v1, and then
      
      v3
      v3,v4
      v3,v4,...,v1,v2, etc., up to
      
      v10
      v10,v1
      v10,v1,...,v8,v9
      where the variable name that was previously the first item within the macro is put to the end once the macro has built up to include all the variables
      Last edited by Chris Larkin; 09 Oct 2016, 06:51.



      • #4
        If order matters so that you want (e.g.)

        v1,v2,...,v10 and v10,v1,...,v8,v9 then tuples isn't even a poor solution for you. It's not a solution at all.



        • #5
          So what is wrong with the nested loop?

          Code:
          // create list of variable names
          forvalues j = 1/10 {
              local all `all' v`j'
          }
          
          // nested loop
          local n : word count `all'
          forvalues j = 1/`n' {
              gettoken build all : all
              display "`build'"
              foreach next of local all {
                  local build `build' `next'
                  display "`build'"
              }
              gettoken first build : build
              local all `build' `first'
          }
          Best
          Daniel
          Last edited by daniel klein; 09 Oct 2016, 09:30.
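
          For anyone unfamiliar with gettoken, here is a minimal sketch of the two steps the loop above relies on: splitting the first token off a list, and moving it to the end. The list a b c and the macro name first are illustrative only; nothing here is specific to the simulation.

          ```stata
          // illustrative list; the names a, b and c are placeholders
          local all a b c

          // split off the first token: first receives "a", all keeps the remainder
          gettoken first all : all
          display "first = `first'"
          display "rest  = `all'"

          // append the removed token to the end, rotating the list by one position
          local all `all' `first'
          display "rotated = `all'"
          ```

          After the last line the list starts with b and ends with a, which is the rotation applied once per pass of the outer loop.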



          • #6
            Daniel, this is great! I've never used the gettoken command before, and there are some unfamiliar macro functions for me here too. I'm integrating your snippet into my larger body of code, though, and am starting to see how it works (obviously while reading the documentation too). Much appreciated!



            • #7
              Hello again,

              I've been trying to integrate your method into a larger body of code I have put together for running some simulations, but am running into some issues. I appreciate this is a relatively large bit of code for a Statalist post, so please feel free to ignore it if it's too bothersome to figure out. That said, if you have some thoughts I would be grateful to hear them.

              I'm trying to put together simulations to test the optimal number of factors for stratified randomisation. I create a series of observable variables, and corresponding unobservable variables of the same distributions, and then randomise stratifying on the observables in sequence (i.e. first just randomising on one observable, and then another, and then another, etc.). I then run a simple OLS regression of the binary treatment indicator on the strata, and save the p-value. In a following bit of code, I look for imbalance at alpha <= 0.05 and plot the probability of this for each observable and unobservable.

              I have previously, and successfully, run similar simulations that perform the same process but without changing the order of the stratification variables. So I stratify first on obs1, then on obs1 and obs2, then on obs1, obs2, and obs3, etc. However, because I'm always stratifying on the same variable first, all the successive stratifications tell me is the probability of imbalance in my unobservables when stratifying on obs2, for example, after already stratifying on obs1. I would like something more general, which lets me test just the number of stratification factors across a number of different variable types.

              My code thus far is below. You will notice that the bit in the middle is an unsuccessful attempt to integrate your suggestion, daniel klein. When running these properly, I will increase the number of sample sizes and run many more iterations than 1.

              Code:
              clear all
              
              ** Setting up a drive to export results
              global drive "/Users/chris.larkin/Documents/Not work"
              
              ** Setting up a log to view output
              capture log close _all
              log using "stratification_sims$S_DATE.txt", append text name("Second Attempt")
              
              
              local sample 100 200 300            //set up local for different sample sizes
              local strata = ""
              postfile buffer pvaltre pvalcon str10 lhs str30 strata size iter using "sim_k_test.dta", replace
              
              timer clear 1
              timer on 1                                                                                                      //setting a timer
              forval iter =1/1{                                                                                              //number of times the simulation is run
                  foreach size of local sample{
                   preserve
                      qui set obs `size'                                                                                   //set different sample sizes
                      qui gen id = _n                                                                                      //generate a unique id
                      local num 6                                                                                           //setting up local for number of obs and unobs
                      quietly forval i = 1/`num'{
                          if `i' == 1              local fcn round(runiform(),1)                                 //for a randomly-drawn binary var w expected baseline 0.50
                          else if `i' == 2           local fcn round(rbeta(3,0.9),1)                         //for a binary var w expected baseline 0.10
                          else if `i' == 3           local fcn round(rbeta(3,0.5),1)                         //for a binary var w expected baseline 0.05
                          else if `i' == 4         local fcn mod(_n,3)                                            //for a categorical var with 3 cats
                          else if `i' == 5           local fcn rnormal(), nq(5)                                 //for a normally-distributed var 
                          else if `i' == 6        local fcn rnormal(), nq(2)
                          if `i'<= 4                local cmd generate
                          else                    local cmd xtile
                          `cmd' obs`i' = `fcn'
                          `cmd' unobs`i' = `fcn'
                      }
              
                          qui ds obs1 obs2 obs3 obs4 obs5 obs6, skip(1)
                          local stratlist "`r(varlist)'"                                                                   //store stratification vars in a local
                          di as text "`stratlist'"
                          local n : word count `stratlist'
                          di "`n'"
                          forval stratnum = 1/`n'{
                          
                              gettoken build stratlist : stratlist
                              *display in red "`build'"
                              *display as text "`stratlist'"
              
                              //I think I might have to stratify and run the model here to do a stratification on just the first variable
                             
                              foreach next of local stratlist {
                                  local stratlist `stratlist' `next'
                                  display as text "`stratlist'"
                                  display in red "--------BREAK HERE-------"
                                  
                              }
                              
                               gettoken first build : build
                               local stratlist `build' `first'
                         }                                                                                                          //looping through stratification vars and adding a new one on each iteration
                              *di in red "the below randomisation is stratified on:" `"`strata'"'
                              
                              qui egen strata=group(`strata')                                                      //gen a variable that makes unique groups based on strat vars
                              set seed 31540                                                                              //setting a seed for replicability
                              qui gen randomnum = runiform()                                                   //gen a random number
                              qui bysort strata: egen order=rank(randomnum)                           //gen a rank order var based on the random number
                              qui bysort strata: gen treat = (order <= _N/2)                                //assigning condition based on rank
                              
                              qui ds unobs1 unobs2 unobs3 unobs4 unobs6 unobs7, skip(1)
                              local unobs "`r(varlist)'"
                              local variables `strata' `unobs'                                                       //putting observables and unobservables in one local
                              qui foreach var of local variables{                                                  //balance test
                                  reg `var' treat, r                                                                          //regressing covariate on treatment and testing for imbalance
                                  matrix A = r(table)                                                                      //convert regression results to matrix
                                  post buffer (A[4,1]) (A[4,2]) ("`var'") ("`strata'") (`size') (`iter')    //putting p-values and IV in buffer
                              }
                                  
                                  *des, s
                                  qui drop strata randomnum order treat
                              }
                          }
                   restore
                  }
                  di as text "Have finished iteration" `"`iter'"'
              
              postclose buffer
              timer off 1                                                                                                      //stopping timer
              timer list 1                                                                                                     //displaying time                                                                                                
              
                  //-----------------SIMULATIONS COMPLETE-------------------//
              
              qui use "sim_k_test.dta", clear

              As it stands at the moment, this loop is stuck in a continuous cycle and never exits. This I can overcome, but the output from the lines display as text "`stratlist'" and display in red "--------BREAK HERE-------" is perplexing. A sample of it is below. Note it does not say obs3 but bs3.

              Code:
               obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 ob
              > s4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs
              > 6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 
              > obs4 obs5 obs6
              --------BREAK HERE-------
              obs2 obs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs
              > 4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6
              >  obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 o
              > bs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 ob
              > s6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3
              >  obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 
              > obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 b
              > s3 obs4 obs5 obs6 obs2 bs3 obs4 obs5 obs6 obs2 bs3




              • #8
                For the moment, ignore everything else and concentrate on

                Code:
                forval stratnum = 1/`n'{
                    gettoken build stratlist : stratlist
                    foreach next of local stratlist {
                        local stratlist `stratlist' `next'
                        display as text "`stratlist'"
                        display in red "--------BREAK HERE-------"
                    }
                    gettoken first build : build
                    local stratlist `build' `first'
                }
                I suggest you re-read the documentation for gettoken and look at my example code very carefully. Pay special attention to where I use the local all (which corresponds to your stratlist) and where I use build. Try not just to correct your code (which you can probably do pretty quickly) but to understand why your code implements a closed loop.

                Best
                Daniel



                • #9
                  Daniel, do forgive my incredibly tardy response. Your comments here were very helpful; alas it took me some time to find the headspace to get back into this code. I managed to solve my problem, and have ended up repeating the randomisation and balance test part twice so I can first include just the first token of the macro and then all the others.

                  Many thanks for your advice, and for taking the time to read through my initial code!

                  For posterity, it now looks like this:

                  Code:
                  ** Setting up a log to view output
                  capture log close _all
                  log using "stratification_sims$S_DATE.txt", append text name("Second Attempt")
                  
                  local sample 100 300 500 700 900 1100 1300 1500 1700 1900 2100 2300 2500 2700 2900 3100                            //set up local for different sample sizes
                  
                  postfile buffer pvaltre pvalcon str10 lhs str30 strata size iter using "sim_$S_DATE.dta", replace                               //setting up a buffer file for exporting results
                  
                  timer clear 1
                  timer on 1                                                                                                                                                                     //setting a timer
                  forval iter =1/3000{                                                                                                                                                       //number of times the simulation is run
                      foreach size of local sample{
                       preserve
                          set obs `size'                                                                                                                                                        //set different sample sizes
                          gen id = _n                                                                                                                                                           //generate a unique id
                          local num 6                                                                                                                                                          //setting up local for number of obs and unobs
                          forval i = 1/`num'{
                              if `i' == 1              local fcn round(runiform(),1)                                                                                                //for a randomly-drawn binary var w expected baseline 0.50
                              else if `i' == 2           local fcn round(rbeta(3,0.9),1)                                                                                        //for a binary var w expected baseline 0.10
                              else if `i' == 3           local fcn round(rbeta(3,0.5),1)                                                                                        //for a binary var w expected baseline 0.05
                              else if `i' == 4         local fcn mod(_n,3)                                                                                                           //for a categorical var with 3 cats
                              else if `i' == 5           local fcn rnormal(), nq(5)                                                                                                //for a quintile split based on a normally-distributed var 
                              else if `i' == 6        local fcn rnormal(), nq(2)                                                                                                   //for a twoway split based on a normally-distributed var
                              if `i'<= 4                local cmd generate
                              else                    local cmd xtile
                              `cmd' obs`i' = `fcn'                                                                                                                                          //generating our observables and unobservables
                              `cmd' unobs`i' = `fcn'
                          }
                              ds obs1 obs2 obs3 obs4 obs5 obs6, skip(1)
                              local stratlist "`r(varlist)'"                                                                                                                                //store stratification vars in a local
                              local n : word count `stratlist'
                              
                              forval stratnum = 1/`n'{
                                  gettoken build stratlist : stratlist                                                                                                             //see help gettoken: this takes the first token of stratlist out of stratlist and puts it in build
                                  egen strata=group(`build')                                                                                                                     //gen a variable that makes unique groups based on strat vars
                                  set seed 31540                                                                                                                                     //setting a seed for replicability
                                  gen randomnum = runiform()                                                                                                                //gen a random number
                                  bysort strata: egen order=rank(randomnum)                                                                                        //gen a rank order var based on the random number
                                  bysort strata: gen treat = (order <= _N/2)                                                                                             //assigning condition based on rank
                                  
                                  ds unobs1 unobs2 unobs3 unobs4 unobs5 unobs6, skip(1)
                                  local unobs "`r(varlist)'"                                                                                                                     //store unobservables in a local
                                  local variables `build' `unobs'                                                                                                            //putting observables and unobservables in one local
                                  foreach var of local variables{                                                                                                           //balance test
                                      reg `var' treat, r                                                                                                                             //regressing covariate on treatment and testing for imbalance
                                      matrix A = r(table)                                                                                                                         //convert regression results to matrix
                                      post buffer (A[4,1]) (A[4,2]) ("`var'") ("`build'") (`size') (`iter')                                                         //putting p-values, sample size, stratification factor set, and independent variable in buffer
                                  }    
                                  drop strata randomnum order treat
                                  
                              //The way the code is structured, I can only ever stratify on one variable at a time in the above loop. The next loop, then, 
                              //permits us to build the stratification factor set up so we can stratify on more than 1 observable at a time. 
                              
                              foreach next of local stratlist {
                                  local build `build' `next'
                                  egen strata=group(`build')                                                                                                                 //gen a variable that makes unique groups based on strat vars
                                  set seed 31540                                                                                                                                 //setting a seed for replicability
                                  gen randomnum = runiform()                                                                                                            //gen a random number
                                  bysort strata: egen order=rank(randomnum)                                                                                    //gen a rank order var based on the random number
                                  bysort strata: gen treat = (order <= _N/2)                                                                                         //assigning condition based on rank
                                  
                                  qui ds unobs1 unobs2 unobs3 unobs4 unobs5 unobs6, skip(1)
                                  local unobs "`r(varlist)'"
                                  local variables `build' `unobs'                                                                                                           //putting observables and unobservables in one local
                                  foreach var of local variables{                                                                                                          //balance test
                                      reg `var' treat, r                                                                                                                            //regressing covariate on treatment and testing for imbalance
                                      matrix A = r(table)                                                                                                                        //convert regression results to matrix
                                      post buffer (A[4,1]) (A[4,2]) ("`var'") ("`build'") (`size') (`iter')                                                        //putting p-values, sample size, stratification factor set, and independent variable in buffer
                                  }
                                  drop strata randomnum order treat
                              }
                              gettoken first build : build
                              local stratlist `build' `first'
                          }                                                                                                                                                               //looping through stratification vars and adding a new one on each iteration
                       restore
                      }
                      di as text "Have finished iteration" `"`iter'"'
                  }
                  postclose buffer
                  timer off 1                                                                                                                                                       //stopping timer
                  timer list 1
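
                  A possible way to avoid writing the randomisation and balance test twice, sketched here under the assumption that the per-set work can sit in one place: start build empty and let the inner loop run over the whole list, rotating the list only after the inner loop has finished. The three-variable list and the display line are placeholders for the real variables and the real randomisation code.

                  ```stata
                  // placeholder list; the real code would use all six observables
                  local stratlist obs1 obs2 obs3
                  local n : word count `stratlist'

                  forvalues stratnum = 1/`n' {
                      local build                              // start each pass with an empty set
                      foreach next of local stratlist {
                          local build `build' `next'           // grow the set; stratlist is not modified here
                          display "stratify on: `build'"       // placeholder for egen group(), randomisation, balance test
                      }
                      gettoken first stratlist : stratlist     // remove the first name ...
                      local stratlist `stratlist' `first'      // ... and append it, so the next pass starts one name later
                  }
                  ```

                  With three names this displays nine sets: three beginning with obs1, then three beginning with obs2, then three beginning with obs3.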
