  • Generating uniformly distributed random numbers for a set of variables in a foreach loop, taking predefined min/max values

    I have a simple task: generating a set of uniformly distributed random numbers whose min/max values correspond to the min/max values of an existing variable. Following the advice on The Stata Blog, I drafted the primitive code below:
    Code:
    /* === Example - Random numbers === */
    
    // Data
    sysuse auto, clear
    
    // Loop
    foreach var of varlist price mpg headroom trunk weight length turn {
        generate rand_`var' = floor((max(`var')-min(`var')+1)*runiform() + min(`var'))
    }
    Unfortunately, the code returns an invalid syntax error. Naturally, I will be grateful for any help.
    Kind regards,
    Konrad
    Version: Stata/IC 13.1

  • #2
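    The max() and min() functions do something different from what you think they do: they compare their arguments within each observation. You cannot pass a single variable to them and get its overall maximum or minimum. To achieve what you want, you can use -summarize- (-sum- for short) and the returned scalars r(min) and r(max).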
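
    A minimal sketch of that difference, assuming the auto data from the first post; the variable name bigger is made up purely for illustration:

    ```stata
    sysuse auto, clear

    * max() compares its arguments within each observation (row-wise)
    generate bigger = max(mpg, trunk)

    * summarize leaves the extremes of the whole variable in r(min) and r(max)
    summarize mpg, meanonly
    display "min = " r(min) ", max = " r(max)
    ```

    The first command gives a new value per observation, while the second gives one pair of numbers for the whole variable, which is what the random-number formula needs.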


    • #3
      Aljar, thanks, will do. I see that I'm picking up bad habits from working more and more in R. It's a shame that it's not as straightforward as one would expect.
      Kind regards,
      Konrad
      Version: Stata/IC 13.1


      • #4
        Aljar is right on the main point: although max() and min() often work with variables, they are not what you want here. I'll treat any rounding to integers as secondary.

        Code:
         
        sysuse auto, clear
        
        foreach var of varlist price mpg headroom trunk weight length turn {
            su `var', meanonly 
            generate rand_`var' = r(min) + (r(max)- r(min)) * runiform() 
        }


        • #5
          Thank you both for the useful comments, I redrafted my syntax:
          Code:
          foreach var of varlist price mpg headroom trunk weight length turn {
              quietly summarize `var', meanonly
              generate rand_`var' = floor((`r(max)' - `r(min)' + 1)*runiform() + `r(min)')
          }
          Kind regards,
          Konrad
          Version: Stata/IC 13.1
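
          One way to check the integer version is to assert that every draw lies in the observed range. This is only a sketch: the seed value and the locals lo and hi are assumptions added for illustration, and the loop is restricted to two integer-valued variables.

          ```stata
          sysuse auto, clear
          set seed 12345          // arbitrary seed, only to make the draws repeatable

          foreach var of varlist price mpg {
              quietly summarize `var', meanonly
              local lo = r(min)
              local hi = r(max)
              generate rand_`var' = floor((`hi' - `lo' + 1)*runiform() + `lo')
              * every draw should be an integer between the observed min and max
              assert inrange(rand_`var', `lo', `hi')
              assert rand_`var' == floor(rand_`var')
          }
          ```

          The floor() formula assumes an integer minimum. For a variable with a fractional minimum (headroom looks like one in this data), floor() can land below the observed minimum, so the first assert may fail there; the continuous formula in #4 avoids that.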


          • #6
            max() and min() work the way you want in Mata.

            The key to understanding this is, broadly, that

            1. Commands are primary in Stata and functions are secondary. It's quirky of Stata that they are disjoint, but there you go.

            2. R is a programming language with very strong statistical support, while Stata is a statistical language with programmability.

            It's understandable, but also bizarre, that people expect other languages to work the way that their favourite or most recent language works, even though they know that's quite unreasonable. (I'm generalising wildly here, and not making a dig at Konrad.)
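
            On the first point, a minimal sketch of max() and min() in Mata, assuming the auto data used earlier in the thread; the name x is arbitrary:

            ```stata
            sysuse auto, clear

            mata:
            x = st_data(., "mpg")   // copy the variable into a Mata column vector
            min(x), max(x)          // smallest and largest element of the whole vector
            end
            ```

            Here the functions reduce over all elements of their argument, which is the behaviour the original code expected from Stata's max() and min().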


            • #7
              No bad habits, but if you want to use Stata in an R-ish way, you should probably look into Mata. Here is a sketch. No looping required at all.

              Code:
              cap pr drop rvars
              m : mata clear
              
              pr rvars
                  vers 12.1
                  
                  syntax varlist(num)
                  m : Rvars("`varlist'")
              end
              
              vers 12.1
              m :
              void Rvars(string rowvector vars)
              {
                  real matrix X, Max, Min, R
                  
                  vars = tokens(vars)
                  X = st_data(., vars)
                  Max = colmax(X)
                  Min = colmin(X)
                  
                  R = runiform(rows(X), cols(X))
                  R = floor((Max - Min :+ 1) :* R :+ Min)
                  
                  st_store(., st_addvar(c("type"), "rand_" :+ vars), R)
              }
              end
              Typing

              Code:
              rvars price mpg headroom trunk weight length turn
              yields the desired result.

              Best
              Daniel


              • #8
                Or, for the one-line fans

                Code:
                m : st_store(., st_addvar(c("type"), "rand_" ///
                :+ tokens("price mpg headroom trunk weight length turn")), ///
                floor((colmax(st_data(., ///
                tokens("price mpg headroom trunk weight length turn"))) ///
                - colmin(st_data(., ///
                tokens("price mpg headroom trunk weight length turn"))) :+ 1) ///
                :* runiform(rows(st_data(., ///
                tokens("price mpg headroom trunk weight length turn"))), ///
                cols(st_data(., ///
                tokens("price mpg headroom trunk weight length turn")))) ///
                :+ colmin(st_data(., ///
                tokens("price mpg headroom trunk weight length turn")))))
                Best
                Daniel


                • #9
                  I like the use of tokens(); it's not a commonly used syntax element, and it neatly emphasises the ingenuity of Stata.
                  Kind regards,
                  Konrad
                  Version: Stata/IC 13.1


                  • #10
                    Supposedly there was an entire program to do lineprinter maps in APL that was just one line long. The problem was (1) writing that one line (2) modifying it if you wanted something different.

                    If you haven't heard of lineprinters or APL, you should have been there.


                    • #11
                      For more on tokens, see http://www.statalist.org/forums/foru...t-manipulation


                      • #12
                        Maybe I'm missing something, but the tokens function is no longer required with st_data() or st_view(). But I like the function and vote for the Mata solution.
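
                        A small sketch of that point, assuming the auto data and a Stata version in which st_data() accepts a single space-separated string of names; the names A and B are arbitrary:

                        ```stata
                        sysuse auto, clear

                        mata:
                        A = st_data(., tokens("price mpg"))   // names as a string rowvector
                        B = st_data(., "price mpg")           // names as one space-separated string
                        A == B                                // 1 if both calls loaded the same data
                        end
                        ```

                        As far as I can tell, tokens() is still useful in the earlier code for building the new variable names passed to st_addvar(), so dropping it only simplifies the st_data() call.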
