  • Strange results when using forvalues loops in Stata

    Hi,
    I am learning Stata programming and, as a practice exercise, I wrote a loop to create the squares of the first 20 numbers. I got some very strange results and would appreciate some light on why this is happening.
    Code:
    . forval y=1/20{
      2. gen square`y'=`y'^2
      3. di square`y'
      4. }
    1.654e-19
    4.398e+21
    9.225e-39
    7.648e-37
    2.598e-29
    2.755e-40
    2.5
    4.661e-32
    3.674e-39
    1.948e-20
    2.310e-41
    6.240e-39
    5.323e-18
    0
    1.766e-38
    1.795e-38
    1.771e-38
    5.177e-18
    0
    0
    I tried to trace the error to see whether Stata was reading fractions or something. Here are the results of the trace:
    Code:
     forval y=1/20{
    - gen square`y'=`y'^2
    = gen square1=1^2
    - di square`y'
    = di square1
    1.654e-19
    - }
    - gen square`y'=`y'^2
    = gen square2=2^2
    - di square`y'
    = di square2
    4.398e+21
    - }
    - gen square`y'=`y'^2
    = gen square3=3^2
    - di square`y'
    = di square3
    9.225e-39
    - }
    - gen square`y'=`y'^2
    = gen square4=4^2
    - di square`y'
    = di square4
    7.648e-37
    - }
    - gen square`y'=`y'^2
    = gen square5=5^2
    - di square`y'
    = di square5
    2.598e-29
    - }
    - gen square`y'=`y'^2
    = gen square6=6^2
    - di square`y'
    = di square6
    2.755e-40
    - }
    - gen square`y'=`y'^2
    = gen square7=7^2
    - di square`y'
    = di square7
    2.5
    - }
    - gen square`y'=`y'^2
    = gen square8=8^2
    - di square`y'
    = di square8
    4.661e-32
    - }
    - gen square`y'=`y'^2
    = gen square9=9^2
    - di square`y'
    = di square9
    3.674e-39
    - }
    - gen square`y'=`y'^2
    = gen square10=10^2
    - di square`y'
    = di square10
    1.948e-20
    - }
    - gen square`y'=`y'^2
    = gen square11=11^2
    - di square`y'
    = di square11
    2.310e-41
    - }
    - gen square`y'=`y'^2
    = gen square12=12^2
    - di square`y'
    = di square12
    6.240e-39
    - }
    - gen square`y'=`y'^2
    = gen square13=13^2
    - di square`y'
    = di square13
    5.323e-18
    - }
    - gen square`y'=`y'^2
    = gen square14=14^2
    - di square`y'
    = di square14
    0
    - }
    - gen square`y'=`y'^2
    = gen square15=15^2
    - di square`y'
    = di square15
    1.766e-38
    - }
    - gen square`y'=`y'^2
    = gen square16=16^2
    - di square`y'
    = di square16
    1.795e-38
    - }
    - gen square`y'=`y'^2
    = gen square17=17^2
    - di square`y'
    = di square17
    1.771e-38
    - }
    - gen square`y'=`y'^2
    = gen square18=18^2
    - di square`y'
    = di square18
    5.177e-18
    - }
    - gen square`y'=`y'^2
    = gen square19=19^2
    - di square`y'
    = di square19
    0
    - }
    - gen square`y'=`y'^2
    = gen square20=20^2
    - di square`y'
    = di square20
    0
    - }
    I would appreciate any input on why the program is giving these weird numbers.
    Thanks a lot.
    Shalom

  • #2
    I can't reproduce this -- or explain what is wrong with your code. I get this:


    Code:
    1
    4
    9
    16
    25
    36
    49
    64
    81
    100
    121
    144
    169
    196
    225
    256
    289
    324
    361
    400



    • #3
      If I start Stata fresh, I get 0s all the way down. If I -clear- and run it again, I get Nick's output. Not clear why these should be different, but if there are no observations, what does it mean to -generate- a variable?

      I find the following behaviour non-intuitive:

      Code:
      . sum square20
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
          square20 |          0
      
      . di square20
      400



      • #4
        A little more experimenting finds other inconsistent behaviour. If I -set obs 1-, then immediately -clear-, then run the loop, I get 20 missing values. After that, I have to restart Stata in order to get the 0s or the correct output, with no obs.



        • #5
          I can't exactly reproduce any of the above problems, per se. But I think a bug has been stumbled upon.

          In a completely fresh Stata 16 session, I get all zeros as output. Notice that the number of observations is exactly zero. If there are no observations, then no value should be stored and I would expect the contents of the variables to actually be system missing (.).

          Code:
          . forval y = 1/20 {
            2.   gen square`y' = `y'^2
            3.   di `y', `y'^2, square`y'
            4. }
          1 1 0
          2 4 0
          3 9 0
          4 16 0
          5 25 0
          6 36 0
          7 49 0
          8 64 0
          9 81 0
          10 100 0
          11 121 0
          12 144 0
          13 169 0
          14 196 0
          15 225 0
          16 256 0
          17 289 0
          18 324 0
          19 361 0
          20 400 0
          
          . di square5
          0
          
          . summ square5
          
              Variable |        Obs        Mean    Std. Dev.       Min        Max
          -------------+---------------------------------------------------------
               square5 |          0
          
          . list square5
          .
          Of course, if I set observations to one (-set obs 1-) then the program executes as Shalom had intended.

          Trying again in a fresh session, I run the same code once, explicitly set the number of observations, and then drop my variables. When I run the code again, system missing is displayed instead.

          Code:
          forval y = 1/20 {
            gen square`y' = `y'^2
            di `y', `y'^2, square`y'
          }
          di square5
          summ square5
          list square5
          
          set obs 4
          drop _all
          
          forval y = 1/20 {
            gen square`y' = `y'^2
            di `y', `y'^2, square`y'
          }
          di square5
          summ square5
          list square5
          Whose output in the second half is:
          Code:
          1 1 .
          2 4 .
          3 9 .
          4 16 .
          5 25 .
          6 36 .
          7 49 .
          8 64 .
          9 81 .
          10 100 .
          11 121 .
          12 144 .
          13 169 .
          14 196 .
          15 225 .
          16 256 .
          17 289 .
          18 324 .
          19 361 .
          20 400 .
          
          . di square5
          .
          
          . summ square5
          
              Variable |        Obs        Mean    Std. Dev.       Min        Max
          -------------+---------------------------------------------------------
               square5 |          0
          
          . list square5
          .
          But the behaviour is different once again. In a fresh session, I set the observations, run the same block of code, then clear my variables and run my code again, and Stata has tricked itself into storing (at least) one observation.

          Code:
          set obs 1
          forval y = 1/20 {
            gen square`y' = `y'^2
            di `y', `y'^2, square`y'
          }
          di square5
          summ square5
          list square5
          
          clear *
          
          forval y = 1/20 {
            gen square`y' = `y'^2
            di `y', `y'^2, square`y'
          }
          di square5
          summ square5
          list square5
          whose output is:

          Code:
          1 1 1
          2 4 4
          3 9 9
          4 16 16
          5 25 25
          6 36 36
          7 49 49
          8 64 64
          9 81 81
          10 100 100
          11 121 121
          12 144 144
          13 169 169
          14 196 196
          15 225 225
          16 256 256
          17 289 289
          18 324 324
          19 361 361
          20 400 400
          
          . di square5
          25
          
          . summ square5
          
              Variable |        Obs        Mean    Std. Dev.       Min        Max
          -------------+---------------------------------------------------------
               square5 |          0
          
          . list square5
          .
          Edit: crossed with #4.
          Last edited by Leonardo Guizzetti; 30 Nov 2020, 14:21.



          • #6
            Code:
            . * Stata/IC 14.2 for Windows (64-bit x86-64) Revision 29 Jan 2018
            . 
            . clear all
            
            . 
            . forval y=1/2 {
              2.     
            .         gen square`y'=`y'^2
              3.     di square`y' 
              4. }
            0
            0
            Code:
            . * Stata/SE 15.1 for Windows (64-bit x86-64) Revision 03 Feb 2020
            . 
            . clear all
            
            . 
            . forval y=1/2 {
              2.     
            .         gen square`y'=`y'^2
              3.     di square`y' 
              4. }
            0
            0



            • #7
              This is weird. It's funny how all of us have gotten different answers. I think the gist is that it depends on what's been happening in the global environment prior to the code.
              Thanks all.



              • #8
                This is both a feature and a potential bug.

                First, it is a feature of Stata that you can instantiate a variable without having to have any observations of values for that variable. There are programming contexts in which this is useful. (Think frame post where you create a new empty frame with empty variables to which you wish to post values.)
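
                As a rough sketch of the frame post use case mentioned above (assuming Stata 16 or later, where frames are available; the frame name results and the variable names y and square are made up for illustration), a frame can be created with empty variables and then filled one observation at a time:

                ```stata
                * Create a new, empty frame holding two empty variables
                * ("results", "y" and "square" are hypothetical names).
                frame create results y square

                * Each post appends one observation to the frame.
                forvalues y = 1/20 {
                    frame post results (`y') (`y'^2)
                }

                * List the posted values without leaving the current frame.
                frame results: list
                ```

                Here the variables legitimately exist before any observation does, and frame post supplies the observations afterwards.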

                Another feature of Stata is that if you type

                Code:
                . display varname
                Stata will display the value of observation 1 of varname.

                Shalom Patole wrote code which (I assume mistakenly) tried to generate variables containing values without first making sure to set obs to at least 1, meaning that that code couldn't actually store anything.

                Then comes the bug -- again, if you try to display varname for a variable with no observations, Stata still tries to display whatever is in memory for observation 1. That's a mistake, because there is no observation 1, and thus whatever is in that location in computer memory is returned/displayed. What really should be returned/displayed is a missing value.
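
                Until that changes, one defensive pattern (only a sketch, not an official recommendation) is to check _N before displaying a variable's first value. The variable name square5 is borrowed from the earlier posts:

                ```stata
                * Start from an empty dataset and create a variable with no observations.
                clear
                generate square5 = 5^2

                * Only display observation 1 if the dataset actually has one.
                if _N == 0 {
                    display "square5 has no observations"
                }
                else {
                    display square5[1]
                }
                ```

                Because the dataset is empty at that point, this should print the message rather than whatever happens to be in memory.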

                I believe this has existed in Stata forever. I am concerned that fixing it could have speed implications, so we'll have to look at it carefully.

                In any case, the solution is -- when you want to store values in a variable and you are creating that variable in an empty dataset, make sure to set obs first or there is nowhere for Stata to store values for that variable.
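
                Applied to the original loop, that fix might look like the following minimal sketch, which starts from an empty dataset and creates a single observation first:

                ```stata
                * Make sure there is at least one observation to hold the values.
                clear
                set obs 1

                forvalues y = 1/20 {
                    generate square`y' = `y'^2
                    display square`y'
                }
                ```

                With one observation in place, each display shows the value stored in observation 1, which matches what was reported in #5 after -set obs 1-.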

                By the way, Shalom Patole, one thing you want to try to do is always take advantage of Stata's inherent vectorization capabilities. That is, you don't need to write a loop to get the squares of the first 20 numbers. Instead, the best way to do this would be to

                Code:
                set obs 20
                gen squares = _n^2
                list squares
                Also, if you do want to loop for some reason to create separate single values of things, scalars (see help scalar) or local macros (see help local) might be a better choice than variables. Variables are best used for storing columns of values rather than single values.

                Code:
                forvalues y=1/20 {
                    scalar square`y' = `y'^2
                    display square`y'
                }
                
                forvalues y=1/20 {
                    local square`y' = `y'^2
                    display `square`y''
                }



                • #9
                  Alan Riley (StataCorp) Thanks a lot. I was just trying to teach myself Stata programming when I stumbled on this error. I appreciate the detailed explanation.
