  • Rounding errors

    Hi,

    I am getting rounding errors. All the variables are doubles. Any advice?

    This is the output of my commands:

    Code:
    . drop mx er
    . egen double mx=mean(x), by(firm)
    . egen double er=sd(mx), by(firm)
    . tab er
    
             er |      Freq.     Percent        Cum.
    ------------+-----------------------------------
       4.57e-19 |         10       10.00       10.00
       9.10e-19 |         11       11.00       21.00
       3.64e-18 |         11       11.00       32.00
       1.43e-17 |         16       16.00       48.00
       2.87e-17 |         15       15.00       63.00
       3.00e-17 |          7        7.00       70.00
       5.89e-17 |          9        9.00       79.00
       1.20e-16 |          7        7.00       86.00
       2.30e-16 |         14       14.00      100.00
    ------------+-----------------------------------
          Total |        100      100.00
    
    .

    This is the data for running the above commands:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long firm double(x mx er)
      8    .06598379746539487 -.00780740727273633  9.096966653664784e-19
      8  -.024296468006011936 -.00780740727273633  9.096966653664784e-19
      8  -.016538908357046193 -.00780740727273633  9.096966653664784e-19
      8  .0007838766547200373 -.00780740727273633  9.096966653664784e-19
      8   -.01186241715373642 -.00780740727273633  9.096966653664784e-19
      8   -.04399931143000155 -.00780740727273633  9.096966653664784e-19
      8    .05094462127333959 -.00780740727273633  9.096966653664784e-19
      8    -.0209074197248867 -.00780740727273633  9.096966653664784e-19
      8   -.05730159554088188 -.00780740727273633  9.096966653664784e-19
      8  -.024326386136656532 -.00780740727273633  9.096966653664784e-19
      8  -.004361269044332926 -.00780740727273633  9.096966653664784e-19
     34    .16978422920560407  .01900911081256544 3.6387866614659135e-18
     34   .008807978431772846  .01900911081256544 3.6387866614659135e-18
     34   -.07377585983258486  .01900911081256544 3.6387866614659135e-18
     34  -.003909315418006177  .01900911081256544 3.6387866614659135e-18
     34  .0009074709114859029  .01900911081256544 3.6387866614659135e-18
     34   .011015346560230664  .01900911081256544 3.6387866614659135e-18
     34     .0186334289209487  .01900911081256544 3.6387866614659135e-18
     34   .009111509198169774  .01900911081256544 3.6387866614659135e-18
     34   .027842131519310274  .01900911081256544 3.6387866614659135e-18
     34   .023477543664350838  .01900911081256544 3.6387866614659135e-18
     34    .01720575577693781  .01900911081256544 3.6387866614659135e-18
     41    .05074429289625524 -.05025646033331739 1.4332917616497526e-17
     41  -.045861345317417514 -.05025646033331739 1.4332917616497526e-17
     41  -.025056767861123246 -.05025646033331739 1.4332917616497526e-17
     41  -.029143476675561747 -.05025646033331739 1.4332917616497526e-17
     41    -.0656295971430266 -.05025646033331739 1.4332917616497526e-17
     41   -.08688247649568971 -.05025646033331739 1.4332917616497526e-17
     41   -.09010641692292877 -.05025646033331739 1.4332917616497526e-17
     41   -.07188997071972471 -.05025646033331739 1.4332917616497526e-17
     41   -.08476573704229834 -.05025646033331739 1.4332917616497526e-17
     41   -.09656890892155408 -.05025646033331739 1.4332917616497526e-17
     41   -.08220457558491083 -.05025646033331739 1.4332917616497526e-17
     41   -.06486638087287507 -.05025646033331739 1.4332917616497526e-17
     41   -.03892723701008416 -.05025646033331739 1.4332917616497526e-17
     41     -.115637922149492 -.05025646033331739 1.4332917616497526e-17
     41  -.030798132465798326 -.05025646033331739 1.4332917616497526e-17
     41    .07349128695315157 -.05025646033331739 1.4332917616497526e-17
     71     .1769707781479782  .34111923988929627  5.887846720064156e-17
     71     .5841515261906685  .34111923988929627  5.887846720064156e-17
     71     .5916601477300818  .34111923988929627  5.887846720064156e-17
     71    .47067465783098045  .34111923988929627  5.887846720064156e-17
     71     .7406450072677513  .34111923988929627  5.887846720064156e-17
     71     .4247141624439314  .34111923988929627  5.887846720064156e-17
     71   .052851697100199234  .34111923988929627  5.887846720064156e-17
     71   .032885132539892836  .34111923988929627  5.887846720064156e-17
     71  -.004479950247817374  .34111923988929627  5.887846720064156e-17
     93   -.03288070910971527 .002322223511140183   4.57139774554258e-19
     93  -.003000058999284944 .002322223511140183   4.57139774554258e-19
     93   -.00452518392621522 .002322223511140183   4.57139774554258e-19
     93 -.0016404269656765291 .002322223511140183   4.57139774554258e-19
     93  -.001713828954350029 .002322223511140183   4.57139774554258e-19
     93   .025577143045100652 .002322223511140183   4.57139774554258e-19
     93  -.006816318899040713 .002322223511140183   4.57139774554258e-19
     93   .006439043335823452 .002322223511140183   4.57139774554258e-19
     93    .01136071517308614 .002322223511140183   4.57139774554258e-19
     93   .030421860411674304 .002322223511140183   4.57139774554258e-19
    105   -2.0084331868342176  -.6894567643528704  1.199177923332954e-16
    105   -1.6169004320440667  -.6894567643528704  1.199177923332954e-16
    105   -1.5631009398228874  -.6894567643528704  1.199177923332954e-16
    105    -.4657189186955807  -.6894567643528704  1.199177923332954e-16
    105   .024736246715305225  -.6894567643528704  1.199177923332954e-16
    105     .0614557536888581  -.6894567643528704  1.199177923332954e-16
    105     .7417641265224962  -.6894567643528704  1.199177923332954e-16
    149     -.559829533959755 -.21158715250749163  2.997944808332385e-17
    149   .022628342080495946 -.21158715250749163  2.997944808332385e-17
    149   -1.1253340686205036 -.21158715250749163  2.997944808332385e-17
    149    .13473484851721745 -.21158715250749163  2.997944808332385e-17
    149     .2838785386033131 -.21158715250749163  2.997944808332385e-17
    149     .4612642731979791 -.21158715250749163  2.997944808332385e-17
    149    -.6984524673711883 -.21158715250749163  2.997944808332385e-17
    205    .29428284384604686  1.9611390633253616  2.304265763365491e-16
    205    .12076176451634184  1.9611390633253616  2.304265763365491e-16
    205     .3538132839527904  1.9611390633253616  2.304265763365491e-16
    205     .2843617452255948  1.9611390633253616  2.304265763365491e-16
    205    1.5729192625708237  1.9611390633253616  2.304265763365491e-16
    205     2.612625486484341  1.9611390633253616  2.304265763365491e-16
    205    3.7099652376541177  1.9611390633253616  2.304265763365491e-16
    205     3.230853979356402  1.9611390633253616  2.304265763365491e-16
    205    1.7015417346145227  1.9611390633253616  2.304265763365491e-16
    205    1.5290911132617329  1.9611390633253616  2.304265763365491e-16
    205    1.6798942538068127  1.9611390633253616  2.304265763365491e-16
    205    2.3003404712394406  1.9611390633253616  2.304265763365491e-16
    205     3.260536504880284  1.9611390633253616  2.304265763365491e-16
    205     4.804959205145811  1.9611390633253616  2.304265763365491e-16
    265    .09885976644861236 -.38668431543804016  5.661048867003676e-17
    265    -.5489907267229508 -.38668431543804016  5.661048867003676e-17
    265    .12703252940125062 -.38668431543804016  5.661048867003676e-17
    265     .2660667444447353 -.38668431543804016  5.661048867003676e-17
    265 -.0020228621510010327 -.38668431543804016  5.661048867003676e-17
    265   -.06698536451296297 -.38668431543804016  5.661048867003676e-17
    265    -.3578015134053096 -.38668431543804016  5.661048867003676e-17
    265    -.5034994407075265 -.38668431543804016  5.661048867003676e-17
    265    -.5112500462215995 -.38668431543804016  5.661048867003676e-17
    265   -.44699880871118985 -.38668431543804016  5.661048867003676e-17
    265  -.004900356455361212 -.38668431543804016  5.661048867003676e-17
    265    .20488041269165932 -.38668431543804016  5.661048867003676e-17
    265    .16386868606454658 -.38668431543804016  5.661048867003676e-17
    265    .03426959033093182 -.38668431543804016  5.661048867003676e-17
    265   -.17472596319849246 -.38668431543804016  5.661048867003676e-17
    end
    label values firm firm
    label def firm 8 "00081T108", modify
    label def firm 34 "001084102", modify
    label def firm 41 "001547108", modify
    label def firm 71 "00207R101", modify
    label def firm 93 "002567105", modify
    label def firm 105 "003654100", modify
    label def firm 149 "00484M106", modify
    label def firm 205 "00738A106", modify
    label def firm 265 "007903107", modify

  • #2
    Well, the "rounding errors" you are getting arise only from the way -tab- displays values, and, unfortunately, you have no control over that. So you have to, in effect, write your own little tab routine. Fortunately, that is easy. You also have to pay attention to the display format of the variables:

    Code:
    format mx er %12.6e
    
    contract er, freq(freq) percent(percent) cfreq(cum)
    order cum, last
    list, noobs clean


    • #3
      Hi Clyde,
      Thanks for responding.
      No, in fact, my -tab- was only meant to show the list the SDs of the group means. I am not interested in the "er" variable at all.
      If you pass the "mx" variable to the constant() option of -xthtaylor-, it stops with an error like "mx is not constant in panels" or something like that.
      My point is: if I take the mean for each firm across years, the SD of that mean within each firm should be exactly zero (not 5e-17).
      Best regards,
      Iuri.


      • #4
        Oh, I see. I completely misunderstood your point.

        Yes, you are getting rounding errors. I've just, for the first time, looked at the code for -egen, sd()-. It calculates the standard deviations by first computing the mean in each group, then summing the squared deviations from the mean and dividing by N-1. That method is known to be vulnerable to rounding errors, and I'm sure that's what is happening here.

        The built-in standard deviation calculation in -summarize- does not seem to make this same mistake. (Although, checking the manual's methods and formulas for -summarize- it claims to rely on the same algorithm!)

        So, I'm not altogether clear on what's going on here. Suffice it to say, if you need a more precise standard deviation calculation, you should rely on summarize instead of -egen, sd()-. That can be done as follows:

        Code:
        drop er
        
        levelsof firm, local(firms)
        gen double er = .
        foreach f of local firms {
            summ mx if firm == `f'
            replace er = r(sd) if firm == `f'
        }
        assert er == 0

        Note: If your data set is very large, this code will be very slow to run. There is a faster way using -runby- (by Robert Picard and me, available from SSC).
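
        A minimal sketch of the mechanism described above, not the actual -egen, sd()- source: it assumes the mean is formed as a rounded running sum divided by N, and it borrows firm 8's mx value and group size from #1. The variable names c, runsum, m and dev are made up for illustration. If the recomputed mean m differs from c in the last binary digit, every deviation is nonzero and so is the SD, even though the variable is constant.

        ```stata
        * Sketch only: a constant variable whose recomputed mean need not equal the constant.
        clear
        set obs 11
        gen double c = -.00780740727273633   // firm 8's mx from #1 (assumed input)
        gen double runsum = sum(c)           // running sum, rounded at every step
        gen double m = runsum[_N] / _N       // mean as (rounded sum) / N
        gen double dev = c - m               // zero only if the mean is reproduced exactly
        format c m %21.0g
        list c m dev in 1
        ```

        The er values in #1 fit this picture: for firm 8, 9.1e-19 * sqrt(10/11) is about 8.7e-19, or 2^-60, which is one unit in the last place for a double of that magnitude.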

        All of that said, are you working in a context where errors of the order of magnitude 10^-16 really matter?




        • #5
          Hi Clyde,

          I don't need the SD at all. I need the means. What I have done to get rid of the rounding errors (in the mean -- again, I don't need the SD), was to copy the first mean to the remaining values:
          Code:
          foreach v of varlist $ivars $cvars {
              egen double m_`v' = mean(`v'), by(firm)
              tempvar fmean
              by firm: g `fmean' = m_`v'[1]
              replace m_`v' = `fmean' // deals with rounding errors
              }
          Best regards,

          Iuri.


          • #6
            Originally posted by Clyde Schechter
            Oh, I see. I completely misunderstood your point.

            All of that said, are you working in a context where errors of the order of magnitude 10^-16 really matter?

            I am following the code in https://www.statalist.org/forums/for...riant-variable (#9), and -xthtaylor, constant(m_var)- complains that my m_var is not constant. I am not that picky, but -xthtaylor- is! :-)

            Best,

            Iuri.
            Last edited by Iuri Gavronski; 02 Mar 2019, 18:04.


            • #7
              This is just beyond strange. When I run
              Code:
              * Load the -dataex- example data exactly as posted in #1 (from -clear- through the -label def- lines), then:
              
              keep firm x
              
              egen double mx = mean(x), by(firm)
              by firm (mx), sort: assert mx[1] == mx[_N]
              egen double er = sd(mx), by(firm)

              the -assert- command passes without error. In other words, all of the values of mx within firm are already equal. Replacing them all by the value of mx[1] will not change anything at all. And inspecting the code for -egen, mean()-, this is exactly what I would expect, because it works by calculating a running sum within firm, dividing the last observation by the number of non-missing observations, and then copying that result into all values for the newly created mean variable. So the code should produce identical values within firm, and my test above confirms that, indeed, it does.

              Also, if I go back and add a line of code replacing mx by mx[1] (-by firm-) and then re-calculate er, the standard deviations still come out as non-zero (as I would expect).

              Moreover, if I run
              Code:
              keep firm x
              
              egen double mx = mean(x), by(firm)
              tempfile holding
              save `holding'
              by firm, sort: replace mx = mx[1]
              cf _all using `holding'

              the -cf- command finds that nothing in the file has changed. (And the -replace- command reports 0 real changes made, too.) So I can't understand how replacing mx by mx[1] changes -xthtaylor-'s behavior with this data.

              Nevertheless, it remains true that the values of er are not coming out 0: they have these small values that you have picked up. So, my conclusions in #4 are still correct: the problem arises in the calculation of standard deviations by a less-robust algorithm in -egen, sd()-, a problem which is not arising when standard deviations are calculated by -summarize-.

              The code for -xthtaylor- is long and complicated and I don't follow it (especially since I don't even know what it's supposed to do.) I searched the code for a place that throws an error message about something not being constant, but found nothing--perhaps this is being picked up in some program that -xthtaylor- itself calls. So I don't know in what way -xthtaylor- is trying to verify that something is constant; it would be surprising to learn if that were done by testing for a zero standard deviation, but not impossible.
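
              To make the two possibilities concrete, here is a small sketch contrasting an exact within-panel comparison with an SD-based one. The toy data and the names mx_lo, mx_hi and mx_sd are invented for illustration, and nothing here is taken from the -xthtaylor- source; which kind of test it applies is not established in this thread.

              ```stata
              * Sketch only: two ways a command could decide "constant within panel".
              clear
              set obs 7
              gen long firm = cond(_n <= 3, 1, 2)    // hypothetical panel identifier
              gen double x = _n / 10                 // hypothetical time-varying variable
              egen double mx = mean(x), by(firm)

              egen double mx_lo = min(mx), by(firm)
              egen double mx_hi = max(mx), by(firm)
              egen double mx_sd = sd(mx), by(firm)

              count if mx_lo != mx_hi                // exact test: compares stored values only
              count if mx_sd != 0                    // SD-based test: depends on a recomputed mean
              ```

              The first count should be 0 whenever the stored values are truly identical within firm. The second can be positive anyway if the SD calculation picks up rounding in its internal mean, as happened with er in #1.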

              Anyway, I'm unable to see why your fix works.


              • #8
                Me neither... it just worked. I would be glad to send you the file and the code for running it. The only explanation I can imagine is the difference between double and float precision. Notice that my `fmean' is a float, and the small differences in the double variable went away only when I used an intermediate float variable.
                I normally don't use float because of its well-known precision problems (for example, https://www.cpc.unc.edu/research/too...misc/precision), but this time it seems double precision is what tripped me up...
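
                A quick way to probe the float explanation, as a sketch under assumptions: it reuses firm 8's mx value and group size from #1, and the names c, cf, sd_c and sd_cf are made up. float() rounds a value to float precision; such a value has only 24 significant bits, so adding up a handful of copies in double arithmetic is exact and the recomputed mean should reproduce the value.

                ```stata
                * Sketch only: SD of a constant at full double precision vs. rounded to float.
                clear
                set obs 11
                gen double c  = -.00780740727273633   // firm 8's mx from #1 (assumed input)
                gen double cf = float(c)              // same value rounded to float, stored as double
                egen double sd_c  = sd(c)
                egen double sd_cf = sd(cf)
                format sd_c sd_cf %12.6e
                list sd_c sd_cf in 1
                count if c != cf                      // observations altered by the float round trip
                ```

                Under these assumptions sd_cf should come out as exactly 0, while sd_c may not. That would also fit the observation that going through a float tempvar changes mx, whereas replacing mx by mx[1] directly (as in #7) changes nothing.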
