
  • Computing exp(x)-1 accurately

    Stata and Mata have no expm1(x) function, that is, a function that computes exp(x)-1 accurately.
    Computing exp(x)-1 directly suffers from cancellation errors for small values of x and is inaccurate in that case (see in particular this paper https://cran.r-project.org/web/packa...1mexp-note.pdf).
    Such a function is particularly useful for computing 1-exp(-x) accurately, which has the same form as the invcloglog function (1-exp(-exp(x))). We can compute 1-exp(-x) = -expm1(-x).
    It also seems that the invcloglog() implementation in Stata and Mata does not guard against this: the Mata code for invcloglog(x) basically returns 1-exp(-exp(x)) for all values; only for non-missing values of x>500 is the result set equal to 1.

    Note that the function expm1() is part of the C and C++ languages, as is the function log1p().

    I have tried to implement the function myself in Mata, but I am in doubt whether I am doing it right. Basically, my approach is to use a Taylor expansion of order two for values of |x| less than or equal to 1e-5. I am not sure whether computing more terms would actually help.

    Code:
    real matrix expm1(real matrix x)
    {
        real matrix index

        // 1 where the Taylor approximation is used, 0 where exp(x)-1 is used directly
        index = abs(x) :<= 1e-5

        // second-order Taylor expansion x + x^2/2 for small |x|, exp(x)-1 otherwise
        return(!index:*(exp(x):-1) :+ index:*(x + 0.5*x:^2))
    }
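
    For example, with such a function the quantity 1-exp(-x) could be obtained as -expm1(-x). A minimal sketch, assuming the expm1() prototype above has been defined in the current Mata session:

    Code:
    mata:
    // 1 - exp(-x) computed as -expm1(-x); accurate even for tiny x
    x = 1e-12
    -expm1(-x), 1 - exp(-x)    // the naive second form loses digits to cancellation
    end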

    I have read elsewhere that one can use 2*exp(x/2)*sinh(x/2) for values of |x| between epsilon()/2 and log(2) (where epsilon() is the machine precision), and that for |x| less than epsilon()/2 one can simply return expm1(x) = x.
    But I don't know which approach is right, if any.
    I am no expert in this matter and would very much like to benefit from your insights on this issue.
    Does someone know which approximation I should use (as well as the thresholds), or could you give some advice or input on how to compute this function?
    Best
    Christophe
    Last edited by Christophe Kolodziejczyk; 27 Feb 2017, 06:57.

  • #2
    One question: what are you doing with the output afterwards? If you are computing it several times for a different purpose (e.g., to take an average), there might be alternative tricks, such as scaling x before computing exp().



    • #3
      Thanks Sergio for your reply. It is used in the computation of a likelihood function and its gradient. The likelihood function has a panel-data structure, so I have to sum over the likelihood contributions for each individual.



      • #4
        You are probably aware of this (and it's not exactly for computing 1-exp()), but in any case it might be useful:

        http://www.xarg.org/2016/06/the-log-...hine-learning/


        In a project I had to compute something very similar, "logl = log(mean(exp(logl_conditional)))", and realized that of course this gives massive overflow and accuracy issues.

        Thus, a good trick is to do:

        log(sum[exp(x_i)]) = a + log(sum[exp(x_i - a)])

        where a = max(x_i), so the largest term is exp(a - a) = exp(0) = 1.
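
        For reference, a minimal Mata sketch of this trick; logsumexp() is just an illustrative name, not a built-in function:

        Code:
        real scalar logsumexp(real colvector x)
        {
                real scalar a

                // shift by the maximum so the largest exponentiated term is exp(0) = 1
                a = max(x)
                return(a + log(sum(exp(x :- a))))
        }

        With this, log(mean(exp(x))) can be computed as logsumexp(x) - log(rows(x)).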



        • #5
          Thanks Sergio. Actually I didn't know this trick, and I think it might be useful for my problem, so I am glad and thankful that you mentioned it. I have a mixture model where each contribution to the likelihood function has the following form

          \[ \log(l) = \log (\sum_k \pi_{k} f_{k}) = \log \sum_k \exp(\log f_{k} + \log \pi_{k}) \]

          So I guess I have to use this trick here.
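
          For instance, with a column vector logf of log-densities log f_k and a vector p of mixing probabilities, this contribution could be computed on the log scale using the illustrative logsumexp() sketched in #4 (the numbers below are made up):

          Code:
          mata:
          logf = (-350.2 \ -348.7 \ -352.1)      // log f_k for one observation
          p    = (0.3 \ 0.5 \ 0.2)               // mixing probabilities pi_k
          logl_i = logsumexp(logf :+ log(p))     // log of the mixture contribution
          logl_i
          end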
          Last edited by Christophe Kolodziejczyk; 03 Mar 2017, 05:51.



          • #6
            After some thought about the "log sum of exp" trick, I think one could run into problems if the sum is close to 1. By applying this trick we in effect compute log(1+something); if that something is close to zero, there might be numerical precision issues, since computing log(1+x) directly for x close to zero can be inaccurate. In this case a function like log1p() would come in handy.
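
            A well-known workaround, in the absence of a built-in log1p(), is Kahan's trick, which corrects for the rounding error committed when forming 1+x. A minimal Mata sketch (log1pKahan is just an illustrative name):

            Code:
            real scalar log1pKahan(real scalar x)
            {
                    real scalar u

                    if (x == .) return(.)
                    u = 1 + x
                    if (u == 1) return(x)            // x is below machine precision
                    return(log(u)*x/(u - 1))         // corrects the rounding in 1 + x
            }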



            • #7
              Here are some further thoughts about this problem. The challenge is to find a good approximation of exp(x)-1 for small values of x.
              One prototype for an expm1() function could be the following:

              Code:
              real matrix expm1(real matrix x)
              {
                      real matrix y, out
                      real matrix index, machine

                      y = abs(x)
                      // |x| at or below machine precision: x is exp(x)-1 to working accuracy
                      machine = y :<= epsilon(1)
                      // intermediate range in which 2*exp(x/2)*sinh(x/2) is used
                      index = x :>= -log(2) :& x :<= 0.405 :& y :> epsilon(1)

                      // editmissing() is needed because sinh(x/2) overflows to missing
                      // for x < -2*709, and multiplying missing by zero is still missing
                      out = (!index :& !machine):*(exp(x):-1) +
                            index:*editmissing(2*exp(x/2):*sinh(x/2), 0) + machine:*x

                      return(out)
              }
              exp(x)-1 is therefore approximated by 2*exp(x/2)*sinh(x/2) in the range -log(2) to 0.405,
              and the function returns x when abs(x) is at or below machine precision.
              The problem with this approach is that sinh() overflows to missing for arguments smaller than
              about -709, whereas exp() does not (it simply underflows to 0). So for values of x < -2*709,
              sinh(x/2) evaluates to missing, and the function would return missing if we did not take care
              of the problem, because multiplying missing by zero still gives missing; that is why I use
              editmissing(). The dilemma is that for those values we want to return exp(x)-1.
              I have to say it is inelegant and probably inefficient. Unless there is a smarter way
              to do it, it tends to show one limitation of vectorization (or of my own skills :-)).

              According to this paper http://www.math.utah.edu/~beebe/reports/expm1.pdf,
              in this range we can approximate exp(x)-1 by the Taylor series
              x + x^2/2 + x^3/6 + ..., adding terms until machine precision is reached.
              This approximation gives results very close (virtually identical)
              to the expm1() available from the GNU C compiler, so I think it is a good
              benchmark for investigating the precision of my implementation of expm1().

              Code:
              real scalar expm1Taylor(real scalar x)
              {
                      real scalar i, res, x_p

                      if (x == .) return(.)
                      // |x| below machine precision: x is exp(x)-1 to working accuracy
                      else if (abs(x) < epsilon(1)) return(x)
                      // outside [-log(2), 0.405] the direct formula is accurate enough
                      else if (x < -log(2) | x > .405) return(exp(x)-1)
                      else {
                              // sum the series x + x^2/2! + x^3/3! + ...
                              // until adding a term no longer changes the result
                              i   = 2
                              res = x
                              x_p = x^2/2
                              while (res + x_p != res)
                              {
                                      res = res + x_p
                                      ++i
                                      x_p = x^i/factorial(i)
                              }
                              return(res)
                      }
              }


              I give below a table with computations of exp(x)-1 in the various ways proposed in this
              post: the direct exp(x)-1, my expm1(x), and the Taylor-series approximation up to machine
              precision. The range includes machine precision, -log(2) = -0.6931472, 0.405, and the
              values 1e-1, ..., 1e-10 for both positive and negative arguments, as well as -1, 0, and 1.

              Code:
              mata:
              
              x1 = 10:^(-(10..1))
              x2 = 10:^(-(1..10))
              x = -(epsilon(1),x1,log(2),1),0,1,0.405,x2,epsilon(1)
              x = x'
              data = exp(x):-1,expm1(x),expm1V2(x)
              
              // mm_matlist() is from Ben Jann's moremata package (ssc install moremata)
              mm_matlist(data,"%20.19f",3,strofreal(x), ("exp(x)-1","expm1(x)","Taylor"))
              end
              By eyeballing the table it seems expm1() gives a very good approximation.

              Code:
              ---------------------------------------------------------------------------------------
                          |                exp(x)-1                 expm1(x)                   Taylor
              ------------+--------------------------------------------------------------------------
                -2.22e-16 |  -0.0000000000000002220   -0.0000000000000002220   -0.0000000000000002220
                -1.00e-10 |  -0.0000000001000000083   -0.0000000001000000000   -0.0000000001000000000
                -1.00e-09 |  -0.0000000009999999717   -0.0000000009999999995   -0.0000000009999999995
                -1.00e-08 |  -0.0000000099999999392   -0.0000000099999999500   -0.0000000099999999500
                -1.00e-07 |  -0.0000000999999949514   -0.0000000999999950000   -0.0000000999999950000
                -1.00e-06 |  -0.0000009999994999843   -0.0000009999995000002   -0.0000009999995000002
                  -.00001 |  -0.0000099999500001724   -0.0000099999500001667   -0.0000099999500001667
                   -.0001 |  -0.0000999950001666639   -0.0000999950001666625   -0.0000999950001666625
                    -.001 |  -0.0009995001666249781   -0.0009995001666250087   -0.0009995001666250082
                     -.01 |  -0.0099501662508318933   -0.0099501662508319488   -0.0099501662508319488
                      -.1 |  -0.0951625819640404820   -0.0951625819640404270   -0.0951625819640404270
                -.6931472 |  -0.5000000000000000000   -0.5000000000000000000   -0.5000000000000000000
                       -1 |  -0.6321205588285576700   -0.6321205588285576700   -0.6321205588285576700
                        0 |     0.0000000000000e+00      0.0000000000000e+00      0.0000000000000e+00
                        1 |     1.7182818284590e+00      1.7182818284590e+00      1.7182818284590e+00
                     .405 |   0.4993025000567670200    0.4993025000567669100    0.4993025000567669100
                       .1 |   0.1051709180756477100    0.1051709180756476300    0.1051709180756476300
                      .01 |   0.0100501670841679490    0.0100501670841680580    0.0100501670841680600
                     .001 |   0.0010005001667083846    0.0010005001667083419    0.0010005001667083419
                    .0001 |   0.0001000050001667141    0.0001000050001666709    0.0001000050001666709
                   .00001 |   0.0000100000500000696    0.0000100000500001667    0.0000100000500001667
                 1.00e-06 |   0.0000010000004999622    0.0000010000005000002    0.0000010000005000002
                 1.00e-07 |   0.0000001000000049434    0.0000001000000050000    0.0000001000000050000
                 1.00e-08 |   0.0000000099999999392    0.0000000100000000500    0.0000000100000000500
                 1.00e-09 |   0.0000000010000000827    0.0000000010000000005    0.0000000010000000005
                 1.00e-10 |   0.0000000001000000083    0.0000000001000000000    0.0000000001000000000
                 2.22e-16 |   0.0000000000000002220    0.0000000000000002220    0.0000000000000002220
              ---------------------------------------------------------------------------------------
              I hope this will be useful. Feel free to comment.
              Some similar thoughts or considerations should be given to the computation of log(1+x).

              note:
              The function expm1V2 is

              Code:
              real matrix expm1V2(real matrix x)
              {
                      real scalar i, j, N, M
                      real matrix out

                      N = rows(x); M = cols(x)
                      out = J(N, M, .)

                      // elementwise wrapper: apply expm1Taylor() to each entry of x
                      for (i = 1; i <= N; i++)
                      {
                              for (j = 1; j <= M; j++)
                              {
                                      out[i,j] = expm1Taylor(x[i,j])
                              }
                      }

                      return(out)
              }





              • #8
                This interested but non-expert bystander asks: are some of the references cited in the thread at http://andrewgelman.com/2016/06/11/l...-exponentials/ useful for thinking about the issues raised here?



                • #9
                  Stephen Jenkins Thanks for the link. It is not directly useful for computing exp(x)-1, since that does not involve computing the log of a sum of exponentials. It is useful, though, for another aspect of my problem, since I am trying to estimate a mixture model, which involves some computations on the log scale (see posts #5 and #6).
                  Actually, I am trying to estimate a duration model in discrete time, and I need expm1() in order to compute log(1-exp(x)).

                  The functions expm1() and log1p() are available in other languages (R, Matlab, Julia, Python, C), so I don't understand why they are not available in Stata/Mata. Maybe they will be available in future versions of Stata (15?).
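
                  For what it's worth, the note linked in #1 computes log(1-exp(x)) for x < 0 by switching between two forms depending on x. A Mata sketch, assuming an expm1() and a log1p()-style function are available (for example, prototypes like those in #6 and #7):

                  Code:
                  real scalar log1mexp(real scalar x)
                  {
                          // valid for x < 0; see the note linked in #1
                          if (x == .) return(.)
                          if (x < -log(2)) return(log1pKahan(-exp(x)))   // exp(x) small: log1p form is accurate
                          return(log(-expm1(x)))                         // exp(x) near 1: expm1 form is accurate
                  }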



                  • #10
                    FYI, the following functions have been added (for both Stata and Mata) in the latest Stata 15 update (available since August 7, 2018).

                    1. expm1(x) returns e^x - 1 with higher precision than exp(x)-1 for small values of |x|.

                    2. ln1p(x) and log1p(x) return the natural logarithm of 1+x with higher precision than ln(1+x) for small values of |x|.

                    3. ln1m(x) and log1m(x) return the natural logarithm of 1-x with higher precision than ln(1-x) for small values of |x|.

                    -- Kreshna
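
                    A quick way to see the difference in Mata, using the new functions listed above (a minimal sketch):

                    Code:
                    mata:
                    // compare the new functions with the naive formulas for a tiny argument
                    x = 1e-12
                    expm1(x), exp(x) - 1      // the naive form loses digits to cancellation
                    ln1p(x) , ln(1 + x)
                    ln1m(x) , ln(1 - x)
                    end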



                    • #11
                      That's good news. To the interesting (even entertaining) links here I can add https://www.johndcook.com/blog/2010/...m-unnecessary/

                      I quite often need sign(x) * log(1 + abs(x)), which presumably is now a little easier. That quantity is almost log(x) for x >> 0 and -log(-x) for x << 0, and it behaves smoothly around x = 0. It's the simplest relative of the logarithm I know that does what I regard as the right thing for variables like profit/loss or glacier terminus position change. (I know about asinh().)
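
                      With the new ln1p(), a Mata sketch of that transform (signedlog1p is just an illustrative name):

                      Code:
                      real matrix signedlog1p(real matrix x)
                      {
                              // sign(x)*log(1 + |x|): ~ log(x) for x >> 0, ~ -log(-x) for x << 0
                              return(sign(x) :* ln1p(abs(x)))
                      }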



                      • #12
                        Originally posted by Kreshna Gopal (StataCorp)
                        FYI, the following functions have been added (for both Stata and Mata) in the latest Stata 15 update (available since August 7, 2018).
                        Thank you, Kreshna. This is timely, as I just wrote a command to fit Stukel's generalized logistic regression, which uses these functions to compute the log likelihood. I had written Taylor series approximations. It was a bit of a diversion, but not too bad. Now I can just substitute these, which makes for much cleaner code.

