Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Underflow in individual terms of a fraction

    I am trying to use Mata to calculate a fraction in which each term in both the numerator and denominator is quite small (say 1e-500), so it will underflow even a double precision value and become 0. In simplest form, the fraction might be described as simply:
    \[{P_l} = \frac{{{X_l}}}{{\sum\limits_{l = l}^L {{X_l}} }}\]
    This arises as I try to implement a proposed Bayesian solution to a nonstandard approach involving categorical data. I don't know that the details of the origin of the problem will help, but I'll provide the obscure reference below for those interested. For whatever help it might be I'll say that Xl is the probability of the data assuming one particular value in a finite parameter space, and the denominator is the sum of the probabilities of the data assuming each of the L possible values of the parameter, where L is a small integer. (L = the number of response categories for each of several similar questionnaire items.)

    I have been solving the problem up until now by scaling each Xl along the way by a big fixed value (say B =1e+250), so that I'm working with \[{P_l} = \frac{{B{X_l}}}{{\sum\limits_{l = l}^L {B{X_l}} }}\]
    but that approach is no longer working for me at larger values of_N, where each Xl is much smaller and quite variable in magnitude. I don't see any helpful role for logs, given the sum in the denominator. I'm hoping/presuming that someone can point me toward a typical approach to what must be a common problem in some contexts/fields.

    (Original reference: Appendix A of Romney, A., Weller, S. , Batchelder, W. 1986. "Culture as Consensus." American Anthropologist. 88, 2: 313-338. A huge applied literature issues out of this article, but not much is said computational issues.)


    Regards, Mike
    Last edited by Mike Lacy; 15 Mar 2016, 21:54. Reason: Fix some spelling.

  • #2
    I am no expert in this, but does it help to work in logs and then exponentiate? exp[log(P_l)] = exp[ log(X_l) - log (SUM X_l) ], etc.

    Whatever, 1e-500 sounds very small ...

    Comment


    • #3
      Thanks for taking a look at this. The strategy you're thinking of here is along lines I've tried. I'm having problems, though, because when I log the indivdual terms, I still get into trouble when exponentiate one of them. Let's say, to take a silly example with just two terms in the denominator sum:

      1e-500/(1e-500 + 2e-500) = 0.333

      I could certainly store individual terms as logs, but I can't see how I can get the sum in the denominator.

      I subsequently did a search with "Bayesian" and "underflow" among the keys, and found a suggested technique in which the code looks at all the terms you are dealing with and figures out the appropriate scale factor to use in a given empirical instance. So here, scaling by 1e+500 would work. Perhaps that approach can be combined here with using logs, so that instead of multiplicative scaling by 1e500 I could do additive scaling with logs. I'll have to think that through.


      Comment


      • #4
        One problem you will run into before any computation is that the double precision number closest to 0 is 2^-1074, which is around 4.94*10^(-324), hence in double precision, 1e-500 can not be distinguished from 0.

        Code:
          . mata:
        ------------------------------------------------- mata (type end to exit) ------
        : a = 1e - 324
        
        : b = (a > 0)
        
        : b
          0
        
        : a = 1e - 325
        
        : b = (a > 0)
        
        : b
          0
        
        : a = 1e - 500
        
        : b = (a > 0)
        
        : b
          0
        
        : a = 1e - 323
        
        : b = (a > 0)
        
        : b
          1
        
        : end
        --------------------------------------------------------------------------------
        .

        Comment


        • #5
          This is strange. In Stata I get:
          Code:
          . scalar  a = exp(-500)
          . scalar  b = a>0
          . scalar list b
                   b =          1
          Last edited by Steve Samuels; 22 Mar 2016, 09:25.
          Steve Samuels
          Statistical Consulting
          [email protected]

          Stata 14.2

          Comment


          • #6
            Not so strange: Stata cannot compute exp(-500)

            Code:
             scalar a = exp(-500)
             scalar list a
                   a =  7.12e-218

            Steve Samuels
            Statistical Consulting
            [email protected]

            Stata 14.2

            Comment


            • #7
              There is no problem and Stata computes exp(-500) fine. 1e-500 is 10^(-500), not exp(-500).

              Code:
              . di 1e-2
              .01

              Comment


              • #8
                What an embarassing mistake! Thank you for the correction, Hua.
                Last edited by Steve Samuels; 23 Mar 2016, 10:40.
                Steve Samuels
                Statistical Consulting
                [email protected]

                Stata 14.2

                Comment


                • #9
                  I have actually run into this problem in estimation problems that involve integration. My solution is somewhat along the lines of what Stephen suggests. To consider an example, suppose one wishes to calculate exp(-1000)/(exp(-1000)+exp(-1001)). If one does this in Mata, i.e.:

                  Code:
                  . mata
                  -------------------------- mata (type end to exit) -------------------------------
                  : exp(-1000)/(exp(-1000)+exp(-1001))
                  .
                  One finds that it doesn't compute, even though the answer is easy to get by multiplying through numerator and denominator by exp(1000) to give:
                  Code:
                  . mata
                  -------------------------- mata (type end to exit) -------------------------------
                  : 1/(1+exp(-1))
                     .7310585786
                  What I have found is a good way of attacking the general problem of computing sums, or fractions, involving really small exponential terms, is to:

                  1. work in logs
                  2. normalize any sums of exponential terms by subtracting the maximum term in the sum from each term:

                  An example
                  Code:
                  mata:
                  : x=-(1..10)*1000
                  : x=-1000,x
                  : p=exp(x[1])/sum(exp(x))
                  : p
                      .
                  This doesn't work, because terms are all zero in the denominator. However if we compute in logs with normalization - replacing sum(exp(x)) with M+ln(sum(exp(x-M)), all is well:
                  Code:
                  : M=max(x)
                  : lnp=x[1]-M-ln(sum(exp(x:-M))
                  : lnp
                     -.6931471806
                  : exp(lnp)
                      .5
                  The idea is to simply replace sums of exponents sum(exp(x)) with sum(exp(x-m))exp(m) or ln(exp(x-m))+m, so there is always a one in the log function!

                  Hope that helps,

                  Matt Baker


                  Comment


                  • #10
                    Thanks, something like this sounds promising. There are some complicating details in my problem, so I'll have to see if that can work for me.

                    Regards, Mike

                    Comment

                    Working...
                    X