
  • How to estimate the exponential decay parameter

    Hi Listers,

    My data record GP attendance per month (I have 24 months' worth of data). Plotting the data shows an exponential pattern. I would like to estimate the growth parameter from the data so I can estimate cumulative rates at various months using the CDF formula.

    I read in previous threads, one way to estimate the growth parameter is to log transform the Y (number of visits) before running a standard regression.

    g ln_visit = ln(visit)

    regress ln_visit month

    Is the regression coefficient for time the actual growth factor? To estimate the CDF at 6 months, I would need to specify 1-exp(-btime*6); is this correct?

    Would the approach be identical if the curve showed a decay pattern rather than growth?

    Thanks in advance!

  • #2
    Is the regression coefficient for time the actual growth factor? To estimate the CDF at 6 months, I would need to specify 1-exp(-btime*6); is this correct?

    Would the approach be identical if the curve showed a decay pattern rather than growth?
    Yes, the regression coefficient is the actual growth parameter. To estimate the predicted number of visits at 6 months, you would calculate exp(btime*6). If it were exponential decay, everything works the same way: the only difference is that the regression coefficient (and growth parameter) will be negative.



    • #3
      Thanks Clyde, this is really helpful.

      Could you confirm it is OK to log-transform the number of visits but not time? Feel free to point me to some references if that is easier.

      Lastly, could I check that I can use 1-exp(-btime*6) to find out how many visits have happened, cumulatively, by 6 months?

      Many thanks!



      • #4
        Hi Clyde,

        On closer inspection, the btime coefficient is negative (-0.12), which suggests the data show decay.

        Using your formula, I can calculate predicted visits for various months, and they show greater numbers in the earlier months than in later months. The exp(btime*6) gives me a rate rather than an actual number - is this correct?

        Using the 1-exp(-(-0.12)*months) in an attempt to estimate the cumulative predicted rate, all values are negative. I am obviously doing something wrong but not sure what.

        Any advice is greatly appreciated! Thanks again
        Last edited by Laura Myles; 18 Sep 2020, 04:01.



        • #5
          Let's get some clarity on the model you are trying to fit. Your data consists of a time variable and a count of visits variable. The count of visits variable could be either the cumulative total up to that time, or it could be the number of visits at that particular time. If either one is growing exponentially, so is the other.

          If you are expecting an exponential growth in visits starting from time 0, the model is N = N0*exp(b*t), where t is time, N0 is the number of visits as of time 0, and b is the growth rate. If you take the logarithms of both sides, you get log N = log N0 + b*t, which corresponds to a linear regression of log N on t. The coefficient of t will be the estimate of b, and the constant term will be the estimate of log N0. If there is growth, b will be positive. If there is decay, b will be negative. Once you have the coefficient and constant term in hand, you can then use the model formula with those values. Alternatively, if you just want to apply it to the data, you can use the -predict- command to get predicted values of log N, and then you can exponentiate those to get predictions of N.

          In any case exp(btime*6) is neither a rate nor a number of visits. The number of visits is N0*exp(btime*6); and the rate is just btime.
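The steps described above could be sketched in Stata roughly as follows. This is only an illustration: it assumes the variables are named visit and month, as in the first post, and the names ln_visit_hat and visit_hat are made up for the example.

```stata
* Sketch: fit log N = log N0 + b*t by OLS, then back-transform.
* Assumes variables visit and month exist (names taken from post #1).
generate ln_visit = ln(visit)      // skip if ln_visit was already created
regress ln_visit month

* The slope estimates b; the exponentiated constant estimates N0
display "b  = " _b[month]
display "N0 = " exp(_b[_cons])

* Predicted number of visits at month 6: N0*exp(b*6)
display "N(6) = " exp(_b[_cons]) * exp(_b[month]*6)

* Alternatively, predict log N for every observation and exponentiate
* (ln_visit_hat and visit_hat are hypothetical names)
predict ln_visit_hat, xb
generate visit_hat = exp(ln_visit_hat)
```

Using _b[month] and _b[_cons] directly after -regress- avoids retyping rounded coefficients by hand.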

          Why are you using the formula 1-exp(-(-0.12)*months)? It is not a correct formula for calculating the number of visits in a given month. It is reminiscent of a formula that is germane to an exponential survival model: 1-exp(-0.12*months) would represent the fraction of an inception cohort that has died by that number of months if the instantaneous mortality rate is 0.12/month. Note that the correct formula does not have two minus signs inside the exp().
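To make the sign issue concrete, here is a small sketch using -display- as a calculator. It assumes a constant hazard of 0.12 per month (the magnitude quoted in the thread) and month 6 as an arbitrary example; it says nothing about whether a survival model suits these data.

```stata
* Exponential survival model with a constant hazard of 0.12 per month
* (illustrative value only)
display "S(6) = " exp(-0.12*6)         // fraction still event-free at month 6
display "F(6) = " 1 - exp(-0.12*6)     // fraction with the event by month 6

* Plugging in the regression slope b = -0.12 as if it were a hazard
* flips the sign inside exp(), so exp() exceeds 1 and the result is negative
display "1 - exp(-(-0.12)*6) = " 1 - exp(-(-0.12)*6)
```

A hazard rate in this formula must be positive, which is why substituting a negative regression coefficient yields negative values for every month greater than 0.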



          • #6
            Hi Clyde,

            Thank you for such an in-depth answer. The idea behind this work was to use the cumulative rate from the study to identify the rate of attendance (among those who attend) so we can apply it to the general population. We expect the study to be representative.

            The count in my visit variable is visits per month, so from your reply it seems I need to change it to cumulative visits over months.

            I can use regress ln_cumvisits time

            To estimate the rate (btime)

            I would then like to use the estimated rate on the N population to identify how many people have been (cumulatively) screened at each month. Assuming an exponential distribution, can I then just:

            log N = log Npop + b*t
            * for actual N
            N = exp(log N)

            I originally thought I could use the rate (btime) in the formula for the exponential cumulative distribution function (CDF). I was under the impression this would tell me how many have attended their visit. The reason I have two negative signs is that the estimated rate (btime) is -0.12.

            Thanks again!



            • #7
              Assuming an exponential distribution
              [emphasis added]
              The longer this thread goes on, the more confusing it gets. I thought that, up to this point, we were talking about the visits following an exponential growth or decay process. (An exponential decay process is just an exponential growth process with a negative coefficient.) That is not at all the same thing as the number of visits following an exponential distribution. Or maybe you are thinking of a process where the inter-visit arrival time is exponentially distributed? Or what? I think you are using language imprecisely and it is getting you in trouble.

              So let's go back to the beginning. Let's define exactly what your context and goals are. I think it would also be helpful if you showed some example data: just seeing how the data look could be helpful in figuring out what you are dealing with here. Be sure to use -dataex- to show your example data. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



              • #8
                Hi Clyde,

                Apologies if my posts are confusing. From the observational data, I would like to estimate the rate of attendance (in attenders) so I can apply it to the population to estimate the number of individuals who would have had a visit by 3, 6 and 9 months. For this, I assume I need the cumulative total of visits over months.

                When I plot the cumulative total over months, the figure suggests the data have an exponential distribution. So I planned to run:

                g ln_cumulative = ln(cumulative)
                regress ln_cumulative month

                The coefficient bmonth will be the rate, but I am unsure about the next steps.

                I am attaching a subset of the data for info:

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input float cumulative byte month
                1126.02  0
                3165.63  1
                4334.61  2
                5028.51  3
                 5590.8  4
                6189.06  5
                 6589.2  6
                6886.98  7
                7190.76  8
                7436.31  9
                7622.07 10
                7774.89 11
                7903.98 12
                8020.47 13
                8116.86 14
                8206.68 15
                end
                Thanks again for your help.



                • #9
                  If you graph cumulative vs month with a log-scale on the vertical axis and an untransformed horizontal axis what you see is definitely curved, whereas an exponential growth process would show a straight line (more or less). So this is not an exponential growth process. If you graph the same data, excluding month 0, using a log scale on both axes you get something very close to a straight line--which says that what you have here is a power-law, although the data point at month 0 does not quite fit in. Anyway, nothing good will come of forcing an exponential model onto this data: a power law seems to be governing here.
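The two graphical checks described above, plus a log-log regression, could be sketched as follows using the -dataex- example from #8. The variable names ln_cum and ln_month are made up for the illustration, and the fit is a rough diagnostic rather than a recommended final model.

```stata
* Semi-log plot: an exponential process would look roughly linear here
twoway scatter cumulative month, yscale(log)

* Log-log plot, excluding month 0 because ln(0) is undefined;
* a power law would look roughly linear here
twoway scatter cumulative month if month > 0, yscale(log) xscale(log)

* Power law N = a*t^k is equivalent to ln N = ln a + k*ln t
* (ln_cum and ln_month are hypothetical names)
generate ln_cum   = ln(cumulative)
generate ln_month = ln(month) if month > 0
regress ln_cum ln_month              // slope estimates the exponent k
```

Observations with month equal to 0 have a missing ln_month and are dropped from the regression automatically.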

                  I also do not understand your desire to estimate the results for months 3, 6, and 9 since you already have the actual data for those times.

