using gammap and gammafit to estimate moments based probability and cumulative density functions

mintewab bezabih

Join Date: Aug 2020

Posts: 7
#1


01 Aug 2020, 16:38

Hi all, I have been trying to estimate a household-and-period-specific conditional well-being probability density function (pdf) and the associated complementary cumulative distribution function (ccdf), assuming a gamma distribution and using predicted conditional moment estimates (mean and variance).
I tried the following two approaches: gammap and gammafit. gammap just gives me a value of 1, or very close to 1, for all my observations (a flat-shaped gamma distribution), and I could not find sufficient examples to execute gammafit.

I appreciate your help on the matter
cheers
minti
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#2

02 Aug 2020, 04:24

"I could not find sufficient examples to execute gammafit"

I don't understand what you mean by this statement. (I am co-author of -gammafit-, with Nick Cox. The program is on SSC, as you should tell us. As a new member -- welcome! But, please, do read the Forum FAQ about how to post and post effectively.)

If you have data to which you can apply -gammafit- and derive estimates of the model parameters (whether as functions of covariates or not), you should be able to use the formulae for the Gamma pdf and/or cdf to derive estimates of those objects and related ones such as moments.
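A minimal sketch of that step, assuming gammafit's alpha is the shape and beta is the scale of the gamma distribution (so that the mean is alpha*beta). The auto data and the names a_hat, b_hat, pdf_mpg, cdf_mpg, and ccdf_mpg are used purely for illustration:

```stata
* Assumes gammafit is installed: ssc install gammafit
sysuse auto, clear

* Fit a two-parameter gamma distribution with no covariates
gammafit mpg, nolog

* Without covariates, the estimates are left in e(alpha) and e(beta)
scalar a_hat = e(alpha)
scalar b_hat = e(beta)

* pdf: gammaden(shape, scale, location, x)
generate double pdf_mpg = gammaden(scalar(a_hat), scalar(b_hat), 0, mpg)

* cdf: gammap(shape, x) expects x already divided by the scale
generate double cdf_mpg = gammap(scalar(a_hat), mpg/scalar(b_hat))

* ccdf is the complement of the cdf
generate double ccdf_mpg = 1 - cdf_mpg

* Mean and variance implied by the fitted parameters
display "mean = " scalar(a_hat)*scalar(b_hat) "  variance = " scalar(a_hat)*scalar(b_hat)^2
```

Note that the second argument of gammap() is the point at which the cdf is evaluated, not a second parameter of the distribution.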
mintewab bezabih

Join Date: Aug 2020

Posts: 7
#3

02 Aug 2020, 05:31

Thanks, Stephen. I have now followed your suggestion and the FAQ suggestions, and below is a reproducible example for gammafit. What I do not know is: does this really look like a gamma distribution? Is this a PDF? How do I do the CDF? How would I do this if I wanted to try a normal distribution?

cheers
minti

use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
reg lnwage educ exper tenure female age agesq

******generating the moments (conditional mean and variance)
predict pwage
egen mean_wage=mean(pwage)
gen variance_wage=((mean_wage-lnwage)^2)/1000000
gen skewness_wage=((mean_wage-lnwage)^3)/1000000

******generating the moments (conditional mean and variance)

gen alpha=variance_wage/mean_wage

gen sigma=(mean_wage*mean_wage)/variance_wage
gen delta=sigma/1000000000

************GAMMAFIT****************************** ******
gen gammafit= gammap(alpha, delta)

plot gammafit lnwage
Nick Cox

Join Date: Mar 2014

Posts: 35721
#4

02 Aug 2020, 05:44

It's evident from this that gammafit is just the name of one of your variables, and nothing to do with the command of the same name on SSC which Stephen mentioned.

Your mean_wage variable will contain the mean, but your variance_wage and skewness_wage variables won't contain the variance and skewness. There are several reasons for that, but use summarize instead. Or, indeed, use gammafit instead.

Your analysis seems to be that log of wage should have a gamma distribution, so a log gamma distribution. Perhaps there is a literature on that.
mintewab bezabih

Join Date: Aug 2020

Posts: 7
#5

02 Aug 2020, 05:46

Oh, sorry about my earlier post: I mixed up gammafit and gammap. Here is a repost:

Thanks, Stephen. I have now followed your suggestion and the FAQ suggestions, and below is a reproducible example for gammafit and gammap. For gammafit, I was able to generate parameters but did not know how to put in the parameters I generated (from the conditional mean and variance). For gammap, what I do not know is: does this really look like a gamma distribution? Is this a PDF? How do I do the CDF? How would I do this if I wanted to try a normal distribution?

cheers
minti

use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
reg lnwage educ exper tenure female age agesq

******generating the moments (conditional mean and variance)
predict pwage
egen mean_wage=mean(pwage)
gen variance_wage=((mean_wage-lnwage)^2)/1000000
gen skewness_wage=((mean_wage-lnwage)^3)/1000000

******generating the moments (conditional mean and variance)

gen alpha=variance_wage/mean_wage

gen sigma=(mean_wage*mean_wage)/variance_wage
gen delta=sigma/1000000000

************GAMMAFIT****************************** ******
gammafit lnwage

************GAMMAP******************************** ****
gen gamma= gammap(alpha, delta)
plot gamma lnwage
Nick Cox

Join Date: Mar 2014

Posts: 35721
#6

02 Aug 2020, 05:51

#5 It's in the help. Estimates are accessible as e-class results.

Saved results

In addition to the usual results saved after ml, gammafit also saves the following, if no covariates have been
specified:

e(alpha) and e(beta) are the estimated gamma parameters.

The following results are saved regardless of whether covariates have been specified:

e(b_alpha) and e(b_beta) are row vectors containing the parameter estimates from each equation.

e(length_b_alpha) and e(length_b_beta) contain the lengths of these vectors. If no covariates are specified in an
equation, the corresponding vector has length equal to 1 (the constant term); otherwise, the length is one plus
the number of covariates.
mintewab bezabih

Join Date: Aug 2020

Posts: 7
#7

02 Aug 2020, 05:54

Thanks, Nick. I saw your post after reposting. You are right; as you can see in my repost, I corrected the gammafit call and I now have both gammap and gammafit. The conditional mean and variance are computed using a moments-based approach, so they are actually first- and second-order moments of lnwage.

What I actually want to do is estimate a pdf and ccdf of a distribution (gamma or even normal) based on the parameters I calculate, which are themselves based on the conditional mean and variance. Let me repost the code again:
thanks
minti

use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
reg lnwage educ exper tenure female age agesq

******generating the moments (conditional mean and variance)
predict pwage
egen mean_wage=mean(pwage)
gen variance_wage=((mean_wage-lnwage)^2)/1000000
gen skewness_wage=((mean_wage-lnwage)^3)/1000000

******generating the moments (conditional mean and variance)

gen alpha=variance_wage/mean_wage

gen sigma=(mean_wage*mean_wage)/variance_wage
gen delta=sigma/1000000000

************GAMMAFIT****************************** ******
gammafit lnwage

************GAMMAP******************************** ****
gen gamma= gammap(alpha, delta)
plot gamma lnwage
mintewab bezabih

Join Date: Aug 2020

Posts: 7
#8

02 Aug 2020, 07:37

Hi Nick,
Thanks. I generated the alpha and beta parameters (the way I want them, I think) using gammafit (see below). Now, how do I generate the pdf and ccdf? That is what I want eventually: to create a dummy variable based on the ccdf and a cutoff wage.
thanks
minti

use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear

******generating the moments (conditional mean and variance)

reg lnwage educ exper tenure female age agesq
predict pwage
egen mean_wage=mean(pwage)
gen variance_wage=((mean_wage-lnwage)^2)/1000000
gen skewness_wage=((mean_wage-lnwage)^3)/1000000

******generating the moments (conditional mean and variance)

gen alpha=variance_wage/mean_wage

gen sigma=(mean_wage*mean_wage)/variance_wage

************GAMMAFIT****************************** ******
gammafit lnwage, alphavar(alpha ) betavar(sigma)
Nick Cox

Join Date: Mar 2014

Posts: 35721
#9

02 Aug 2020, 07:44

As said already, your variance and skewness variables are not going to contain correct measures of variance and skewness. To see that, note that if you look at those variables in the Data Editor, you don't see constants. Also, you've not matched the algebra of the definitions with your code. Also, the divisors of 1 million: where do they come from?

If you want to use method of moments estimators, get the moments you need from summarize.
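A minimal sketch of that route for a gamma distribution, using the standard relations mean = shape*scale and variance = shape*scale^2. The auto data and the scalar names mean_mpg, var_mpg, shape_mm, and scale_mm are assumptions made for illustration only:

```stata
sysuse auto, clear

* summarize leaves the mean in r(mean) and the variance in r(Var)
quietly summarize mpg
scalar mean_mpg = r(mean)
scalar var_mpg  = r(Var)

* Method-of-moments estimates for a gamma distribution:
* shape = mean^2/variance and scale = variance/mean
scalar shape_mm = scalar(mean_mpg)^2 / scalar(var_mpg)
scalar scale_mm = scalar(var_mpg) / scalar(mean_mpg)

display "shape = " scalar(shape_mm) "  scale = " scalar(scale_mm)
```

Both moments here are constants for the whole sample, which is why variables built from them would show a single value in the Data Editor.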
mintewab bezabih

Join Date: Aug 2020

Posts: 7
#10

02 Aug 2020, 07:59

Hi Nick, it is not moments of the variable I am after; it is the conditional moments of the wage function (based on the article linked below; I do not mean for you to read it, I just attached it for your reference). So I am pretty confident I have the right conditional mean and variance. Of course, I do not know whether the way I am going about generating the pdf and the ccdf is right, or indeed whether I am on the right path, given these two variables.
thanks
minti

(based on this https://www.afdb.org/fileadmin/uploa...d_Approach.pdf)
Nick Cox

Join Date: Mar 2014

Posts: 35721
#11

02 Aug 2020, 08:16

Sorry, I don't think I can help you further. In so far as I understand it, your approach seems very confused, as I have tried to explain. The other way round, if it is me who is confused, then I can't help for that reason.
mintewab bezabih

Join Date: Aug 2020

Posts: 7
#12

02 Aug 2020, 12:48

It was a helpful set of tips, Nick; thanks. I will have to think through the problem and will get back with clearer queries.
cheers
minti

Stephen Jenkins

Join Date: Apr 2014
Posts: 1435

#13

02 Aug 2020, 15:46

I agree with Nick about the continuing confusion. Please do read the Forum FAQ, as previously advised, regarding how to ask questions effectively. The article you attached covers many things. You'd do yourself a favour if you were much more precise about (a) precisely which equations of that paper set out the model specification that you are interested in; (b) precisely which expressions specify the post-estimation objects you are interested in; and (c) the nature of your data set. (It doesn't help readers help you -- trying to work out what you want to do -- if you write that you don't want them to read the article!) Moreover (and as the FAQ advises), providing example data using -dataex- is advisable as well.
You appear to be confusing methods of estimation (e.g. ML versus methods of moments) and post-estimation statistics such as conditional moments.

Observe below the sort of thing that you can do with gammafit: (And note how it corresponds to the model description here, including the formulae for logarithmic expectation and variance.)

Code:

. sysuse auto , clear
(1978 Automobile Data)

. gammafit mpg, alphavar(price) betavar(price) nolog

ML fit of two-parameter gamma distribution        Number of obs   =         74
                                                  Wald chi2(1)    =       0.80
Log likelihood = -218.77089                       Prob > chi2     =     0.3723

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
alpha        |
       price |   -.000407   .0004562    -0.89   0.372    -.0013012    .0004872
       _cons |   22.45311   4.653251     4.83   0.000      13.3329    31.57331
-------------+----------------------------------------------------------------
beta         |
       price |  -.0000245    .000023    -1.06   0.288    -.0000696    .0000207
       _cons |   1.214208   .2472893     4.91   0.000     .7295303    1.698886
------------------------------------------------------------------------------

. ereturn list

scalars:
                 e(rc) =  0
                 e(ll) =  -218.7708922119871
          e(converged) =  1
               e(rank) =  4
                  e(k) =  4
               e(k_eq) =  2
               e(k_dv) =  0
                 e(ic) =  10
                  e(N) =  74
         e(k_eq_model) =  1
               e(df_m) =  1
               e(chi2) =  .7958130414527679
                  e(p) =  .3723481569894777
      e(length_b_beta) =  2
     e(length_b_alpha) =  2

macros:
             e(depvar) : "mpg"
                e(cmd) : "gammafit"
            e(predict) : "ml_p"
           e(chi2type) : "Wald"
                e(vce) : "oim"
                e(opt) : "ml"
              e(title) : "ML fit of two-parameter gamma distribution"
          e(ml_method) : "lf"
               e(user) : "gammafit_lf"
          e(technique) : "nr"
         e(properties) : "b V"

matrices:
                  e(b) :  1 x 4
                  e(V) :  4 x 4
            e(b_alpha) :  1 x 2
             e(b_beta) :  1 x 2
           e(gradient) :  1 x 4
               e(ilog) :  1 x 20
              e(ml_hn) :  1 x 2
              e(ml_tn) :  1 x 2

functions:
             e(sample)   

. matrix list e(b)

e(b)[1,4]
         alpha:      alpha:       beta:       beta:
         price       _cons       price       _cons
y1    -.000407   22.453105  -.00002447   1.2142084

From these estimates, one could calculate logarithmic means and variances that are conditional on 'X' (price in this example).
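A rough sketch of such a calculation, under several assumptions: that alpha is the shape and beta the scale, that each parameter is modelled as a linear function of the covariates, and that predict's equation() option is available here because e(predict) is ml_p. The variable names, the cutoff of 20 mpg, and the 0.5 probability threshold are hypothetical choices for illustration:

```stata
* Assumes gammafit is installed: ssc install gammafit
sysuse auto, clear
gammafit mpg, alphavar(price) betavar(price) nolog

* Observation-specific parameters: linear prediction from each equation
predict double a_i, equation(alpha)
predict double b_i, equation(beta)

* Conditional mean and variance of mpg given price
generate double mean_i = a_i * b_i
generate double var_i  = a_i * b_i^2

* Conditional mean and variance of log(mpg) given price
generate double elog_i = digamma(a_i) + ln(b_i)
generate double vlog_i = trigamma(a_i)

* ccdf evaluated at a hypothetical cutoff, and a dummy based on it
scalar cutoff = 20
generate double ccdf_i = 1 - gammap(a_i, scalar(cutoff)/b_i)
generate byte likely_above = ccdf_i > 0.5 if !missing(ccdf_i)
```

The logarithmic moments use the standard gamma results E[ln X] = digamma(shape) + ln(scale) and Var[ln X] = trigamma(shape).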
