  • How to calculate means and variances of the underlying distribution from percentile values?

    Dear all, I am new to Statalist but need some help.

    I have data on persons doing something X days per month, recorded in categories (corresponding to "<=1", "2-3", "4", or "8+" days per month), and I need to estimate the unobserved central tendency (mean and median) and dispersion of the underlying distribution in different subgroups of the data. My initial idea was to treat the category boundaries as percentile values (e.g. in one subgroup, almost 42% have the value "<=1" and almost 22% have the value "2-3", so the 42nd percentile value is 2 and the 64th percentile value is 4) and to calculate the mean and variance by assuming a lognormal underlying distribution.

    I therefore used sigma=(ln(value1)-ln(value2))/(z1-z2) and mu=ln(value1)-sigma*z1, where value1 and value2 are the percentile values (2 and 4) and z1 and z2 are the z-scores corresponding to p=.42 and p=.64 from the inverse of the standard normal CDF. Calculations by hand returned sigma=1.3 and mu=0.95 for this particular subgroup of the data.
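
    For anyone who wants to check the arithmetic, here is a minimal sketch of the two-percentile calculation in Python (standard library only; in Stata the z-scores would come from invnormal()). The function name is made up for illustration, and the inputs are the rounded shares 0.42 and 0.64, so the result (roughly sigma=1.24 and mu=0.94) differs slightly from the hand-calculated figures. The mean and variance lines use the standard lognormal moment formulas and are only meaningful if the lognormal assumption holds.

```python
from math import exp, log
from statistics import NormalDist


def lognormal_from_two_quantiles(p1, x1, p2, x2):
    """Solve ln(x_p) = mu + sigma * z_p for (mu, sigma) from two (p, x) pairs.

    Hypothetical helper, not part of any library.
    """
    z1 = NormalDist().inv_cdf(p1)  # z-score for cumulative share p1
    z2 = NormalDist().inv_cdf(p2)  # z-score for cumulative share p2
    sigma = (log(x1) - log(x2)) / (z1 - z2)
    mu = log(x1) - sigma * z1
    return mu, sigma


# Rounded inputs from the post: 42% at or below 2 days, 64% at or below 4 days.
mu, sigma = lognormal_from_two_quantiles(0.42, 2.0, 0.64, 4.0)

# Implied lognormal summary statistics (valid only under the lognormal assumption).
median = exp(mu)
mean = exp(mu + sigma**2 / 2)
variance = (exp(sigma**2) - 1) * exp(2 * mu + sigma**2)

print(f"sigma = {sigma:.3f}, mu = {mu:.3f}")
print(f"median = {median:.2f}, mean = {mean:.2f}, variance = {variance:.2f}")
```

    With only two percentile values the fit is exact by construction, so it says nothing about whether the lognormal is a good choice.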

    My problem is that (1) I am not convinced that the underlying distribution is lognormal (someone suggested that the Weibull or Gamma distribution is more appropriate) and (2) I want to use all the available data (e.g. all three percentile values in the example above) when I estimate the underlying distribution.

    Is there a simple way in Stata to fit different distributions (e.g. lognormal, Weibull, or gamma) from information about three or more percentile values alone, and to test which of the distributions gives the best fit?

    Best,
    Ståle
    Last edited by Ståle Østhus; 06 Feb 2015, 03:38.