Suppose someone took a sample from a large population and gave me the following sample statistics on heights:
I want to know Pr(A <= Height <= B). Is there a way to use all this information in the calculation?
The normal approximation below relies only on the mean and standard error, so it does not use all available information: it ignores the percentile data.
Code:
scalar n = 5510
scalar mean = 161.3
scalar se = 0.19
scalar p5 = 149.8
scalar p10 = 152.5
scalar p15 = 153.9
scalar p25 = 156.4
scalar p50 = 161.3
scalar p75 = 166.0
scalar p85 = 168.4
scalar p90 = 170.2
scalar p95 = 172.5
The simplest thing I can think of is doing this:
Code:
// back out the standard deviation from the standard error of the mean
scalar sd = scalar(se) * sqrt(scalar(n))

capture program drop pr_calc
program define pr_calc, // rclass
    syntax, MEAN(real) SD(real) A(real) B(real) P(string)
    // normal-approximation probability, stored in the scalar named by p()
    scalar `p' = normal((`b' - `mean')/`sd') - normal((`a' - `mean')/`sd')
    di "Pr(`a' <= X <= `b') = " `p'
end

pr_calc, mean(`=scalar(mean)') sd(`=scalar(sd)') a(150) b(170) p(pr)
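To make the calculation concrete, here is the same normal approximation worked by hand for A = 150 and B = 170. It assumes, as the code does, that the reported se is the standard error of the mean, so that sd = se * sqrt(n). The figures are rounded.

```latex
\begin{aligned}
\widehat{sd} &= se \cdot \sqrt{n} = 0.19 \cdot \sqrt{5510} \approx 14.10 \\
\Pr(150 \le X \le 170)
  &\approx \Phi\!\left(\frac{170 - 161.3}{14.10}\right)
         - \Phi\!\left(\frac{150 - 161.3}{14.10}\right) \\
  &\approx \Phi(0.617) - \Phi(-0.801)
   \approx 0.731 - 0.212 \approx 0.52
\end{aligned}
```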
Speed is not a concern: the solution can be fairly slow.
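One possible direction, offered only as a rough sketch and not as an established answer, is to treat the reported percentiles as points on the empirical CDF and interpolate linearly between them. The code below assumes that A and B both lie between the 5th and 95th percentiles, and the variable names (pct, height, which, cdf) and scalars (Fa, Fb) are made up for illustration. It uses Stata's ipolate to fill in the CDF at A = 150 and B = 170.

```stata
// Sketch: linear interpolation of the CDF between reported percentiles.
// Assumption: A and B fall inside [p5, p95]; no extrapolation is attempted.
preserve
clear
input pct height which
 5 149.8 0
10 152.5 0
15 153.9 0
25 156.4 0
50 161.3 0
75 166.0 0
85 168.4 0
90 170.2 0
95 172.5 0
 . 150   1
 . 170   2
end

// which = 1 marks A, which = 2 marks B; their pct values are missing
// and are filled in by interpolating pct on height
ipolate pct height, generate(cdf)

summarize cdf if which == 1, meanonly
scalar Fa = r(mean) / 100
summarize cdf if which == 2, meanonly
scalar Fb = r(mean) / 100

display "Interpolated Pr(150 <= X <= 170) = " Fb - Fa
restore
```

With the percentiles above this gives roughly 0.84. It uses the percentile data but ignores the mean and standard error, so it is at best a partial answer to the question.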