Kernel Density Estimation Quantile/percentile

Emin Oz

Join Date: Mar 2018

Posts: 2
#1

08 Mar 2018, 19:23

Hello Stata Users,

I used kernel density estimation (kdensity) to estimate the distribution of my variable of interest, and I saved the estimated densities as a new variable. However, what I am really interested in is finding quantiles/percentiles from the kernel density estimate.
How can I do that?
Separately, when I plotted the saved densities over time (x: quarters, y: densities), the chart looked like the one that kdensity itself draws. I am confused by this. Do you have any idea why that happens?
Thank you.
Maarten Buis

Join Date: Mar 2014

Posts: 3458
#2

09 Mar 2018, 01:54

It really depends on what you did exactly. Since you did not tell us that, there is little we can say.

On a more general note: why would you first smooth a distribution and then compute quantiles? The raw PDF is typically too erratic, which is why we have smoothers like the kernel density estimators. But quantiles are based on the CDF, and those typically don't require smoothing.
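
To illustrate the point that quantiles come from the CDF without any smoothing, here is a minimal sketch in plain Python (not Stata). It uses one common convention, the inverse of the empirical CDF; other conventions, including interpolating ones, exist and can give slightly different answers. The function name `empirical_quantile` and the sample values are made up for the example.

```python
import math

def empirical_quantile(values, p):
    """Smallest observed value x whose empirical CDF is at least p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    ordered = sorted(values)
    # The empirical CDF jumps by 1/n at each order statistic,
    # so the first index where it reaches p is ceil(n * p) - 1.
    k = math.ceil(len(ordered) * p) - 1
    return ordered[k]

# Hypothetical data, only for illustration.
sample = [2.1, 0.4, 3.7, 1.5, 2.9, 0.9, 4.2, 1.1, 2.4, 3.0]
quartiles = [empirical_quantile(sample, p) for p in (0.25, 0.5, 0.75)]
print(quartiles)
```

No kernel, bandwidth, or grid enters the calculation; the only input is the sorted data.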

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Nick Cox

Join Date: Mar 2014

Posts: 35711
#3

09 Mar 2018, 03:25

I agree with Maarten Buis, and I'd go further.

Kernel estimation of density functions is a great method. I've used it, written on it, and so forth. But it's not a good way to go here. Whatever estimate you get will depend on choices you made:

1. kernel type chosen

2. kernel width chosen

3. whether you estimated on a sensible scale (for example, often density estimation is much better done on a transformed scale)

4. whether you thought about boundary conditions (for example, does the estimation routine respect bounds such as zero or 100% when they apply? That's not guaranteed without special code)

Using default choices from e.g. kdensity is not an answer here: they're not attempts to be very smart about the data, nor can they apply what you know about the variable.
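
The dependence on these choices can be sketched in plain Python (not Stata, and not what kdensity does internally). The sketch assumes a Gaussian kernel, for which the CDF of the density estimate is the average of normal CDFs centred on the data points, and inverts that CDF by bisection. The names `kde_cdf` and `kde_quantile`, the data, and the bandwidths are all made up for illustration.

```python
import math

def kde_cdf(x, data, h):
    """CDF of a Gaussian kernel density estimate with bandwidth h."""
    total = sum(0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0)))) for xi in data)
    return total / len(data)

def kde_quantile(p, data, h, tol=1e-9):
    """Invert kde_cdf by bisection; the CDF is continuous and increasing."""
    lo = min(data) - 10.0 * h   # CDF is essentially 0 here
    hi = max(data) + 10.0 * h   # CDF is essentially 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, data, h) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical right-skewed, strictly positive data.
sample = [0.1, 0.2, 0.3, 0.5, 0.8, 1.3, 2.1, 3.4, 5.5, 8.9]

# Same data, same kernel, three bandwidths: the "median" moves with the bandwidth.
median_by_bandwidth = {h: kde_quantile(0.5, sample, h) for h in (0.25, 1.0, 4.0)}
for h, q in median_by_bandwidth.items():
    print(f"bandwidth {h}: smoothed median = {q:.3f}")

# Point 4 above: the estimate puts probability below zero although no value is negative.
share_below_zero = kde_cdf(0.0, sample, 1.0)
print(f"estimated share below zero at bandwidth 1.0: {share_below_zero:.3f}")
```

With skewed data like these, wider bandwidths pull the smoothed median towards the long tail, and an unbounded kernel leaks mass across the natural bound at zero.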

I add as a footnote that there are methods for getting quantiles with smoothing in a different sense. See e.g. hdquantile from SSC.
Emin Oz

Join Date: Mar 2018

Posts: 2
#4

09 Mar 2018, 07:20

Thank you for the answers! In my study, I have 10 variables, and I am trying to combine them into a single aggregate variable. First, I standardized all the variables. Then I estimated the distribution of each variable using kernel estimation, because the original distribution is erratic, as you mentioned. After this step, I want to transform my variables onto the (0,1) range based on their quantiles. To aggregate them, I need them on a common scale.
The paper below used the same steps ( pg: 16-17).
https://www.federalreserve.gov/econr...2015059pap.pdf
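
For reference, mapping a variable onto (0,1) by its quantiles does not in itself need a density estimate. Here is a minimal sketch in plain Python (not Stata), assuming a rank-based empirical CDF transform with average ranks for ties and a divisor of n + 1 so that the scores stay strictly inside (0,1). This is not claimed to be the procedure in the paper linked above; the function name `ecdf_scores` and the data are made up.

```python
def ecdf_scores(values):
    """Map each value to average_rank / (n + 1), which lies strictly in (0, 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    start = 0
    while start < n:
        # Find the run of tied values beginning at sorted position `start`.
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2.0 + 1.0  # ranks are 1-based
        for pos in range(start, end + 1):
            scores[order[pos]] = avg_rank / (n + 1)
        start = end + 1
    return scores

# Hypothetical standardized values for one variable, in time order.
series = [-0.3, 1.2, 0.4, 1.2, -1.5, 0.0, 2.6]
scores = ecdf_scores(series)
print(scores)
```

Each score stays attached to its own observation, so the transformed series keeps the time ordering of the original variable.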

I tried many kernel types. The results did not differ much from one another. In the end, I chose the Gaussian kernel. As for the bandwidth, I have no prior information. I always used the default option. I tried many bandwidths, but the default one also provided enough smoothing.

In Stata's Data Editor, the kernel density values do not line up with the original data. In other words, when plotted over time the densities have the same bell-like shape as the distribution itself. They show the shape of the distribution, not the movement of the variable over time.
Nick Cox

Join Date: Mar 2014

Posts: 35711
#5

09 Mar 2018, 08:48

Sorry, but I am not familiar with that paper. If you want more details on replicating what the authors did, you might be best advised to ask the authors.