xtile question

Jocelyn Cherry

Join Date: Jun 2015

Posts: 47
#1


21 Jul 2016, 14:00

I need to split some contrast vision readings on a group of children into the worst quintile versus the rest.
I had been using this:
xtile f7bstcntrst = f7vs204c, nq(5)
bysort f7bstcntrst : sum f7vs204c if f7vs204c!=.
sum f7vs204c if f7bstcntrst!=. & f7bstcntrst!=1
sum f7vs204c if f7bstcntrst!=. & f7bstcntrst==1
replace f7bstcntrst=0 if f7bstcntrst!=1
replace f7bstcntrst=. if f7vs204c==.
tab f7bstcntrst
However, the problem with this comes in the first line: xtile is splitting my data into the quintiles of the contrast sensitivity logarithmic value, not into the children with the worst reading [smallest values] versus the rest of the children, which is what I want. How do I tell Stata to make the quintile based on the worst 1/5 of the children's readings, not 1/5 of the range of values possibly obtained?
If my question doesn't make sense, here are some pictures explaining my output and explaining what the contrast sensitivity readings mean.

[From "Normal values for the Pelli-Robson contrast sensitivity test", Maija Mäntyjärvi, MD, Tarja Laitinen, MD]
Jocelyn Cherry

Join Date: Jun 2015

Posts: 47
#2

21 Jul 2016, 14:04

[The varname is f7bstcntrst even though I want to look at the worst 1/5 of children because f7vs204c is the reading for their best eye.]
Clyde Schechter

Join Date: Apr 2014

Posts: 30115
#3

21 Jul 2016, 15:32

Well, it isn't possible in your data. 20% of 7,103 is 1420.6. So you want to separate out the lowest scoring 1420. But as you can see, looking at scores up through 1.5 you don't even come close to that number. Then there is this huge clump at 1.65: all of whom must be grouped together--you cannot assign some of them to one group and others to another in a rational way. And once you get past 1.65 there are only 173 left, who form the top "quintile." Bottom line: your data are too clumped to do what you are asking.
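The clumping can be checked directly from the cumulative counts. The following is only a sketch, assuming the variable name from the original post; the cutoffs 1.6 and 1.7 are illustrative assumptions, chosen to fall between the reading levels mentioned above so that the comparisons do not depend on how 1.65 is stored.

```stata
* Frequency table of the readings; the Cum. column shows whether the 20% mark
* can be reached without splitting a group of tied values
tabulate f7vs204c

* Number of nonmissing readings, and the size of an exact fifth of them
count if !missing(f7vs204c)
display "one fifth of the sample = " r(N)/5

* Children at 1.5 or lower (missing values are never below a number in Stata)
count if f7vs204c < 1.6

* Children at 1.65 or lower, i.e. including the large clump at 1.65
count if f7vs204c < 1.7
```

If the first of the last two counts is well below a fifth of the sample and the second is well above it, no cutpoint can produce a true bottom quintile.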
Jocelyn Cherry

Join Date: Jun 2015

Posts: 47
#4

22 Jul 2016, 12:00

Thank you so much... that is the problem then! So if I wanted to separate off the bottom ~400, how do I change my Stata code to do it? Do I use cutpoints?
Clyde Schechter

Join Date: Apr 2014

Posts: 30115
#5

22 Jul 2016, 12:19

You could do this:

Code:

sort f7vs204c
local threshold = f7vs204c[400]
gen byte select = (f7vs204c < `threshold')

This code identifies the 400th observation when the data are sorted on f7vs204c. It then marks as selected those whose value is less than that of the 400th observation. This avoids breaking up a group of identical observations if the 400th happens to lie within such a group. In the data you show, the result will be to select everybody with f7vs204c at 1.5 or lower, which is a bit fewer than 400 observations.
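A slightly more defensive variant of the same idea might look like the sketch below. It is not a correction of the code above, just one cautious way to write it. It assumes there are at least 400 nonmissing readings; the names worst400 and threshold are hypothetical, chosen here for illustration.

```stata
* Sort so the lowest (worst) readings come first; missing values sort last
sort f7vs204c

* Store the 400th value in a scalar, which keeps full double precision
scalar threshold = f7vs204c[400]

* 1 = strictly below the threshold, 0 = at or above it;
* children with a missing reading stay missing instead of being coded 0
gen byte worst400 = (f7vs204c < scalar(threshold)) if !missing(f7vs204c)

* Check how many children ended up in each group
tabulate worst400, missing
```

The final tabulate shows how far the selected group falls short of 400 because of ties at the threshold value.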

As an aside, I strongly encourage you to rename your variables to names with mnemonic value (unless these names actually have mnemonic value for you). Think about it: if you have to go back to this code a year from now in response to a journal reviewer, won't you waste a lot of time revisiting which variable here is which and what they mean? For that matter, if there were a mistake in your code where you used a wrong but similar variable name, would you be able to spot it quickly?
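As a rough illustration of that advice, renaming and labeling could be done as below. The new name and the label text are hypothetical, based only on the description in this thread (a logarithmic contrast sensitivity reading for the best eye).

```stata
* Hypothetical, more descriptive name for the best-eye contrast reading
rename f7vs204c contrast_besteye

* A variable label documents the meaning wherever the variable is listed
label variable contrast_besteye "Contrast sensitivity (log), best eye"

* Confirm the new name, storage type, and label
describe contrast_besteye
```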
Jocelyn Cherry

Join Date: Jun 2015

Posts: 47
#6

22 Jul 2016, 12:28

Thank you Clyde, that is a very good point.