I am trying to identify the point at which a cumulative distribution exceeds a value (for a set of 11 cutoff values I am interested in).
I have data on how many goods were sold at each price point that looks something like (but with 15000 price points and 50 goods)::
price Volvo Toyota Ford cumpct_Volvo cumpct_Toyota cumpct_Ford
21 1745 892 569 2.0893964 .0145218 11.392102
22 1325 786 1256 3.3054851 .01475554 12.579115
23 765 543 1408 4.2243122 .01521645 14.298504
24 2100 120 786 5.266155 .02214319 16.581708
25 1389 111 1632 6.2830139 .02457282 18.804259
The columns 2-4 are quantities, and the columns 5-7 are cumulative percentages of total volume sold.
I need to return the price which accounts for 5% or 10% etc of the quantities, so in the above, if I was looking for 5% for Volvo, I want a return of price=24.
First I tried installing vlookup, but it will only do exact matches. So I tried rounding down (floor) my cumulative percentages, but then it said that they were not unique.
(key_percent was the list of percent cutoffs I am interested in.)
. vlookup key_percent , gen(Volvo_cutoff) key(cumpct_Volvo) value(price)
price is unique within cumpct_Volvo;
there are multiple observations with different price within cumpct_Volvo.
Is there a way that I can return the first instance of a number greater than 5%, for example?
Or is there a way to manipulate the data so that i can just summarize to get out the numbers I need?
I have data on how many goods were sold at each price point that looks something like (but with 15000 price points and 50 goods)::
price Volvo Toyota Ford cumpct_Volvo cumpct_Toyota cumpct_Ford
21 1745 892 569 2.0893964 .0145218 11.392102
22 1325 786 1256 3.3054851 .01475554 12.579115
23 765 543 1408 4.2243122 .01521645 14.298504
24 2100 120 786 5.266155 .02214319 16.581708
25 1389 111 1632 6.2830139 .02457282 18.804259
The columns 2-4 are quantities, and the columns 5-7 are cumulative percentages of total volume sold.
I need to return the price which accounts for 5% or 10% etc of the quantities, so in the above, if I was looking for 5% for Volvo, I want a return of price=24.
First I tried installing vlookup, but it will only do exact matches. So I tried rounding down (floor) my cumulative percentages, but then it said that they were not unique.
(key_percent was the list of percent cutoffs I am interested in.)
. vlookup key_percent , gen(Volvo_cutoff) key(cumpct_Volvo) value(price)
price is unique within cumpct_Volvo;
there are multiple observations with different price within cumpct_Volvo.
Is there a way that I can return the first instance of a number greater than 5%, for example?
Or is there a way to manipulate the data so that i can just summarize to get out the numbers I need?
Comment