Vlookup with no exact match

Louise Collis

Join Date: Oct 2015

Posts: 3
#1

Vlookup with no exact match

25 Jan 2017, 21:23

I am trying to identify the point at which a cumulative distribution exceeds a value (for a set of 11 cutoff values I am interested in).
I have data on how many goods were sold at each price point that looks something like (but with 15000 price points and 50 goods)::
price Volvo Toyota Ford cumpct_Volvo cumpct_Toyota cumpct_Ford
21 1745 892 569 2.0893964 .0145218 11.392102
22 1325 786 1256 3.3054851 .01475554 12.579115
23 765 543 1408 4.2243122 .01521645 14.298504
24 2100 120 786 5.266155 .02214319 16.581708
25 1389 111 1632 6.2830139 .02457282 18.804259

The columns 2-4 are quantities, and the columns 5-7 are cumulative percentages of total volume sold.
I need to return the price which accounts for 5% or 10% etc of the quantities, so in the above, if I was looking for 5% for Volvo, I want a return of price=24.
First I tried installing vlookup, but it will only do exact matches. So I tried rounding down (floor) my cumulative percentages, but then it said that they were not unique.
(key_percent was the list of percent cutoffs I am interested in.)
. vlookup key_percent , gen(Volvo_cutoff) key(cumpct_Volvo) value(price)
price is unique within cumpct_Volvo;
there are multiple observations with different price within cumpct_Volvo.

Is there a way that I can return the first instance of a number greater than 5%, for example?
Or is there a way to manipulate the data so that i can just summarize to get out the numbers I need?
Tags: None
Clyde Schechter

Join Date: Apr 2014

Posts: 30154
#2

25 Jan 2017, 22:07

How about this:

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input byte price int(Volvo Toyota Ford) double(cumpct_Volvo cumpct_Toyota cumpct_Ford) 21 1745 892 569 2.0893964 .0145218 11.392102 22 1325 786 1256 3.3054851 .01475554 12.579115 23 765 543 1408 4.2243122 .01521645 14.298504 24 2100 120 786 5.266155 .02214319 16.581708 25 1389 111 1632 6.2830139 .02457282 18.804259 end sort cumpct_Volvo summ price if cumpct_Volvo >= 5 & cumpct_Volvo[_n-1] < 5

Note: Evidently you can embed this code in loops over the percentage threshold or the brand of car.

In the future, please do not post the kind of data table you used in #1. It took me much longer to wrestle that data into Stata than it did to solve your problem. Please do as I have done here, and use the -dataex- command to show example data. You can install -dataex- by running -ssc install dataex-. Read -help dataex- for the simple instructions for using it. By using -dataex- you make it possible for anyone who wants to test some code on your example data to create a completely faithful replica of your data with just a simple copy/paste operation. Be helpful to those who want to be helpful to you.
1 like
Comment
Louise Collis

Join Date: Oct 2015

Posts: 3
#3

25 Jan 2017, 22:33

Thanks Clyde. That was perfect.
I need to turn them into a table, but I think I can google how to use the return list.
I've installed dataex for next time.
Comment

Announcement

Vlookup with no exact match

Comment

Comment