How to find the value of a variable using condition on another variable?

Shreya Dutt

Join Date: Dec 2019
Posts: 34

How to find the value of a variable using condition on another variable?

27 Jul 2020, 22:50

Hi Statalist users,

I have a dataset of which I am posting a snippet with relevant variables (age, quarterly date, trend, unemployment) below:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long age float(dt_survey_q trend unrate)
18 72 125 6.333333
19 72 125 6.333333
20 72 125 6.333333
21 72 125 6.333333
22 72 125 6.333333
23 72 125 6.333333
24 72 125 6.333333
25 72 125 6.333333
26 72 125 6.333333
27 72 125 6.333333
28 72 125 6.333333
29 72 125 6.333333
30 72 125 6.333333
31 72 125 6.333333
32 72 125 6.333333
33 72 125 6.333333
34 72 125 6.333333
35 72 125 6.333333
36 72 125 6.333333
37 72 125 6.333333
38 72 125 6.333333
39 72 125 6.333333
40 72 125 6.333333
41 72 125 6.333333
42 72 125 6.333333
43 72 125 6.333333
44 72 125 6.333333
45 72 125 6.333333
46 72 125 6.333333
47 72 125 6.333333
48 72 125 6.333333
49 72 125 6.333333
50 72 125 6.333333
51 72 125 6.333333
52 72 125 6.333333
53 72 125 6.333333
54 72 125 6.333333
55 72 125 6.333333
56 72 125 6.333333
57 72 125 6.333333
58 72 125 6.333333
59 72 125 6.333333
60 72 125 6.333333
61 72 125 6.333333
62 72 125 6.333333
63 72 125 6.333333
64 72 125 6.333333
65 72 125 6.333333
66 72 125 6.333333
67 72 125 6.333333
68 72 125 6.333333
69 72 125 6.333333
70 72 125 6.333333
71 72 125 6.333333
72 72 125 6.333333
73 72 125 6.333333
74 72 125 6.333333
75 72 125 6.333333
76 72 125 6.333333
77 72 125 6.333333
78 72 125 6.333333
79 72 125 6.333333
80 72 125 6.333333
81 72 125 6.333333
82 72 125 6.333333
83 72 125 6.333333
84 72 125 6.333333
85 72 125 6.333333
86 72 125 6.333333
87 72 125 6.333333
88 72 125 6.333333
90 72 125 6.333333
91 72 125 6.333333
95 72 125 6.333333
 . 72 125 6.333333
18 73 126        6
19 73 126        6
20 73 126        6
21 73 126        6
22 73 126        6
23 73 126        6
24 73 126        6
25 73 126        6
26 73 126        6
27 73 126        6
28 73 126        6
29 73 126        6
30 73 126        6
31 73 126        6
32 73 126        6
33 73 126        6
34 73 126        6
35 73 126        6
36 73 126        6
37 73 126        6
38 73 126        6
39 73 126        6
40 73 126        6
41 73 126        6
42 73 126        6
end
format %tq dt_survey_q

I am generating a new variable according to a complicated formula part of which needs me to to be able to do the below:
i. Find the value of unemployment (unrate) corresponding to a particular value of trend
ii. Find the value of dt_survey_q according to a particular value of trend
Essentially, I want to be able to find the index corresponding to a particular value of trend and then find the value of unemployment, dt_sruvey_q according to that index. On python, I would do something as below. I don't know how to do this on Stata. If not Stata, can I use Mata? If so, how would I do that?

Python code:

Code:

index = data[data['trend']== x].index.tolist()[0]
            y = data['unrate'][index]

Tags: index, panel data, syntax

Joro Kolev

Join Date: Aug 2018

Posts: 3050
#2

28 Jul 2020, 00:43

Checkout the user written -vlookup-, from within Stata type

Code:

findit vlookup

and then follow the instructions.

If this does not do what you want, be more particular about what you want to achieve.
Comment

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17714

28 Jul 2020, 00:44

Shreya:
do you mean something along the following lines?

Code:

. bysort trend: list unrate dt_survey_q trend if _n==1

---------------------------------------------------------------------------------------------------------------------------------------
-> trend = 125

     +-----------------------------+
     |   unrate   dt_sur~q   trend |
     |-----------------------------|
  1. | 6.333333     1978q1     125 |
     +-----------------------------+

---------------------------------------------------------------------------------------------------------------------------------------
-> trend = 126

     +---------------------------+
     | unrate   dt_sur~q   trend |
     |---------------------------|
  1. |      6     1978q2     126 |
     +---------------------------+

. bysort trend: gen counter=1 if _n==1


. list if counter==1

     +---------------------------------------------+
     | age   dt_sur~q   trend     unrate   counter |
     |---------------------------------------------|
  1. |  18     1978q1     125   6.333333         1 |
 76. |  18     1978q2     126          6         1 |
     +---------------------------------------------+

.

Kind regards,
Carlo
(Stata 19.0)

Comment

Shreya Dutt

Join Date: Dec 2019

Posts: 34
#4

28 Jul 2020, 14:19

Dear Joro and Carlo,

I tried to use vlookup as follows:

Code:

local x unrate gen `x'key = 125 vlookup `x'key, gen(l1x) key(trend) value(`x')

I was expecting to have l1x = 6.3333 populated but instead I got back the following error:
unrate is unique within trend;
there are multiple observations with different unrate within trend.

I am not sure I understand the error: is there an error because there are duplicate entries for unrate? How do I solve this issue?

Last edited by Shreya Dutt; 28 Jul 2020, 14:22.
Comment
Shreya Dutt

Join Date: Dec 2019

Posts: 34
#5

28 Jul 2020, 15:30

William Lisowski provided a solution to this issue in: https://www.statalist.org/forums/for...ror-in-vlookup
I used that to get around my problem.
Thank you all!
1 like
Comment
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#6

28 Jul 2020, 15:36

I have no idea what you re trying to do, so I am just firing shots in the dark here, but the following

Code:

. gen unratekey = 125 . vlookup unratekey, gen(l1x) key(trend) value( unrate )

goes through the values of variable unratekey, which in this case are constant at 125, then, for this values of 125 creates a new variable having the unrate corresponding to the value of trend equal to unratekey, which in this case is 125.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30122
#7

28 Jul 2020, 19:52

I'm also uncertain what you are trying to do, but maybe the following does it:

Code:

rangestat (mean) unrate dt_survey_q, interval(trend `x'key `x'key)

Note: -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer, and is available from SSC.

That said, the problem as stated is ill-posed. You refer to "the" value of unrate or dt_survey_q for a given value of trend. But there are many observations for every value of trend in your data. And while it is true that in the example data you show, for any value of trend, all the corresponding observations of unrate or dt_survey_q are the same. But if that doesn't hold in all of your data, then there is no solution to your problem as any of the various values of unrate or dt_survey_q that co-appear with the keyed value of trend would be a candidate and you have given no rule for selecting among them. (The code I've shown here will give you the mean of all such values.) If you are confident that unrate and dt_survey_q are always the same for all observations with a given value of trend, then I suggest you have the code verify that beforehand:

Code:

by trend (unrate), sort: assert unrate[1] == unrate[_N] by trend (dt_survey_q), sort: assert dt_survey_q[1] = dt_survey_q[_N]

There is an additional problem you may stumble over. Your variable trend is a float. Now, in the example data, the values of trend are all integers. But if that is not true throughout the data, some of your lookups may fail because exact matching of floating-point numbers cannot be relied on.

Last edited by Clyde Schechter; 28 Jul 2020, 20:05.
Comment

Announcement

How to find the value of a variable using condition on another variable?

Comment

Comment

Comment

Comment

Comment

Comment