Creating dummy variable based on percentiles

Wali Ullah

Join Date: Mar 2019

Posts: 51
#1

15 Aug 2019, 00:41

Hi Everyone,

I have a variable G-Index with the following distribution:

 Governance |
      Index |
  (Gompers, |
     Ishii, |
   Metrick) |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          1        0.02        0.02
          2 |          9        0.14        0.16
          3 |         55        0.87        1.02
          4 |        161        2.53        3.56
          5 |        331        5.21        8.77
          6 |        549        8.64       17.41
          7 |        738       11.61       29.02
          8 |        857       13.49       42.51
          9 |        917       14.43       56.94
         10 |        803       12.64       69.58
         11 |        698       10.99       80.56
         12 |        521        8.20       88.76
         13 |        396        6.23       95.00
         14 |        189        2.97       97.97
         15 |         98        1.54       99.51
         16 |         20        0.31       99.83
         17 |          6        0.09       99.92
         18 |          4        0.06       99.98
         19 |          1        0.02      100.00
------------+-----------------------------------
      Total |      6,354      100.00

I am trying to create a variable treat, which is equal to 0 if the value of G-Index is in the top 25% and 1 if it's in the bottom 75%. Can anyone help me with the proper code for that?

Thanks!
Nick Cox

Join Date: Mar 2014

Posts: 35724
#2

15 Aug 2019, 02:01

I don't know what the variable name is, as G-index is not a legal name. Regardless, note that something like

Code:

gen wanted = gindex <= 10 if gindex < .

will generate 1 in 70% of your observations (for which you have non-missing values) while

Code:

gen wanted = gindex <= 11 if gindex < .

will generate 1 in about 81%. Necessarily those are the only choices if the aim is 75%. But why do this at all?
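
As a rough sketch of how the cut-off could be read from the data rather than typed in by hand, assuming the variable is named gindex as in the code above (treat and the local macro p75 are illustrative names, not anything from the original data):

Code:

```stata
* Sketch only: gindex as assumed above; treat and p75 are illustrative names
summarize gindex, detail
local p75 = r(p75)
display "75th percentile of gindex: `p75'"

* 1 if at or below the cut-off, 0 above it, missing where gindex is missing
generate byte treat = gindex <= `p75' if !missing(gindex)

* Check what share of observations actually ends up in each group
tabulate treat
```

With the distribution shown in #1, the cumulative percent first passes 75 at a value of 11, so this should reproduce the second command above rather than give an exact 75/25 split. An exact split is not available with a variable this heavily tied.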
Wali Ullah

Join Date: Mar 2019

Posts: 51
#3

19 Aug 2019, 22:10

Nick Cox the index is named gindex (lower values indicate better-governed firms and higher values indicate the opposite). I need to create this dummy as a treatment variable in my study. I need to take the top 75-80% of the companies based on gindex as treated (dummy=1) and the bottom 20-25% as control (dummy=0). I would really appreciate your assistance with the coding.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#4

20 Aug 2019, 00:45

So, your question is answered in #2. If you're asking a new question, I don't understand what it is.
Wali Ullah

Join Date: Mar 2019

Posts: 51
#5

20 Aug 2019, 00:55

Nick Cox Thank you. Got your answer.

I have the gindex figures for 1994, 1996, 1998, etc. Data for the years in between, i.e. 1995, 1997, etc., are missing. If I want to fill each missing year with the preceding year's data, what should the code be?

For example, 1994's data should be autofilled in 1995, 1996's data autofilled in 1997, etc.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#6

20 Aug 2019, 01:29

To me that sounds a very bad idea. The idea that things changed, and then did not change, in this way, is utterly implausible. This is a recipe for a failed grade or a rejected paper, assuming competent teachers or reviewers. The data you have are the data you should analyse.
Wali Ullah

Join Date: Mar 2019

Posts: 51
#7

20 Aug 2019, 02:36

Nick Cox this is the way this data works. It's calculated once every 2 years, and researchers use it in the way I mentioned earlier. G-index is a measure of a firm's governance mechanism, and it doesn't change every year.

Really require your help with the coding.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#8

20 Aug 2019, 02:51

Ok; thanks for the explanation. But there is utterly no point in getting exactly the same result twice. You don't need help in coding here. The result for 1995 is by fiat the result for 1994. Why make Stata do the same work twice over?
Wali Ullah

Join Date: Mar 2019

Posts: 51
#9

20 Aug 2019, 04:52

Nick Cox the dataset I got only gives the data in this way
1994
1996
1998

which is why I need Stata to fill up 1995 with 1994's data and so on.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#10

20 Aug 2019, 04:55

Sorry, but I have nothing to add to my previous comments. Giving you code to do something that appears silly at best is not in your best interests. If anyone else thinks differently, they will respond.