Generating categorical dummy variable

Frank Marshall

Join Date: Mar 2019

Posts: 4
#1

30 Mar 2019, 06:29

Hi all,

I need some help with this.

I would like to generate a new binary variable EDUCATION equal to 1 if HIGH SCHOOL (an existing variable) is above the median value and 0 otherwise.

Thankful for any help.

Frank
Tags: logit

Rich Goldstein

Join Date: Mar 2014

Posts: 4494
#2

30 Mar 2019, 08:36

"HIGH SCHOOL" cannot be the name of a Stata variable (spaces not allowed) so I use "hs" instead:

Code:

egen median=median(hs)
gen byte EDUCATION=hs>=median

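You did not say what you wanted to do with observations that are exactly at the median; I arbitrarily put them into the "higher" group.

If hs is ever missing you will also want to run the following, because Stata treats missing values as larger than any number, so hs>=median would otherwise code them as 1: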

Code:

replace EDUCATION=. if hs==.

you probably also want to attach labels to this new variable; see

Code:

help label
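
For illustration only, here is a minimal sketch that pulls these steps together. It takes the median from r(p50) as left behind by summarize, detail, keeps missing hs as missing in the same command, and attaches value labels. The toy data, the label name edlbl, and the label wording are assumptions and not part of the original question.

```stata
* Toy data so the sketch runs on its own (assumption: hs is numeric)
clear
set obs 10
generate hs = runiform()*10
replace hs = . in 1                  // one missing value, to show the handling

* The median is returned in r(p50) after summarize, detail
quietly summarize hs, detail

* 1 if at or above the median, 0 if below, missing if hs is missing
generate byte EDUCATION = hs >= r(p50) if !missing(hs)

* Attach value labels (edlbl is a made-up label name)
label define edlbl 0 "Below median" 1 "At or above median"
label values EDUCATION edlbl
tabulate EDUCATION, missing
```

As in the code above, observations exactly at the median go into the higher group; use > instead of >= if a strict "above the median" split is wanted.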

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17743
#3

30 Mar 2019, 08:37

Frank:
welcome to this forum.
The following toy-example may help:

Code:

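* toy data: 10 observations with values between 0 and 10
clear
set obs 10
g HIGH_SCHOOL=runiform()*10
* summarize with the detail option leaves the median in r(p50)
quietly sum HIGH_SCHOOL, d
* 1 if strictly above the median, 0 otherwise
g EDUCATION=1 if HIGH_SCHOOL > r(p50)
replace EDUCATION=0 if HIGH_SCHOOL <=r(p50)
list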

     +---------------------+
     | HIGH_S~L   EDUCAT~N |
     |---------------------|
  1. | 3.488717          1 |
  2. | 2.668857          0 |
  3. | 1.366463          0 |
  4. | .2855687          0 |
  5. | 8.689333          1 |
     |---------------------|
  6. | 3.508549          1 |
  7. | .7110509          0 |
  8. |  3.23368          0 |
  9. | 5.551032          1 |
 10. | 8.759911          1 |
     +---------------------+

PS: crossed in the cyberspace with Rich's more efficient code.

Kind regards,
Carlo
(Stata 19.0)

Frank Marshall

Join Date: Mar 2019

Posts: 4
#4

30 Mar 2019, 12:32

Thanks, Carlo and Rich. I've found your responses very helpful.

Frank Marshall

Join Date: Mar 2019

Posts: 4
#5

31 Mar 2019, 04:07

Sorry, I need some help again!

I have run a logistic regression model on a UK-wide dataset comprising the variables education, age, sex and family. How can I re-estimate the same model on a subset of the dataset, i.e. respondents in London only?

Thanks in advance!

Nick Cox

Join Date: Mar 2014

Posts: 35793
#6

31 Mar 2019, 04:33

The answers in #2 and #3 already show that you can select a subset of observations using if. So type

Code:

help if

to find out how to use that qualifier.
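
As a rough sketch of what that looks like for a logit model, assuming a 0/1 indicator named london and a binary dependent variable named outcome (both names are hypothetical, since the thread does not say how the region or the dependent variable is stored), on simulated data:

```stata
* Simulated stand-in data; all names and values here are assumptions
clear
set seed 12345
set obs 500
generate byte london = runiform() < 0.3
generate age = 18 + floor(runiform()*60)
generate byte sex = runiform() < 0.5
generate byte family = runiform() < 0.5
generate byte education = runiform() < 0.5
generate byte outcome = runiform() < invlogit(-1 + 0.5*education)

* Full sample
logit outcome education age sex family

* Same model, restricted to London respondents with the if qualifier
logit outcome education age sex family if london == 1
```

If the region is stored as a string or as a labelled category instead, the condition after if changes accordingly; the model specification itself stays the same.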

(This is really a different question, so in future please start a new thread if your question is different, i.e. no longer matches the thread title.)

Frank Marshall

Join Date: Mar 2019

Posts: 4
#7

31 Mar 2019, 05:44

Thanks Nick!