  • Generating categorical dummy variable

    Hi all,

    I need some help with this.

    I would like to generate a new binary (dummy) variable EDUCATION equal to 1 if HIGH SCHOOL (an existing variable) is above the median value and 0 otherwise.

    Thankful for any help.

    Frank

  • #2
    "HIGH SCHOOL" cannot be the name of a Stata variable (spaces not allowed) so I use "hs" instead:
    Code:
    egen median=median(hs)
    gen byte EDUCATION=hs>=median
    You did not say what you wanted to do with observations that are exactly at the median; I arbitrarily put them into the "higher" group.

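    If hs is ever missing, note that Stata treats missing as larger than any number, so hs>=median is true and the observation is coded 1. You will then also want:
    Code:
    replace EDUCATION=. if missing(hs)
    You probably also want to attach labels to this new variable; see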
    Code:
    help label
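
    As a minimal sketch of what that labelling could look like (the label name edulbl and the label texts are illustrative assumptions, not something from the original question), matching the coding above where values at the median go into the higher group:

    ```stata
    * edulbl is a hypothetical value-label name; choose any valid name
    label define edulbl 0 "Below median" 1 "At or above median"
    * attach the value label to the new dummy
    label values EDUCATION edulbl
    * a variable label documents how the dummy was built
    label variable EDUCATION "hs at or above sample median"
    * check the result, including any missing values
    tabulate EDUCATION, missing
    ```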

    • #3
      Frank:
      Welcome to this forum.
      The following toy example may help:
      Code:
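      * toy data: 10 observations of a random variable
      set obs 10
      g HIGH_SCHOOL=runiform()*10
      * summarize with the detail option leaves the median in r(p50)
      quietly sum HIGH_SCHOOL, d
      * 1 if strictly above the median, 0 otherwise
      g EDUCATION=1 if HIGH_SCHOOL > r(p50)
      replace EDUCATION=0 if HIGH_SCHOOL <=r(p50)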
      . list
      
           +---------------------+
           | HIGH_S~L   EDUCAT~N |
           |---------------------|
        1. | 3.488717          1 |
        2. | 2.668857          0 |
        3. | 1.366463          0 |
        4. | .2855687          0 |
        5. | 8.689333          1 |
           |---------------------|
        6. | 3.508549          1 |
        7. | .7110509          0 |
        8. |  3.23368          0 |
        9. | 5.551032          1 |
       10. | 8.759911          1 |
           +---------------------+
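
      The same idea can probably be written more compactly. The sketch below assumes HIGH_SCHOOL is already in memory and uses a hypothetical name, EDUCATION2, so that it can be run after the toy example without clashing with EDUCATION. Because Stata treats missing values as larger than any number, the if qualifier is there to keep missing values of HIGH_SCHOOL from being coded 1.

      ```stata
      * recompute the median so that r(p50) is current
      quietly summarize HIGH_SCHOOL, detail
      * the logical expression evaluates to 1 or 0; the if qualifier leaves
      * EDUCATION2 (hypothetical name) missing wherever HIGH_SCHOOL is missing
      generate byte EDUCATION2 = HIGH_SCHOOL > r(p50) if !missing(HIGH_SCHOOL)
      * in the toy data, which has no missing values, both versions should agree
      assert EDUCATION2 == EDUCATION
      ```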
      PS: Crossed in cyberspace with Rich's more efficient code.
      Kind regards,
      Carlo
      (Stata 18.0 SE)

      • #4
        Thanks Carlo and Rich. I've found your responses very helpful.

        • #5
          Sorry, I need some help again!

          I have run a logistic regression model on a UK-wide dataset comprising the variables education, age, sex and family. How can I re-estimate the same model on a subset of the dataset, i.e. respondents in London only?

          Thanks in advance!

          • #6
            The answers in #2 and #3 already show that you can select a subset of observations using if. So type

            Code:
            help if
            to find out how to use that qualifier.
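
            As a rough sketch only: the question does not name the outcome variable or the variable that identifies London, so y and region below are hypothetical names, and region is assumed to be a string variable.

            ```stata
            * full-sample model (y is a hypothetical outcome name)
            logit y education age sex family
            * same model, restricted to respondents in London
            * (region is a hypothetical string variable)
            logit y education age sex family if region == "London"
            * if region is numeric instead, compare against its code,
            * e.g. if region == 7 (7 is a made-up code)
            ```

            The if qualifier goes after the variable list and before any options.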

            (This is really a different question, so in future please start a new thread if your question is different, i.e. no longer matches the thread title.)

            • #7
              Thanks Nick!
