  • Generate dummy variable with OR condition

    Hi everyone,

    I am trying to create a dummy variable with an OR condition.

    I have the following variable:

    company_location_id_1
    1
    5
    3
    6
    2
    etc

    I need the dummy to equal 1 if the location is 1 or 2.

    I tried:
    generate Silicon_Valley = 1 if company_location_id_1 == company_location_id_1 == 1 | company_location_id_1 ==2 |
    I also tried:
    generate Silicon_Valley = 1 if company_location_id_1 == (1|2)

    But in both cases all observations get a value of 1, while I need everything else to be 0.

    Thank you in advance,

    Cristiano

  • #2
    Code:
    clear
    
    cls
    
    input byte company_location_id_1
    1
    5
    3
    6
    2
    end
    
    g indicator = 1 if inlist(co,1,2)
    
    br
    Or if you REALLY wanna generalize this
    Code:
    clear
    
    cls
    
    input byte company_location_id_1
    1
    5
    3
    6
    2
    end
    g indicator  = cond(inlist(co, 1,2), 1, 0)
    Last edited by Jared Greathouse; 28 Jun 2022, 07:56.



    • #3
      Code:
      generate Silicon_Valley = 1 if company_location_id_1 == 1 | company_location_id_1 ==2
      It should work.



      • #4
        The code above yields a variable that is 1 or missing, which is not what was asked for; instead of the code in #2, try
        Code:
        gen indicator=inlist(co,1,2)
        Re the two commands in #1: the second is clearly illegal. The first looks right except for the last character (and except for my first point above), but since you didn't show us how Stata responded or what was wrong with the result, I can't say for sure what you should have done instead (maybe my first point is the problem?).

        In the future, please post as advised in the FAQ.
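
        To make the 1/missing versus 1/0 difference concrete, here is a minimal sketch using the toy data from #2. The variable names with_if and with_inlist are made up for illustration.

        ```stata
        clear
        input byte company_location_id_1
        1
        5
        3
        6
        2
        end

        * 1 where the condition holds, missing (.) everywhere else
        generate with_if = 1 if inlist(company_location_id_1, 1, 2)

        * 1 where the condition holds, 0 everywhere else
        generate with_inlist = inlist(company_location_id_1, 1, 2)

        list
        ```

        Only the second variable is a true 0/1 dummy; the first would silently drop the non-matching observations from most estimation commands.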



        • #5
          Rich's code in #4 does the same as mine (the one I added after the fact), but is simpler, and therefore should be the one used.



          • #6
            Hi everyone, thanks for all the replies. The code worked wonderfully. I am copying it here for future reference:

            g indicator = cond(inlist(company_location_id_1, 1,2,3,9,10,11,12,13,14,15,16,17,18,19,21,22,28,29, 30), 1, 0)

            Thank you,
            Cristiano
            Last edited by Cristiano Bellavitis; 28 Jun 2022, 08:18.



            • #7
              This is tangential to the question, but I was interested in OP's claim that
              Code:
              generate Silicon_Valley = 1 if company_location_id_1 == (1|2)
              did not result in an error.

              I confirmed that the code does not produce an error (although it certainly doesn't do what OP intended); from what I can ascertain it is equivalent to
              Code:
              generate Silicon_Valley = 1 if company_location_id_1 == 1
              because, as I understand it, any non-zero numerical argument to a Boolean expression evaluates to TRUE. This means 1|2 evaluates to TRUE|TRUE, which is TRUE so evaluates to 1.

              Am I correct that this is how Stata is interpreting that code?
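
              One way to check this reading, as a rough sketch on the toy data from #2 (the variable names a and b are made up for illustration):

              ```stata
              * A logical expression returns 1 or 0; any non-zero argument counts as true
              display 1|2
              display 0|0

              clear
              input byte company_location_id_1
              1
              5
              3
              6
              2
              end

              generate a = 1 if company_location_id_1 == (1|2)
              generate b = 1 if company_location_id_1 == 1

              * Stata treats missing == missing as true, so this passes
              * only if a and b agree in every observation
              assert a == b
              ```

              If the assert runs silently, the two commands produced identical variables on these data.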



              • #8
                Cristiano Bellavitis what is the full list of location ids? And yes, that's correct Paul Dickman



                • #9
                  #7 Paul Dickman That's right in spirit. But, pedantically or otherwise, it is perhaps best grasped slightly differently: unlike many languages, Stata doesn't have explicit logical or Boolean types, or TRUE or FALSE as distinct values under those or any other names.

                  Stata does yield 1 or 0 as numerical results of logical expressions, and so 2|1 evaluates to 1.

                  #6 Jared Greathouse


                  Code:
                  g indicator = cond(inlist(company_location_id_1, 1,2,3,9,10,11,12,13,14,15,16,17,18,19,21,22,28,29, 30), 1, 0)
                  is simpler as

                  Code:
                  g indicator = inlist(company_location_id_1, 1,2,3,9,10,11,12,13,14,15,16,17,18,19,21,22,28,29, 30)
                  and could also be rewritten as a combination of inrange() calls. The more interesting point is that inlist() yields 1 or 0 as a logical result, so pushing it through cond() to do the same is harmless but needless.
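
                  As a sketch of the inrange() version, assuming the ids are integers: the test data below (ids 1 to 35 in a variable called id) and the names via_inlist and via_inrange are invented for illustration, not taken from the original dataset.

                  ```stata
                  clear
                  set obs 35
                  generate byte id = _n    // hypothetical integer ids 1..35

                  generate byte via_inlist = inlist(id, 1,2,3,9,10,11,12,13,14,15,16,17,18,19,21,22,28,29,30)

                  * The same set written as four contiguous runs
                  generate byte via_inrange = inrange(id, 1, 3) | inrange(id, 9, 19) | inrange(id, 21, 22) | inrange(id, 28, 30)

                  * Errors out if the two indicators differ anywhere
                  assert via_inlist == via_inrange
                  ```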



                  • #10
                    Nick Cox Yep I agree that not using cond is simpler.

                    I also agree that the inrange point (now that I know the full problem) is better.

                    That's why I asked how many total IDs there were. If this were the full list of them,
                    Code:
                    g indicator = inrange(co, 1,30) & !inrange(co,4,8)
                    is much more compact



                    • #11
                      Jared Greathouse You also need to exclude 20 and 23 to 27.
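
                      A sketch of the exclusion form with those extra gaps removed, again assuming integer ids; the test data (ids 1 to 35 in id) and the names target and compact are hypothetical:

                      ```stata
                      clear
                      set obs 35
                      generate byte id = _n    // hypothetical integer ids 1..35

                      * Reference indicator from the explicit list in #6
                      generate byte target = inlist(id, 1,2,3,9,10,11,12,13,14,15,16,17,18,19,21,22,28,29,30)

                      * Everything in 1..30 except 4-8, 20, and 23-27
                      generate byte compact = inrange(id, 1, 30) & !inrange(id, 4, 8) & id != 20 & !inrange(id, 23, 27)

                      assert compact == target
                      ```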
