
  • What will be the syntax if there are multiple conditions of the same category?

    Hi everyone
    Sorry for asking fundamental questions but I am a new user. What is the syntax when several conditions apply to the same variable? For example, I want to categorize Education_level values 1, 2, and 3 as low education. How can I do that? I used:
    gen education_3cat =.
    replace education_3cat = 1 if Education_level 1 | Education_level 2 | Education_level 3

    I also tried "&", but that did not work either. Could you please help me with this?

  • #2
    The closest thing to what you tried that would be legal syntax is:
    Code:
    replace education_3cat = 1 if Education_level == 1 | Education_level == 2 | Education_level == 3
    Using & instead of | would produce incorrect results. It is never possible for Education_level == 1 & Education_level == 2 & Education_level == 3 to be true: the value of Education_level can be 1, 2, or 3, but it can only be one of them, not all three at once.
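
    To see the difference concretely, here is a minimal sketch on Stata's bundled auto dataset. Using rep78 as a stand-in for Education_level is an assumption made only for illustration, since the original data are not shown.

    ```stata
    * rep78 stands in for Education_level (assumption for illustration)
    sysuse auto, clear

    * "or": true when rep78 equals any one of the three values
    count if rep78 == 1 | rep78 == 2 | rep78 == 3

    * "and": one variable cannot equal three different values at once,
    * so no observation can ever satisfy this condition
    count if rep78 == 1 & rep78 == 2 & rep78 == 3
    ```

    The first count picks up every observation with rep78 of 1, 2, or 3; the second is always zero, whatever the data.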

    But your code can be improved in three ways. First, do not create yes/no variables like this, using 1 for yes and a missing value for no. That is a recipe for serious errors when you use these variables later. Always code the no as 0, never anything else. Reserve missing values for situations where the true value is unknowable or not available. Second, the lengthy construction with | and == can be abbreviated with the -inlist()- function. Third, use logical expressions: a logical expression evaluates to 1 when true and 0 when false, so there is no need for two commands, one of them with an -if- qualifier. So, the best way to create this variable is:
    Code:
    gen education_3cat = inlist(Education_level, 1, 2, 3)
    Short, simple, and transparent!
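
    As a quick check that the short form does the same job as the long one, here is a sketch on the bundled auto dataset. As before, rep78 standing in for Education_level is an assumption, and the variable names low_long, low_short, and low_range are hypothetical.

    ```stata
    * rep78 stands in for Education_level (assumption for illustration)
    sysuse auto, clear

    * long form: a logical expression evaluates to 1 if true, 0 if false
    gen byte low_long = (rep78 == 1 | rep78 == 2 | rep78 == 3)

    * short form with inlist()
    gen byte low_short = inlist(rep78, 1, 2, 3)

    * inrange() gives the same result here only because rep78 is integer-valued
    gen byte low_range = inrange(rep78, 1, 3)

    * all three versions agree on every observation
    assert low_long == low_short
    assert low_short == low_range
    ```

    Storing the result as a byte is optional; it just saves memory for a 0/1 variable.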

    "Sorry for asking fundamental questions but I am a new user."
    No reason to apologize for this. New users and fundamental questions are welcome here. We were all new users at one time and were asking similar questions.



    • #3
      Thanks a lot for this big help



      • #4
        The solution in #2 assumes that there are no missing values in "Education_level". If there are, inlist() returns 0 for them, so those observations are coded "0" in "education_3cat", a valid value, which may not be what you want. For example:
        Code:
        . sysuse auto, clear
        (1978 automobile data)
        
        . tab1 rep78, mi
        
        -> tabulation of rep78  
        
             Repair |
        record 1978 |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  1 |          2        2.70        2.70
                  2 |          8       10.81       13.51
                  3 |         30       40.54       54.05
                  4 |         18       24.32       78.38
                  5 |         11       14.86       93.24
                  . |          5        6.76      100.00
        ------------+-----------------------------------
              Total |         74      100.00
        
        . gen replev = inlist(rep78,1,2,3)
        
        . tab2 rep78 replev, mi
        
        -> tabulation of rep78 by replev  
        
            Repair |
            record |        replev
              1978 |         0          1 |     Total
        -----------+----------------------+----------
                 1 |         0          2 |         2
                 2 |         0          8 |         8
                 3 |         0         30 |        30
                 4 |        18          0 |        18
                 5 |        11          0 |        11
                 . |         5          0 |         5
        -----------+----------------------+----------
             Total |        34         40 |        74
        
        . drop replev
        
        . gen replev = inlist(rep78,1,2,3) if rep78 < .
        (5 missing values generated)
        
        . tab2 rep78 replev, mi
        
        -> tabulation of rep78 by replev  
        
            Repair |
            record |              replev
              1978 |         0          1          . |     Total
        -----------+---------------------------------+----------
                 1 |         0          2          0 |         2
                 2 |         0          8          0 |         8
                 3 |         0         30          0 |        30
                 4 |        18          0          0 |        18
                 5 |        11          0          0 |        11
                 . |         0          0          5 |         5
        -----------+---------------------------------+----------
             Total |        29         40          5 |        74
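
        The same safeguard can also be written with the missing() function, which some readers find easier to scan than a comparison with the missing value. This is a sketch of an equivalent alternative on the same dataset, not a different method:

        ```stata
        sysuse auto, clear

        * leave replev missing wherever rep78 is missing
        gen byte replev = inlist(rep78, 1, 2, 3) if !missing(rep78)

        * check the three groups against the table above
        count if replev == 1
        count if replev == 0
        count if missing(replev)
        ```

        Both `rep78 < .` and `!missing(rep78)` exclude the extended missing values (.a to .z) as well as the plain missing value, because every missing value sorts at or above `.`.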



        • #5
          Thank you so much for your help

