  • Generate a variable with Conditions

    I am very new to Stata and am struggling to figure out the commands. I am trying to generate a variable, call it "YESOR", from three existing variables (IRY, EWY, CPY). These variables are coded 1 for yes and 2 for no. I need YESOR to show whether an observation answered yes to any of the three variables. I also need a variable that flags observations that answered yes to all three (I would call this "YESAND"). Any help is appreciated.

  • #2
    You have not supplied a data example, but it appears that you want the -egen- command: in particular, one of the first three functions listed in the help file for your first goal, and the fourth function (concat) for your second goal. So first, please read the FAQ and then see
    Code:
    h egen
    Sorry, I missed your last sentence the first time. Again, use the -egen- command with the rowtotal function (but that would call for a second step to get to an indicator variable).

    Finally, analysis will almost certainly be easier and more informative if you code "no" as 0; use -replace- for that.
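
A minimal sketch of the two steps described above, assuming the three variables hold only the values 1 and 2. The names NYES, YESOR and YESAND are illustrative, and the eight-row dataset is made up to cover every combination. Note that rowtotal() treats missing values as 0, so real data with missings would need separate handling.

```stata
* Build a small test dataset covering all 8 combinations of 1 (yes) and 2 (no)
clear
input float(IRY EWY CPY)
1 1 1
1 1 2
1 2 1
1 2 2
2 1 1
2 1 2
2 2 1
2 2 2
end

* Step 1: recode "no" from 2 to 0, so each variable is a (0, 1) indicator
foreach v in IRY EWY CPY {
    replace `v' = 0 if `v' == 2
}

* Step 2: count the yes answers in each row, then turn the count into indicators
egen NYES = rowtotal(IRY EWY CPY)
gen YESOR = NYES >= 1
gen YESAND = NYES == 3

list, sep(0)
```

With (0, 1) coding the row total is simply the number of yes answers, which is why the second step reduces to comparing it with 1 and 3.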
    Last edited by Rich Goldstein; 16 Oct 2023, 14:24.



    • #3
      Berliner?

      Here are some details adding to Rich Goldstein's explanation. You only need one method that works, so follow your taste.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(IRY EWY CPY)
      1 1 1
      1 1 2
      1 2 1
      1 2 2
      2 1 1
      2 1 2
      2 2 1
      2 2 2
      end
      
      gen ANY1 = min(IRY, EWY, CPY) == 1
      
      gen ANY2 = (IRY == 1) | (EWY == 1) | (CPY == 1)
      
      egen ANY3 = anymatch(IRY EWY CPY), values(1)
      
      gen ANY4 = 0
      
      foreach v in IRY EWY CPY {
          replace ANY4 = 1 if `v' == 1
      }
      
      list, sep(0)
      
           +---------------------------------------------+
           | IRY   EWY   CPY   ANY1   ANY2   ANY3   ANY4 |
           |---------------------------------------------|
        1. |   1     1     1      1      1      1      1 |
        2. |   1     1     2      1      1      1      1 |
        3. |   1     2     1      1      1      1      1 |
        4. |   1     2     2      1      1      1      1 |
        5. |   2     1     1      1      1      1      1 |
        6. |   2     1     2      1      1      1      1 |
        7. |   2     2     1      1      1      1      1 |
        8. |   2     2     2      0      0      0      0 |
           +---------------------------------------------+


      As Rich says, (0, 1) indicators are more useful than (1, 2) indicators. The tutorial at https://journals.sagepub.com/doi/pdf...36867X19830921 is standard stuff, but that's the main point.
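
One reason (0, 1) indicators are convenient is that their mean is the proportion of observations coded 1. A small sketch of that point, using the same made-up eight-row dataset and an illustrative variable name YESOR:

```stata
* Sketch: with a (0, 1) indicator, the mean is the proportion of "yes"
clear
input float(IRY EWY CPY)
1 1 1
1 1 2
1 2 1
1 2 2
2 1 1
2 1 2
2 2 1
2 2 2
end

gen YESOR = (IRY == 1) | (EWY == 1) | (CPY == 1)

* summarize leaves the mean in r(mean); here 7 of the 8 rows have at least one yes
summarize YESOR
display r(mean)
```

The same summary of a (1, 2) variable would give a mean between 1 and 2, which is much harder to read directly.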

      Flagging yes for all three is a variation on methods 1, 2, and 4.
      Last edited by Nick Cox; 16 Oct 2023, 14:42.



      • #4
        Here are solutions for all variables being 1.

        Solution 1: if all values are 1, their maximum must be 1 too.

        Solution 2: use & (logical AND)

        Solution 3: look for any 2; if we find any, it isn't true that all are 1.

        Solution 4: similar to 3: assume all are 1, but if we find any 2, we change our mind.

        Notice that with just 8 possible combinations, it is both easy and a good idea to test with a small dataset constructed for the purpose.

        For more on true and false in Stata, see https://www.stata.com/support/faqs/d...rue-and-false/
        and https://journals.sagepub.com/doi/pdf...867X1601600117



        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float(IRY EWY CPY)
        1 1 1
        1 1 2
        1 2 1
        1 2 2
        2 1 1
        2 1 2
        2 2 1
        2 2 2
        end
        
        gen ALL1 = min(IRY, EWY, CPY) == 1 
        
        gen ALL2 = (IRY == 1) & (EWY == 1) & (CPY == 1)
        
        egen ALL3 = anymatch(IRY EWY CPY), values(2)
        replace ALL3 = 1 - ALL3 
        
        gen ALL4 = 1
        
        foreach v in IRY EWY CPY { 
            replace ALL4 = 0 if `v' == 2 
        }
        
        list, sep(0)
        
         
             +---------------------------------------------+
             | IRY   EWY   CPY   ALL1   ALL2   ALL3   ALL4 |
             |---------------------------------------------|
          1. |   1     1     1      1      1      1      1 |
          2. |   1     1     2      1      0      0      0 |
          3. |   1     2     1      1      0      0      0 |
          4. |   1     2     2      1      0      0      0 |
          5. |   2     1     1      1      0      0      0 |
          6. |   2     1     2      1      0      0      0 |
          7. |   2     2     1      1      0      0      0 |
          8. |   2     2     2      0      0      0      0 |
             +---------------------------------------------+



        • #5
          Oops. In #4 it should be

          Code:
          gen ALL1 = max(IRY, EWY, CPY) == 1
          My own list shows that the previous version was wrong!
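
A slip like this can also be caught automatically rather than by reading the listing. As a sketch, -assert- can check that the max() version agrees with the logical AND version on the small test dataset. One assumption to flag: this relies on there being no missing values, because max() ignores missings while a comparison such as IRY == 1 is false for a missing value, so the two definitions could disagree on incomplete data.

```stata
* Sketch: check that two definitions of "all three are yes" agree
clear
input float(IRY EWY CPY)
1 1 1
1 1 2
1 2 1
1 2 2
2 1 1
2 1 2
2 2 1
2 2 2
end

gen ALL1 = max(IRY, EWY, CPY) == 1
gen ALL2 = (IRY == 1) & (EWY == 1) & (CPY == 1)

* assert stops with an error if the two variables differ in any observation
assert ALL1 == ALL2

* only the row 1 1 1 should be flagged
count if ALL1
```

Swapping max() back to min() in the first -generate- makes the -assert- fail, which is the behaviour wanted from a test.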

