  • Dummy variable that changes per group of observations

    Hello everyone,

    I have two dummy variables ger1 and ger2.

    Essentially what I need to do is to create another binary variable (named here dummy) in which, for every "group" of observations in which ger1=1, if it has at least one observation where ger2=1, then dummy=1 for the entire group.

    Sorry if this is confusing, I include a data sample hoping it helps clarify this a bit.
    ("dummy" in this case, was done by hand, it is what I want it to look like - and what I need your help with to learn how to code it).

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(ger1 ger2 dummy)
    0 0 0
    0 0 0
    1 0 0
    1 0 0
    1 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    1 0 1
    1 0 1
    1 0 1
    1 0 1
    1 1 1
    1 1 1
    0 0 0
    0 0 0
    1 0 0
    1 0 0
    0 0 0
    end
    I'm using Stata 14.1 on Mac.
    If anyone has any insight on how to do this, it would be much appreciated.

    Thanks in advance,
    Ana
    Last edited by Ana Vargas; 09 Sep 2020, 04:07. Reason: dummy variables

  • #2
    This all hinges on identifying groups automatically. As I understand it a group for you is in other terms a spell or run of values of 1 for ger1. That doesn't really make sense to me unless there is something like a time variable underlying what you show. None is evident, but I created a pseudo-time variable which allows me to reach for tsspell from SSC as a convenience tool.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(ger1 ger2 dummy)
    0 0 0
    0 0 0
    1 0 0
    1 0 0
    1 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    0 0 0
    1 0 1
    1 0 1
    1 0 1
    1 0 1
    1 1 1
    1 1 1
    0 0 0
    0 0 0
    1 0 0
    1 0 0
    0 0 0
    end
    
    gen long t = _n
    tsset t
    ssc install tsspell
    tsspell, cond(ger1 == 1) spell(group)
    
    list, sepby(group)
    
         +------------------------------------------------+
         | ger1   ger2   dummy    t   _seq   group   _end |
         |------------------------------------------------|
      1. |    0      0       0    1      0       0      0 |
      2. |    0      0       0    2      0       0      0 |
         |------------------------------------------------|
      3. |    1      0       0    3      1       1      0 |
      4. |    1      0       0    4      2       1      0 |
      5. |    1      0       0    5      3       1      1 |
         |------------------------------------------------|
      6. |    0      0       0    6      0       0      0 |
      7. |    0      0       0    7      0       0      0 |
      8. |    0      0       0    8      0       0      0 |
      9. |    0      0       0    9      0       0      0 |
     10. |    0      0       0   10      0       0      0 |
     11. |    0      0       0   11      0       0      0 |
     12. |    0      0       0   12      0       0      0 |
     13. |    0      0       0   13      0       0      0 |
         |------------------------------------------------|
     14. |    1      0       1   14      1       2      0 |
     15. |    1      0       1   15      2       2      0 |
     16. |    1      0       1   16      3       2      0 |
     17. |    1      0       1   17      4       2      0 |
     18. |    1      1       1   18      5       2      0 |
     19. |    1      1       1   19      6       2      1 |
         |------------------------------------------------|
     20. |    0      0       0   20      0       0      0 |
     21. |    0      0       0   21      0       0      0 |
         |------------------------------------------------|
     22. |    1      0       0   22      1       3      0 |
     23. |    1      0       0   23      2       3      1 |
         |------------------------------------------------|
     24. |    0      0       0   24      0       0      0 |
         +------------------------------------------------+
    
    Now what you want follows immediately:

    Code:
    egen wanted = max(ger2), by(group)

    or

    Code:
    egen wanted = max(ger2) if group > 0, by(group)

    Note that you must install tsspell before you can use it. Its use for defining spells is not compulsory and far from the only solution. One fairly detailed tutorial at https://www.stata-journal.com/articl...article=dm0029 does not even mention it.
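
    As a minimal sketch of one such alternative (not taken from the tutorial; the toy data and the helper variable start are assumptions made for illustration), spells can be marked by hand: flag each observation where ger1 switches to 1, then take a running sum of the flags. This assumes the data are already in the intended order.

    ```stata
    clear
    input byte(ger1 ger2)
    0 0
    1 0
    1 0
    0 0
    1 0
    1 1
    0 0
    end

    gen long t = _n                                   // pseudo-time, as above
    * start (hypothetical name) flags the first observation of each run of ger1 == 1
    gen byte start = (ger1 == 1) & (ger1[_n-1] != 1)  // ger1[0] is missing, so a run may begin at observation 1
    gen long group = sum(start)                       // running count of runs started so far
    replace group = 0 if ger1 != 1                    // observations outside any run get group 0
    egen wanted = max(ger2), by(group)
    sort t
    list, sepby(group)
    ```

    The group variable built this way should play the same role as the one created by tsspell, so either egen call above applies unchanged.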

    It is easy enough to extend this to panel or longitudinal datasets, which requires only an appropriate tsset.
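
    A sketch of the panel case, assuming a panel identifier called id and a time variable t (both hypothetical names, with toy data). Because spell numbers restart within each panel, the panel identifier joins the spell variable in by().

    ```stata
    * assumes tsspell is already installed: ssc install tsspell
    clear
    input byte(id t ger1 ger2)
    1 1 1 0
    1 2 1 1
    1 3 0 0
    2 1 1 0
    2 2 0 0
    2 3 1 1
    end

    tsset id t                               // declare panel and time variables
    tsspell, cond(ger1 == 1) spell(group)    // spells are identified separately within each panel
    * group numbers repeat across panels, so condition on id as well
    egen wanted = max(ger2), by(id group)
    sort id t
    list, sepby(id)
    ```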



    • #3
      Thank you so much, this was exactly what I needed. Much appreciated!



      • #4
        Hello,

        I have a further question on this:

        Suppose I now require at least 2 (or 3, or any given number greater than 1) observations with ger2 equal to 1 before I count a run of data as a spell.

        This way I can no longer use:

        Code:
        egen wanted = max(ger2), by(group)
        as Nick suggested.

        Does anyone have any suggestions?



        • #5

          Code:
          egen wanted = total(ger2), by(group)
          replace wanted = wanted > 42
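          for any value of 42: that is, replace 42 with your own threshold, and use >= rather than > if "at least" that many is what you mean.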

          If the variable in question could be anything other than 0 or 1 or missing, then you want

          Code:
          egen wanted = total(ger2 == 1), by(group)
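
          Put together for the "at least 2" case, a minimal sketch on toy data might look as follows. The local macro k, the helper variables start and n_ones, and the hand-built group variable (standing in for the one from tsspell) are all assumptions made for illustration.

          ```stata
          clear
          input byte(ger1 ger2)
          0 0
          1 1
          1 0
          0 0
          1 1
          1 1
          1 0
          0 0
          end

          gen long t = _n
          * hand-built spell identifier: 0 outside runs of ger1 == 1, then 1, 2, ...
          gen byte start = (ger1 == 1) & (ger1[_n-1] != 1)
          gen long group = sum(start)
          replace group = 0 if ger1 != 1

          local k = 2                                    // minimum number of ger2 == 1 required
          egen n_ones = total(ger2 == 1), by(group)      // count of ger2 == 1 within each group
          gen byte wanted = (group > 0) & (n_ones >= `k')
          sort t
          list, sepby(group)
          ```

          Here only the second run, with two observations of ger2 equal to 1, is flagged. The condition group > 0 keeps observations outside any spell at 0 even if ger2 happens to be 1 there.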



          • #6
            Perfect, thank you!



            • #7
              I'm now trying something more complicated and I'm having some difficulties.

              Suppose I have a third binary variable, ger3, and I now want a group (potentially using tsspell as well) that contains all observations with ger1=1, provided at least one of them has ger2=1, and that also takes in the observation immediately after the run if ger3=1 there.

              Basically I want a spell that adds one observation if ger3=1 at the end of the spell already defined by:

              Code:
              tsspell, cond(ger1 == 1) spell(group)
              egen wanted = max(ger2), by(group)
              I provide an example of my data, where dummy is essentially what I want to achieve for the entire group.

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input float(ger1 ger2 ger3 dummy)
              0 0 1 .
              0 0 1 .
              0 0 1 .
              0 0 1 .
              0 0 1 .
              1 0 0 .
              0 0 1 .
              1 0 0 1
              1 1 0 1
              1 1 0 1
              1 0 0 1
              1 0 0 1
              1 0 0 1
              1 0 0 1
              1 0 0 1
              1 0 0 1
              0 0 1 1
              1 0 0 .
              1 0 0 .
              1 0 0 .
              1 0 0 .
              1 0 0 .
              0 0 1 .
              0 0 1 .
              0 0 1 .
              0 0 1 .
              0 0 1 .
              1 0 0 .
              1 0 0 .
              0 0 1 .
              end
              Thank you in advance.
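
              One possible approach, offered only as a sketch under the reading that the single observation directly after a run of ger1=1 joins that run when ger3=1: build the spell identifier first, extend it by one observation, and only then check ger2. The toy data and the helper names start and has2 are assumptions made for illustration, and the spell identifier is built by hand rather than with tsspell.

              ```stata
              clear
              input byte(ger1 ger2 ger3)
              0 0 1
              1 0 0
              0 0 1
              1 0 0
              1 1 0
              1 0 0
              0 0 1
              0 0 1
              1 0 0
              end

              gen long t = _n
              * spell identifier: 0 outside runs of ger1 == 1, then 1, 2, ...
              gen byte start = (ger1 == 1) & (ger1[_n-1] != 1)
              gen long group = sum(start)
              replace group = 0 if ger1 != 1

              * extend each spell by one observation when ger3 == 1 directly follows it;
              * testing ger1[_n-1] (not group[_n-1]) stops the extension from cascading
              replace group = group[_n-1] if ger1 != 1 & ger3 == 1 & ger1[_n-1] == 1

              egen byte has2 = max(ger2 == 1), by(group)     // 1 if any ger2 == 1 in the group
              gen byte wanted = 1 if group > 0 & has2 == 1   // missing otherwise, as in dummy
              sort t
              list, sepby(group)
              ```

              In this toy example only observations 4 to 7 are flagged: the run with ger2=1 plus the ger3=1 observation that follows it.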
