Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Sequence of Values

    How to create a new variable called TEST that equals the value 1 (in all lines) whenever the sequence 1-2-1 occurs in the lines of variable VAR_B for each group of PAR_PESS and DT_DATE, if the interval between DT_DATE is greater or equal to 1, calculated from variables with value 1 in VAR_B (and create a variable named DIFF with the time difference value):

    Data example:
    PAR_PESS VAR_B DT_DATE
    1 1 01/01/23
    1 2 02/01/23
    1 1 03/01/23
    1 1 04/01/23
    1 2 05/01/23
    1 1 06/01/23
    2 2 01/01/23
    2 1 02/01/23
    2 1 03/01/23
    2 2 04/01/23
    2 1 05/01/23
    3 1 02/01/23
    3 2 03/01/23
    3 1 04/01/23
    3 2 05/01/23
    4 1 01/01/23
    4 2 02/01/23
    4 2 03/01/23
    4 1 04/01/23
    4 2 05/01/23
    4 2 06/01/23
    4 1 07/01/23
    4 2 08/01/23
    4 1 09/01/23


    I would like it to stay that way:
    PAR_PESS VAR_B DT_DATE TEST DIFF
    1 1 01/01/23 1 .
    1 2 02/01/23 1 .
    1 1 03/01/23 1 2
    1 1 04/01/23 . .
    1 2 05/01/23 1 .
    1 1 06/01/23 1 3
    2 2 01/01/23 . .
    2 1 02/01/23 1 .
    2 1 04/01/23 . .
    2 2 04/01/23 1 .
    2 1 05/01/23 1 3
    3 1 02/01/23 1 .
    3 2 03/01/23 1 .
    3 1 04/01/23 1 2
    3 2 05/01/23 . .
    4 1 01/01/23 1 .
    4 2 02/01/23 1 .
    4 2 03/01/23 . .
    4 1 04/01/23 1 3
    4 2 04/01/23 . .
    4 2 06/01/23 1 .
    4 1 07/01/23 1 3
    4 2 08/01/23 1 .
    4 1 09/01/23 1 2


  • #2
    This is mostly clear but we need please a data example using dataex to make clear how you are holding your dates,

    as a string variable

    as a numeric variable with value labels

    as a numeric variable with a date display format.

    See FAQ Advice #12 for more details.

    Comment


    • #3
      Originally posted by Nick Cox View Post
      This is mostly clear but we need please a data example using dataex to make clear how you are holding your dates,

      as a string variable

      as a numeric variable with value labels

      as a numeric variable with a date display format.

      See FAQ Advice #12 for more details.

      Code:
      clear
      input float par_pess byte var_b float(dt_date)
      1 1 23011
      1 2 23012
      1 1 23013
      1 1 23014
      1 2 23015
      1 1 23016
      2 2 23011
      2 1 23012
      2 1 23013
      2 2 23014
      2 1 23015
      3 1 23012
      3 2 23013
      3 1 23014
      3 2 23015
      4 1 23011
      4 2 23012
      4 2 23013
      4 1 23014
      4 2 23015
      4 2 23016
      4 1 23017
      4 2 23018
      4 1 23019
      end
      format %dD/N/CY dt_date

      Comment


      • #4
        Sorry, I thought I understood this but now don't think I do at all. You have values of TEST that are 1 that don't match triples of 1 2 1.

        Comment


        • #5
          My guess is that the wanted sequence is 1-2-1-2-1..., where the total number of tagged observations is greater or equal to three and an odd number. The complication here is the existence of duplicates.

          2 1 04/01/23 . .
          2 2 04/01/23 1 .
          4 1 04/01/23 1 3
          4 2 04/01/23 . .
          Therefore, the date variable does not uniquely identify the sequence of observations. The code below assumes that observations with the same combination of par_press and dt_date are no more than two. It also codes the wanted variable 0/1 instead of missing/1.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input byte(var_b par_pess) float dt_date byte(test diff)
          1 1 23011 1 .
          2 1 23042 1 .
          1 1 23070 1 2
          1 1 23101 . .
          2 1 23131 1 .
          1 1 23162 1 3
          2 2 23011 . .
          1 2 23042 1 .
          1 2 23101 . .
          2 2 23101 1 .
          1 2 23131 1 3
          1 3 23042 1 .
          2 3 23070 1 .
          1 3 23101 1 2
          2 3 23131 . .
          1 4 23011 1 .
          2 4 23042 1 .
          2 4 23070 . .
          1 4 23101 1 3
          2 4 23101 . .
          2 4 23162 1 .
          1 4 23192 1 3
          2 4 23223 1 .
          1 4 23254 1 2
          end
          format %tdN/D/Y dt_date
          
          gen long seq=_n
          bys par_pess (seq): gen tag1= sum(var_b==1)>=1 & var_b ==1
          bys par_pess (seq): gen tag2= var_b ==2 & sum(tag1)>=1
          egen tag3= concat(tag1 tag2)
          bys par_pess (dt_date seq): replace tag3="" if  dt_date== dt_date[_n+1] & var_b== var_b[_n-1]
          bys par_pess dt_date (seq): replace tag3="" if _n>1 &  !missing(tag3[1])
          bys par_pess (seq): gen wanted= cond(tag3==tag3[_n-1]|  tag3=="00"| missing(tag3), 0, 1)
          by par_pess: egen tot= total(wanted)
          by par_pess: replace tot= cond(tot<3, 0,cond(!mod(tot, 2), tot-1, tot))
          by par_pess: replace wanted= !wanted if sum(wanted)> tot
          drop seq tag? tot
          Res.:

          Code:
          . l, sepby(par_pess)
          
               +----------------------------------------------------+
               | var_b   par_pess    dt_date   test   diff   wanted |
               |----------------------------------------------------|
            1. |     1          1   01/01/23      1      .        1 |
            2. |     2          1   02/01/23      1      .        1 |
            3. |     1          1   03/01/23      1      2        1 |
            4. |     1          1   04/01/23      .      .        0 |
            5. |     2          1   05/01/23      1      .        1 |
            6. |     1          1   06/01/23      1      3        1 |
               |----------------------------------------------------|
            7. |     2          2   01/01/23      .      .        0 |
            8. |     1          2   02/01/23      1      .        1 |
            9. |     1          2   04/01/23      .      .        0 |
           10. |     2          2   04/01/23      1      .        1 |
           11. |     1          2   05/01/23      1      3        1 |
               |----------------------------------------------------|
           12. |     1          3   02/01/23      1      .        1 |
           13. |     2          3   03/01/23      1      .        1 |
           14. |     1          3   04/01/23      1      2        1 |
           15. |     2          3   05/01/23      .      .        0 |
               |----------------------------------------------------|
           16. |     1          4   01/01/23      1      .        1 |
           17. |     2          4   02/01/23      1      .        1 |
           18. |     2          4   03/01/23      .      .        0 |
           19. |     1          4   04/01/23      1      3        1 |
           20. |     2          4   04/01/23      .      .        0 |
           21. |     2          4   06/01/23      1      .        1 |
           22. |     1          4   07/01/23      1      3        1 |
           23. |     2          4   08/01/23      1      .        1 |
           24. |     1          4   09/01/23      1      2        1 |
               +----------------------------------------------------+
          
          .
          Last edited by Andrew Musau; 10 Mar 2023, 17:04.

          Comment

          Working...
          X