  • Replace based on conditions in following observations

    Hi all,
    This feels like a problem of either logic or code, and probably both.
    I'm trying to replace a generated variable (Sal_1) by conditionally selecting based in part on data from other observations.

    Here's the data from dataex. It includes the variable Sal_1 as I've filled it in so far.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte Crtria1 float(Crtria2 Var3) long Salary float Sal_1
    1  0  0 59945 59945
    1 12  0 62745     .
    1 24  0 69095     .
    2  0  7 60445 60445
    2 12  7 63236     .
    2 24  7 69595     .
    3  0 14 60945 60945
    3 12 14 63745 63745
    3 24 14 70095     .
    4  0 21 61445 61445
    4 12 21 64245 64245
    4 24 21 70595     .
    5  0 28 62695 62695
    5 12 28 64745 64745
    5 24 28 71095 71095
    end
    Here's the code I've used thus far to generate Sal_1:

    Code:
    gen Sal_1 = .
    bysort Crtria1 (Crtria2) : replace Sal_1 = (Salary) if (Var3 >= Crtria2)
    The salaries that I'm trying to pull into Sal_1 are the salaries that meet the following conditions:

    With the data sorted:
    -- by Crtria1 and then by Crtria2
    Within each group of Crtria1
    -- if Var3 is greater than or equal to Crtria2 of this same observation
    -- AND
    -- if Var3 is less than Crtria2 of the next observation

    The part I cannot figure how to write is: "if Var3 is less than Crtria2 of the next observation"

    So, for example, look at observation 7:
    (This is NOT actually code, it's my attempt to create a table showing results.)
    Code:
           Crtria1   Crtria2   Var3   Salary   Sal_1
    7. |         3         0     14   60,945   60945
    8. |         3        12     14   63,745   63745
    The goal is for Sal_1 to be "." in observation 7, because observation 7's Var3 (value of 14) is NOT less than observation 8's Crtria2 (value of 12).

    Likewise, observations 10, 13 and 14 should have no Sal_1.

    Lastly, I'm trying to do this over several hundred datafiles (using a foreach loop) and the values of each variable vary by datafile.

    So, ... what is a method for replacing based on a following observation?

    Thank you!
    Last edited by James Voss; 27 Sep 2022, 11:40.

  • #2
    James:
    welcome to this forum.
    I'm not sure I got you right, so please consider what follows as a tentative (and hopefully useful) reply in which I kept your -Sal_1- and added a new variable named -Sal_2-:
    Code:
    . gen Sal_2 = .
    (15 missing values generated)
    
    . bysort Crtria1 (Crtria2) : replace Sal_2 = (Salary) if (Var3 >= Crtria2) & (Var3 < Crtria2[_n+1])
    (5 real changes made)
    
    . list
    
         +---------------------------------------------------+
         | Crtria1   Crtria2   Var3   Salary   Sal_1   Sal_2 |
         |---------------------------------------------------|
      1. |       1         0      0    59945   59945   59945 |
      2. |       1        12      0    62745       .       . |
      3. |       1        24      0    69095       .       . |
      4. |       2         0      7    60445   60445   60445 |
      5. |       2        12      7    63236       .       . |
         |---------------------------------------------------|
      6. |       2        24      7    69595       .       . |
      7. |       3         0     14    60945   60945       . |
      8. |       3        12     14    63745   63745   63745 |
      9. |       3        24     14    70095       .       . |
     10. |       4         0     21    61445   61445       . |
         |---------------------------------------------------|
     11. |       4        12     21    64245   64245   64245 |
     12. |       4        24     21    70595       .       . |
     13. |       5         0     28    62695   62695       . |
     14. |       5        12     28    64745   64745       . |
     15. |       5        24     28    71095   71095   71095 |
         +---------------------------------------------------+
    
    .
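    One detail worth spelling out: under -by:-, Crtria2[_n+1] is missing for the last observation of each group, and Stata treats numeric missing as larger than any number, so (Var3 < Crtria2[_n+1]) is true there. That is why observation 15 is filled in. A minimal sketch that makes the last-row case explicit, assuming the data above are in memory and -Sal_2- already exists (-Sal_3- is a hypothetical name used only for this illustration):
    Code:
    * Same rule as Sal_2, with the "no next observation" case written out
    gen Sal_3 = .
    bysort Crtria1 (Crtria2) : replace Sal_3 = Salary if (Var3 >= Crtria2) & (_n == _N | Var3 < Crtria2[_n+1])

    * On the example data the two versions agree (missing == missing is true in Stata)
    assert Sal_3 == Sal_2
    The explicit form may be easier to read later, and it does not depend on how missing values sort.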
    Kind regards,
    Carlo
    (Stata 19.0)
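
    As for running this over several hundred datafiles, a rough sketch follows. It assumes that every .dta file sits in the current working directory and contains Crtria1, Crtria2, Var3 and Salary; the "_sal" suffix for the output files is hypothetical, so adjust it to your own layout.
    Code:
    * Sketch only: collect the .dta files in the current directory
    local files : dir "." files "*.dta"
    foreach f of local files {
        use "`f'", clear
        * start from a clean Sal_1 in case the file already has one
        capture drop Sal_1
        gen Sal_1 = .
        bysort Crtria1 (Crtria2) : replace Sal_1 = Salary if (Var3 >= Crtria2) & (Var3 < Crtria2[_n+1])
        * hypothetical naming: write the result next to the original file
        local stub = subinstr("`f'", ".dta", "", .)
        save "`stub'_sal.dta", replace
    }
    Note that if you run the loop a second time from the same directory, the "_sal" files also match "*.dta" and would be processed again, so saving to a separate folder may be safer.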


    • #3
      Yes. That looks exactly like a solution. And now I need to look up the syntax for var[_n+x].
      Thank you, Carlo, very much.
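
      For anyone else looking up var[_n+x]: this is what Stata calls explicit subscripting (-help subscripting- should lead to it). _n is the current observation number and _N the last one; under -by:- both are counted within the group, and a subscript that falls outside the group returns missing. A small sketch on the data above, where -next_C2- and -prev_C2- are hypothetical names used only for illustration:
      Code:
      * value of Crtria2 in the following and the preceding observation, within Crtria1
      bysort Crtria1 (Crtria2) : gen next_C2 = Crtria2[_n+1]
      bysort Crtria1 (Crtria2) : gen prev_C2 = Crtria2[_n-1]
      list Crtria1 Crtria2 next_C2 prev_C2, sepby(Crtria1)
      For the first group (Crtria2 = 0, 12, 24), next_C2 comes out as 12, 24, . and prev_C2 as ., 0, 12.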
