Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • For a unique ID, how to identify if any value of column 1 matches a value in column 2?

    I have a dataset which looks like this -

    Unique_ID Years BirthYear
    1. 1970. 1972
    1 1971. 1972
    1 1972. 1973
    1 1973. 1972
    1 1974. 1973
    2. 1970. 1971
    2 1971. 1971
    2 1972. 1971
    2 1973. 1974
    2 1974. 1971

    Here, the unique ID corresponds to a unique mother and years corrospond to 1970-1974 for every unique mother's ID. Birthyear is the year when that mother has given a birth. I achieved this dataset by using expand function on an existing dataset, so the birthyear got repeated.

    Now I want to create a variable, Value which looks like this

    Unique_ID Years BirthYear Value
    1. 1970. 1972 0
    1 1971. 1972 0
    1 1972. 1973 1
    1 1973. 1972 1
    1 1974. 1973 0
    2. 1970. 1971 0
    2 1971. 1971 1
    2 1972. 1971 0
    2 1973. 1974 0
    2 1974. 1971 1


    However, I am unable to create this because the formula, gen Value = 1 if Years == Birthyear does not work as BirthYear can be in a row different than the Year.
    So, I want to find a way which can first of all identify the Unique_ID 1 and then within that can identify BirthYear value and match it to all the values applicable in Years to be able to create the Value column as above.

    Thank you so much in advance for your help.

  • #2
    There may be another more direct way... I would approach this through merging:

    Code:
    clear
    input Unique_ID Years BirthYear
    1 1970 1972
    1 1971 1972
    1 1972 1973
    1 1973 1972
    1 1974 1973
    2 1970 1971
    2 1971 1971
    2 1972 1971
    2 1973 1974
    2 1974 1971
    end
    
    preserve
    keep Unique_ID BirthYear
    rename BirthYear Years
    gen wanted = 1
    collapse (mean) wanted, by(Unique_ID Years)
    save temp_file, replace
    restore
    
    merge m:1 Unique_ID Years using temp_file, nogen
    replace wanted = 0 if missing(wanted)
    list
    Results:
    Code:
         +--------------------------------------+
         | Unique~D   Years   BirthY~r   wanted |
         |--------------------------------------|
      1. |        1    1970       1972        0 |
      2. |        1    1971       1972        0 |
      3. |        1    1972       1973        1 |
      4. |        1    1973       1972        1 |
      5. |        1    1974       1973        0 |
         |--------------------------------------|
      6. |        2    1970       1971        0 |
      7. |        2    1971       1971        1 |
      8. |        2    1972       1971        0 |
      9. |        2    1973       1974        0 |
     10. |        2    1974       1971        1 |
         +--------------------------------------+

    Comment


    • #3
      Here's another way to do it in place. I use rangestat from SSC.

      Code:
      clear 
      input Unique_ID Years BirthYear
      1. 1970. 1972
      1 1971. 1972
      1 1972. 1973
      1 1973. 1972
      1 1974. 1973
      2. 1970. 1971
      2 1971. 1971
      2 1972. 1971
      2 1973. 1974
      2 1974. 1971
      end 
      
      rangestat (count) count=Years, by(Unique_ID) int(BirthYear Years Years)
      
      gen wanted = count < . 
      
      list 
      
           +----------------------------------------------+
           | Unique~D   Years   BirthY~r   count   wanted |
           |----------------------------------------------|
        1. |        1    1970       1972       .        0 |
        2. |        1    1971       1972       .        0 |
        3. |        1    1972       1973       3        1 |
        4. |        1    1973       1972       2        1 |
        5. |        1    1974       1973       .        0 |
           |----------------------------------------------|
        6. |        2    1970       1971       .        0 |
        7. |        2    1971       1971       4        1 |
        8. |        2    1972       1971       .        0 |
        9. |        2    1973       1974       .        0 |
       10. |        2    1974       1971       1        1 |
           +----------------------------------------------+

      Comment

      Working...
      X