  • Identifying and marking change in a string variable.

    Hi,

    I have a string variable "hh_id". I am trying to identify where the value of this variable changes from one observation to the next by generating a new variable.

    I used the following syntax:

    Code:
    egen tag = tag(hh_id)
    While this correctly marks the change in the observations, it marks it for only one quarter. "tag" is incompatible with "by".
    The data is given below:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 hh_id str2(qtr month) byte tag
    "ABC" "Q1" "M3" 1
    "ABC" "Q1" "M1" 0
    "ABC" "Q1" "M2" 0
    "ACD" "Q1" "M1" 1
    "ACD" "Q1" "M2" 0
    "ACD" "Q1" "M3" 0
    "ADF" "Q1" "M1" 1
    "XYZ" "Q1" "M1" 1
    "XYZ" "Q1" "M2" 0
    "ABC" "Q2" "M1" 0
    "ABC" "Q2" "M3" 0
    "ABC" "Q2" "M2" 0
    "ACD" "Q2" "M3" 0
    "ACD" "Q2" "M1" 0
    "ACD" "Q2" "M2" 0
    "ADF" "Q2" "M1" 0
    "XYZ" "Q2" "M1" 0
    "XYZ" "Q2" "M2" 0
    end
    Any help will be greatly appreciated.

    Thank you.

  • #2
    egen, tag() is official code, although first written by me. It doesn't need to be compatible with by: because any variable you would otherwise put after by: can be added to its argument instead. For example, egen tag = tag(x) can be changed to egen tag = tag(x y) if the groups in question are defined by x and y jointly.
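
    As a minimal sketch of that suggestion applied to the example data, assuming (this is not confirmed in the original post) that the aim is to mark one observation per household within each quarter: the name tag_q is hypothetical, and a new name is needed because the example data already contain a variable called tag.

    ```stata
    * Assumption: one tagged observation is wanted per hh_id within each qtr.
    * tag_q is a hypothetical variable name, not from the original post.
    egen tag_q = tag(hh_id qtr)

    * Inspect the result household by household
    list hh_id qtr month tag_q, sepby(hh_id)
    ```

    Here tag() marks exactly one observation in each distinct combination of hh_id and qtr with 1 and all the others with 0, which is what by qtr: would have been intended to achieve.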

    That said, I can't grasp all of what you are doing, partly because holding date information in two string variables is doomed, if not to failure, then to difficulties in coding. With your data, I ran the code below and got the results that follow.

    Can you explain what you want now, or what I misunderstood? Once the dates have been tamed, the only string variable left is the identifier.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 hh_id str2(qtr month) byte tag
    "ABC" "Q1" "M3" 1
    "ABC" "Q1" "M1" 0
    "ABC" "Q1" "M2" 0
    "ACD" "Q1" "M1" 1
    "ACD" "Q1" "M2" 0
    "ACD" "Q1" "M3" 0
    "ADF" "Q1" "M1" 1
    "XYZ" "Q1" "M1" 1
    "XYZ" "Q1" "M2" 0
    "ABC" "Q2" "M1" 0
    "ABC" "Q2" "M3" 0
    "ABC" "Q2" "M2" 0
    "ACD" "Q2" "M3" 0
    "ACD" "Q2" "M1" 0
    "ACD" "Q2" "M2" 0
    "ADF" "Q2" "M1" 0
    "XYZ" "Q2" "M1" 0
    "XYZ" "Q2" "M2" 0
    end
    
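    * Strip the Q and M prefixes so that qtr and month become numeric
    destring qtr month, ignore(Q M) replace

    * Combine quarter and month within quarter into one running month number
    gen santosh_date = 3 * (qtr - 1) + month

    * Check the mapping from qtr and month to santosh_date
    tabdisp santosh_date, c(qtr month)

    sort hh_id santosh_date

    list hh_id santosh_date, sepby(hh_id)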
    Results:

    Code:
    . tabdisp santosh_date, c(qtr month)  
    
    ----------------------------------
    santosh_d |
    ate       |        qtr       month
    ----------+-----------------------
            1 |          1           1
            2 |          1           2
            3 |          1           3
            4 |          2           1
            5 |          2           2
            6 |          2           3
    ----------------------------------
    
    . 
    . sort hh_id santosh_date 
    
    . 
    . list hh_id santosh_date, sepby(hh_id)
    
         +------------------+
         | hh_id   santos~e |
         |------------------|
      1. |   ABC          1 |
      2. |   ABC          2 |
      3. |   ABC          3 |
      4. |   ABC          4 |
      5. |   ABC          5 |
      6. |   ABC          6 |
         |------------------|
      7. |   ACD          1 |
      8. |   ACD          2 |
      9. |   ACD          3 |
     10. |   ACD          4 |
     11. |   ACD          5 |
     12. |   ACD          6 |
         |------------------|
     13. |   ADF          1 |
     14. |   ADF          4 |
         |------------------|
     15. |   XYZ          1 |
     16. |   XYZ          2 |
     17. |   XYZ          4 |
     18. |   XYZ          5 |
         +------------------+
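
    Once the data are in this form, marking where hh_id changes can also be sketched with by: and _n. This is only one possible reading of the question, and the names first_obs and first_in_qtr are hypothetical. It assumes the code above has already been run, so that qtr and month are numeric and santosh_date exists.

    ```stata
    * Assumption: "change" means the first observation of each household
    * in time order. first_obs and first_in_qtr are hypothetical names.
    bysort hh_id (santosh_date): gen byte first_obs = _n == 1

    * Variant: first observation of each household within each quarter
    bysort hh_id qtr (month): gen byte first_in_qtr = _n == 1

    list hh_id qtr month first_obs first_in_qtr, sepby(hh_id)
    ```

    Unlike egen's tag(), this approach lets the sort order within groups decide which observation is marked, so the earliest month is the one flagged.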
