  • replace string variables with missing if they don't contain certain strings

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float id str3(op1 op2 op3 op4 op5)
    45 "f12" "f12" "f14" "f15" "f16"
    54 "f12" "g23" "g32" "g21" "g11"
    76 "f12" "g21" "g32" "g91" "f99"
    end
    I want to replace op1-op5 with missing (empty strings) if they don't contain "f12", "g32", or "g11".
    The following doesn't work. Please help.

    foreach x in 1 2 3 4 5 {
    replace op`x'="" if !strpos(op`x', "f12") | !strpos(op`x', "g32")
    }
    Last edited by Nishan Lamichhane; 01 Oct 2023, 04:29.

  • #2
    This may help.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float id str3(op1 op2 op3 op4 op5)
    45 "f12" "f12" "f14" "f15" "f16"
    54 "f12" "g23" "g32" "g21" "g11"
    76 "f12" "g21" "g32" "g91" "f99"
    end
    
    forval j = 1/5 { 
        replace op`j' = "" if !inlist(op`j', "f12", "g32", "g11")
    }
    
    list 
    
         +----------------------------------+
         | id   op1   op2   op3   op4   op5 |
         |----------------------------------|
      1. | 45   f12   f12                   |
      2. | 54   f12         g32         g11 |
      3. | 76   f12         g32             |
         +----------------------------------+
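
    A side note on why the loop in #1 blanks everything: the condition "does not contain f12 OR does not contain g32" is true for every value here, because no three-character string contains both. The sketch below keeps strpos() but joins the tests with & so that a value is blanked only when none of the three strings is found. It assumes the same example data as above. Note that strpos() tests containment while inlist() tests equality; with str3 values and three-character targets the two agree.

    ```stata
    * Sketch: strpos() variant of the fix, using the example data from #1
    clear
    input float id str3(op1 op2 op3 op4 op5)
    45 "f12" "f12" "f14" "f15" "f16"
    54 "f12" "g23" "g32" "g21" "g11"
    76 "f12" "g21" "g32" "g91" "f99"
    end

    foreach v of varlist op1-op5 {
        * strpos() returns 0 when the substring is absent, so !strpos() is true then;
        * blank the value only if all three substrings are absent
        replace `v' = "" if !strpos(`v', "f12") & !strpos(`v', "g32") & !strpos(`v', "g11")
    }

    list
    ```

    If the real variables are longer strings in which "f12" and the others may appear as substrings, the strpos() form and the inlist() form will differ, so choose according to whether containment or exact match is wanted.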


    • #3
      Thank you Nick. It worked perfectly for me.

      Now I want to create a new variable, op_final, that holds the first non-empty value among op1-op5 in each row (individual/id). For example:

      Code:
      gen str3 op_final = ""
      forval k = 1/5 {
          * pseudocode: replace op_final with the first non-empty value of op1-op5
      }

      I hope my query is understandable.



      • #4
        Code:
        gen firstnm = op1 
        
        forval j = 2/5 { 
            replace firstnm = op`j' if missing(firstnm)
        }
        Once firstnm is populated with a non-missing value, further change is impossible. Replacing missings by missings is naturally not a problem, although some might prefer more long-winded code with

        Code:
        replace firstnm = op`j' if missing(firstnm) & !missing(op`j')
        Note also that you can initialize safely to the first variable.
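
        As a quick check of the loop above, here is a minimal sketch on hypothetical data (the ids and values below are invented for illustration and are not the data from #1) in which some leading variables are already empty, including one row that is empty throughout.

        ```stata
        * Hypothetical data: empty strings mark values already set to missing
        clear
        input float id str3(op1 op2 op3 op4 op5)
        1 ""    "f12" ""    ""    "g11"
        2 ""    ""    "g32" ""    ""
        3 "g11" ""    ""    ""    ""
        4 ""    ""    ""    ""    ""
        end

        * Initialize to the first variable, then fill from left to right
        gen firstnm = op1

        forval j = 2/5 {
            replace firstnm = op`j' if missing(firstnm)
        }

        * Each row should now hold its first non-empty value;
        * a row with no non-empty value stays empty
        assert firstnm == "f12" if id == 1
        assert firstnm == "g32" if id == 2
        assert firstnm == "g11" if id == 3
        assert missing(firstnm) if id == 4

        list
        ```

        The assert lines stop with an error if any row does not match, so a silent run suggests the loop behaves as described.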

        There is much more detailed advice within https://journals.sagepub.com/doi/pdf...867X0900900107 (note also the sequel at https://journals.sagepub.com/doi/pdf...36867X20931007)




