Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • search if a value is present in table

    Hello all,

    How can I search if a particular value is present in a table.
    if the value is present in the table , create a new column "result" and make that row as "1"

    Example:

    TABLE
    Column1 Column2 Column3 Column4
    ZA112234 CA52167 PK09146 VF90103
    ZA114234 CA52564 K091743 VF90115
    ZA115432 CA52734 JI090803 CA52012
    ZA116444 CA52923 KL02134 CA52173
    ZA117642 CA52064 SD13463 CA52212
    ZA118543 CA52231 IU46366 VF90122
    ZA119654 CA52341 GHJ3454 ZA11199
    HK233543 LK23426 LK34534 ZA11275
    ZA111298 CA52086 UJ54352 KL678925
    ZA112897 CA52112 PK07762 ZA11375
    ZA112777 CA52175 PK09123 VF90139
    ZA119057 CA52187 PK00264 ZA11455
    ZK012397 GF32431 PK09132 VF90149
    ZA112532 CA52999 PK09431 VF90150
    KK110998 BB52563 JJ09567 FFF9016E























    In this above table Need to search for "LK23426", If we find that value in the table then mark it as "1" in the result column.

    Column1 Column2 Column3 Column4 result
    ZA112234 CA52167 PK09146 VF90103 0
    ZA114234 CA52564 K091743 VF90115 0
    ZA115432 CA52734 JI090803 CA52012 0
    ZA116444 CA52923 KL02134 CA52173 0
    ZA117642 CA52064 SD13463 CA52212 0
    ZA118543 CA52231 IU46366 VF90122 0
    ZA119654 CA52341 GHJ3454 ZA11199 0
    HK233543 LK23426 LK34534 ZA11275 1
    ZA111298 CA52086 UJ54352 KL678925 0
    ZA112897 CA52112 PK07762 ZA11375 0
    ZA112777 CA52175 PK09123 VF90139 0
    ZA119057 CA52187 PK00264 ZA11455 0
    ZK012397 GF32431 PK09132 VF90149 0
    ZA112532 CA52999 PK09431 VF90150 0
























    In general can we do the same, but for more number values i.e to search for more values like "LK23426", "UJ54352", "ZA119057". So that the result table should look like something like this.
    Column1 Column2 Column3 Column4 result
    ZA112234 CA52167 PK09146 VF90103 0
    ZA114234 CA52564 K091743 VF90115 0
    ZA115432 CA52734 JI090803 CA52012 0
    ZA116444 CA52923 KL02134 CA52173 0
    ZA117642 CA52064 SD13463 CA52212 0
    ZA118543 CA52231 IU46366 VF90122 0
    ZA119654 CA52341 GHJ3454 ZA11199 0
    HK233543 LK23426 LK34534 ZA11275 1
    ZA111298 CA52086 UJ54352 KL678925 1
    ZA112897 CA52112 PK07762 ZA11375 0
    ZA112777 CA52175 PK09123 VF90139 0
    ZA119057 CA52187 PK00264 ZA11455 1
    ZK012397 GF32431 PK09132 VF90149 0
    ZA112532 CA52999 PK09431 VF90150 0
























    Any help would be greatly appreciated,
    Thanks in advance!!!





  • #2
    One approach

    Code:
    clear
    input str8 column1 str7 column2 str8(column3 column4)
    "ZA112234" "CA52167" "PK09146"  "VF90103"
    "ZA114234" "CA52564" "K091743"  "VF90115"
    "ZA115432" "CA52734" "JI090803" "CA52012"
    "ZA116444" "CA52923" "KL02134"  "CA52173"
    "ZA117642" "CA52064" "SD13463"  "CA52212"
    "ZA118543" "CA52231" "IU46366"  "VF90122"
    "ZA119654" "CA52341" "GHJ3454"  "ZA11199"
    "HK233543" "LK23426" "LK34534"  "ZA11275"
    "ZA111298" "CA52086" "UJ54352"  "KL678925"
    "ZA112897" "CA52112" "PK07762"  "ZA11375"
    "ZA112777" "CA52175" "PK09123"  "VF90139"
    "ZA119057" "CA52187" "PK00264"  "ZA11455"
    "ZK012397" "GF32431" "PK09132"  "VF90149"
    "ZA112532" "CA52999" "PK09431"  "VF90150"
    end
    
    gen result = 0
    
    ds c*
    foreach x in `r(varlist)' {
        replace result = 1 if ustrregexm(trim(upper(`x')),"ZA119057|LK23426|UJ54352")
    }

    Comment


    • #3
      Perhaps the problem you showed us was a smaller version of a much larger problem, with dozens of values to try to match. In that case, the following approach may be helpful.
      Code:
      clear
      input str8 column1 str7 column2 str8(column3 column4)
      "ZA112234" "CA52167" "PK09146"  "VF90103"
      "ZA114234" "CA52564" "K091743"  "VF90115"
      "ZA115432" "CA52734" "JI090803" "CA52012"
      "ZA116444" "CA52923" "KL02134"  "CA52173"
      "ZA117642" "CA52064" "SD13463"  "CA52212"
      "ZA118543" "CA52231" "IU46366"  "VF90122"
      "ZA119654" "CA52341" "GHJ3454"  "ZA11199"
      "HK233543" "LK23426" "LK34534"  "ZA11275"
      "ZA111298" "CA52086" "UJ54352"  "KL678925"
      "ZA112897" "CA52112" "PK07762"  "ZA11375"
      "ZA112777" "CA52175" "PK09123"  "VF90139"
      "ZA119057" "CA52187" "PK00264"  "ZA11455"
      "ZK012397" "GF32431" "PK09132"  "VF90149"
      "ZA112532" "CA52999" "PK09431"  "VF90150"
      end
      save table, replace
      
      clear
      input str8 column
      "ZA119057"
      "LK23426"
      "UJ54352"
      end
      save compare, replace
      
      use table, clear
      generate long seq = _n
      reshape long column, i(seq) j(cno)
      merge m:1 column using compare, keep(master match)
      by seq, sort: egen result = max(_merge==3)
      drop _merge
      reshape wide column, i(seq) j(cno)
      drop seq
      list, clean noobs
      Code:
      . use table, clear
      
      . generate long seq = _n
      
      . reshape long column, i(seq) j(cno)
      (j = 1 2 3 4)
      
      Data                               Wide   ->   Long
      -----------------------------------------------------------------------------
      Number of observations               14   ->   56          
      Number of variables                   5   ->   3           
      j variable (4 values)                     ->   cno
      xij variables:
                  column1 column2 ... column4   ->   column
      -----------------------------------------------------------------------------
      
      . merge m:1 column using compare, keep(master match)
      
          Result                      Number of obs
          -----------------------------------------
          Not matched                            53
              from master                        53  (_merge==1)
              from using                          0  (_merge==2)
      
          Matched                                 3  (_merge==3)
          -----------------------------------------
      
      . by seq, sort: egen result = max(_merge==3)
      
      . drop _merge
      
      . reshape wide column, i(seq) j(cno)
      (j = 1 2 3 4)
      
      Data                               Long   ->   Wide
      -----------------------------------------------------------------------------
      Number of observations               56   ->   14          
      Number of variables                   4   ->   6           
      j variable (4 values)               cno   ->   (dropped)
      xij variables:
                                       column   ->   column1 column2 ... column4
      -----------------------------------------------------------------------------
      
      . drop seq
      
      . list, clean noobs
      
           column1   column2    column3    column4   result  
          ZA112234   CA52167    PK09146    VF90103        0  
          ZA114234   CA52564    K091743    VF90115        0  
          ZA115432   CA52734   JI090803    CA52012        0  
          ZA116444   CA52923    KL02134    CA52173        0  
          ZA117642   CA52064    SD13463    CA52212        0  
          ZA118543   CA52231    IU46366    VF90122        0  
          ZA119654   CA52341    GHJ3454    ZA11199        0  
          HK233543   LK23426    LK34534    ZA11275        1  
          ZA111298   CA52086    UJ54352   KL678925        1  
          ZA112897   CA52112    PK07762    ZA11375        0  
          ZA112777   CA52175    PK09123    VF90139        0  
          ZA119057   CA52187    PK00264    ZA11455        1  
          ZK012397   GF32431    PK09132    VF90149        0  
          ZA112532   CA52999    PK09431    VF90150        0  
      
      .

      Comment


      • #4
        Thank you so much @Justin Niakamal and @William Lisowski. That was really very helpful.

        Comment

        Working...
        X