Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Drop variables based on observations

    Code:
        +----------------------------------------------------------+
         |    A         B           C           D        Q        U |
         |----------------------------------------------------------|
      1. | Name   AB BANK   BANK ASIA   CITY BANK   #ERROR   #ERROR |
         +----------------------------------------------------------+
    Q1. How do I delete variables which contain "#ERROR"? In this example, the variables are Q and U. In the actual data set, there are many more. So, I want a general solution.

    After deleting the variables, I want to make observation 1 as the variable names using the following code.
    Code:
    foreach var of varlist * {
        rename `var' `=strtoname(`var'[1])'
    }
    
    drop in 1
    Q2. After destringing the BANKs and renaming Name as Date, how do I keep only those variables that have values greater than 110 in the last observation? In this example, "AB BANK" and "CITY BANK" have values greater than 110 in the last observation.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 A str17(B C D) str38(Q U)
    "Name"       "AB BANK"           "BANK ASIA"         "CITY BANK" "#ERROR"                                 "#ERROR"                                
    "  1/1/2014" "100"               "100"               "100"       "$$ER: 2380,NO DATA IN REQUESTED PERIOD" "$$ER: 2380,NO DATA IN REQUESTED PERIOD"
    "  1/2/2014" "99.23"             "97.73"             "100"       ""                                       ""                                      
    "  1/3/2014" "99.23"             "97.73"             "100"       ""                                       ""                                      
    "  1/6/2014" "98.84999999999999" "91.36"             "97.47"     ""                                       ""                                      
    "  1/7/2014" "98.08"             "94.09"             "96.97"     ""                                       ""                                      
    "  1/8/2014" "99.62000000000001" "93.64"             "100.51"    ""                                       ""                                      
    "  1/9/2014" "99.23"             "93.64"             "100.51"    ""                                       ""                                      
    " 1/10/2014" "99.23"             "90.91"             "99.49"     ""                                       ""                                      
    " 1/13/2014" "101.53"            "93.64"             "105.05"    ""                                       ""                                      
    " 1/14/2014" "101.53"            "93.64"             "105.05"    ""                                       ""                                      
    " 1/15/2014" "102.68"            "95.45"             "105.56"    ""                                       ""                                      
    " 1/16/2014" "105.75"            "96.81999999999999" "105.05"    ""                                       ""                                      
    " 1/17/2014" "106.9"             "102.27"            "106.57"    ""                                       ""                                      
    " 1/20/2014" "104.98"            "98.64"             "102.53"    ""                                       ""                                      
    " 1/21/2014" "104.98"            "97.27"             "101.01"    ""                                       ""                                      
    " 1/22/2014" "106.51"            "96.81999999999999" "102.02"    ""                                       ""                                      
    " 1/23/2014" "110.73"            "99.09"             "102.02"    ""                                       ""                                      
    " 1/24/2014" "116.48"            "100.45"            "107.58"    ""                                       ""                                      
    " 1/27/2014" "116.48"            "102.73"            "110.1"     ""                                       ""                                      
    end

  • #2
    This is not tested, but should be close.
    Q1:
    Code:
    foreach v of varlist * {
       count if strpos(`v', "#ERROR")
       if (r(N) > 0) drop `v'
    }
    Q2:
    Code:
    ds *, has(type numeric)
    foreach v of varlist * {
        if (`v'[_N] <= 110) drop `v'
    }

    Comment


    • #3
      Let me add that you do not want to stop at renaming the string variable Name to Date. You want to follow the instructions in the output of help datetime to convert it to a Stata Internal Format daily date value. Building on your code in post #1 and Mike's in post #2, I have the following example code run on your example data. Note that, unlike Mike, I am assuming that the "#ERROR" will occur only in the first observation, based on similar extracts I have seen in other posts.
      Code:
      foreach var of varlist * {
          if `var'[1]=="#ERROR" drop `var'
      }
      
      foreach var of varlist * {
          local var1 = `var'[1]
          if "`var1'" == "Name" {
              generate Date = daily(`var',"MDY") in 2/l
              format Date %td
              order Date
              drop `var'
          }
          else {
              quietly replace `var' = "-999" in 1
              local newvar = strtoname("`var1'")
              destring `var', generate(`newvar')
              drop `var'
          }
      }
      drop in 1
      describe
      list, clean noobs abbreviate(12)
      Code:
      . describe
      
      Contains data
        obs:            19                          
       vars:             4                          
       size:           532                          
      ------------------------------------------------------------------------------------------------
                    storage   display    value
      variable name   type    format     label      variable label
      ------------------------------------------------------------------------------------------------
      Date            float   %td                   
      AB_BANK         double  %10.0g                
      BANK_ASIA       double  %10.0g                
      CITY_BANK       double  %10.0g                
      ------------------------------------------------------------------------------------------------
      Sorted by: 
           Note: Dataset has changed since last saved.
      
      . list, clean noobs abbreviate(12)
      
               Date   AB_BANK   BANK_ASIA   CITY_BANK  
          01jan2014       100         100         100  
          02jan2014     99.23       97.73         100  
          03jan2014     99.23       97.73         100  
          06jan2014     98.85       91.36       97.47  
          07jan2014     98.08       94.09       96.97  
          08jan2014     99.62       93.64      100.51  
          09jan2014     99.23       93.64      100.51  
          10jan2014     99.23       90.91       99.49  
          13jan2014    101.53       93.64      105.05  
          14jan2014    101.53       93.64      105.05  
          15jan2014    102.68       95.45      105.56  
          16jan2014    105.75       96.82      105.05  
          17jan2014     106.9      102.27      106.57  
          20jan2014    104.98       98.64      102.53  
          21jan2014    104.98       97.27      101.01  
          22jan2014    106.51       96.82      102.02  
          23jan2014    110.73       99.09      102.02  
          24jan2014    116.48      100.45      107.58  
          27jan2014    116.48      102.73       110.1  
      
      .

      Comment


      • #4
        Many thanks Mike Lacy and William Lisowski .Your codes are efficacious!

        Comment

        Working...
        X