
  • deleting variables with the same values

    Hello everyone,

    I would like to delete all variables in a dataset for which every observation has the same value (-94).

    My approach was the following:

    Code:
    local abc *all variables of the dataset*
    
    tostring `abc', replace force
    
    foreach var of local abc {
    replace `abc' = "" if `abc' == "-94"
    }
    Here I get a type mismatch error!

    Otherwise I would have continued with the dropmiss command.

    Some information about the dataset: it has 1900 variables but only 106 observations. The variables are a mix of string and numeric, hence I ran the tostring command at the beginning.

    My question:
    What caused the type mismatch and how can I prevent it from happening?
    And do you know another way to delete a large number of variables that have no missing values but (as I mentioned) always hold the same value?

    Thank you for your time!

  • #2
    You say that your variables are both string and numeric, but you are applying the -tostring- command to each variable, and it can't be applied to a string. That makes me think that the syntax you show is not exactly what you used because that problem would have led to an error message. However, -tostring-, when applied to a string, gives a different message than "type mismatch." The latter error message would come when you tried to assign "-94" to some numeric variable. This would indicate that some of your variables were still numeric when you tried the -foreach- loop. To detect this problem, try the following right before your -foreach- loop:
    Code:
    ds `abc', has(type numeric)
    This will reveal which variables contradict your assumption that all your variables have been converted to string.

    All this being said: Your attempt to check if `abc' == "-94" may not work because of embedded blanks. You should look at -help strtrim- and -help strpos-, which provide ways to eliminate leading and trailing blanks or to search for a string without checking for exact equality.
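
    As a rough sketch of that idea, assuming the local macro abc from #1 still holds the variable names and that the code is run in one go from a do-file: the loop below touches only the variables that really are string and compares after trimming blanks. The macro name strvars is made up for this illustration, and strtrim() needs Stata 14 or later (older versions call it trim()).

    ```stata
    * Sketch only: restrict the loop to variables that are actually string
    ds `abc', has(type string)
    local strvars `r(varlist)'

    foreach v of local strvars {
        * exact match after removing leading and trailing blanks
        replace `v' = "" if strtrim(`v') == "-94"

        * looser alternative: count values containing "-94" anywhere
        * count if strpos(`v', "-94") > 0
    }
    ```

    Note that strpos() also matches longer values such as "-940", so the trimmed equality test is the safer of the two when -94 is a code for a specific meaning.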


    • #3
      This was also posted on Stack Overflow and answered there. Please note our policy on cross-posting, which is that you are asked to tell us about it.


      • #4
        The SO thread is https://stackoverflow.com/questions/...l-observations

        There remains the question of the error in #1. The code can be simplified and corrected as follows:

        Code:
        tostring *, replace force
        foreach v of var * {    
            replace `v' = "" if `v' == "-94"
        }
        However, that isn't recommended at all.

        There is no need to convert all the variables to string.

        As in the rest of life, force is the last resort of the desperate.

        Replacing all instances of the string "-94" with missing won't by itself drop such variables, although you signalled an intent to follow up with the obsolete command dropmiss (which is from the Stata Journal, as you are asked to explain).

        Better ideas, in my opinion, can be found in the SO thread just cited.
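
        One possibility, given here only as a minimal sketch and not necessarily what the SO thread suggests: loop over the variables, count the observations that differ from -94 using a comparison that matches each variable's storage type, and drop the variables for which that count is zero. No conversion to string and no force option is needed. The macro name todrop is made up for this illustration.

        ```stata
        * Sketch only: drop every variable that is -94 in all observations
        local todrop

        foreach v of varlist * {
            capture confirm string variable `v'
            if _rc == 0 {
                * string variable: compare as text, ignoring stray blanks
                quietly count if strtrim(`v') != "-94"
            }
            else {
                * numeric variable: compare as a number
                quietly count if `v' != -94
            }
            * no observation differs from -94, so mark the variable
            if r(N) == 0 {
                local todrop `todrop' `v'
            }
        }

        * drop the marked variables, if there are any
        if "`todrop'" != "" {
            drop `todrop'
        }
        ```

        A numeric missing value counts as different from -94 here, so a variable that mixes -94 with missings is kept.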

        Further notes:

        Mike Lacy in #2 wrote:

        "You say that your variables are both string and numeric, but you are applying the -tostring- command to each variable, and it can't be applied to a string."

        That's not correct. tostring just notes whenever a variable is string already, and passes by, there being nothing to do.

        "That makes me think that the syntax you show is not exactly what you used because that problem would have led to an error message."

        I find Marius' declared syntax plausible (if wrong). The error is that the replace statement in the loop should not start


        Code:
         replace `abc'
        but

        Code:
         replace `var'
        Last edited by Nick Cox; 13 Sep 2019, 10:00.
