
  • Dropping Multiple variables at once if empty does not work

    Hi, I am using Stata 14, and I'm trying to drop observations (rows) in which some date variables are empty. So I used:

    drop if fy1beginningdate fy2beginningdate fy3beginningdate fy4beginningdate=="", but got the error "type mismatch" r(109)

    They are all of type str10.

    However it worked fine when I tried:
    drop if fy1beginningdate==""
    drop if fy2beginningdate==""
    drop if fy3beginningdate==""
    drop if fy4beginningdate==""

    I would still like to find out why the first one didn't work; perhaps I am getting the syntax wrong.

  • #2
    Hi,
    can you try:
    drop if fy1beginningdate == .



    • #3
      The reason why your first command did not work is simply that you did not combine the four if conditions with a logical operator. Correct syntax would be (one of many alternatives):

      Code:
      drop if fy1beginningdate == "" | fy2beginningdate == "" | fy3beginningdate == "" | fy4beginningdate==""
      which means that the observation is deleted if any of the string variables is empty. You could also have a look at the egen function rowmiss(). Type help egen.
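
      Another of the "many alternatives" is the missing() function, which returns 1 if any of its arguments is missing and treats "" as missing for strings. The sketch below is only illustrative: the toy data, the firm variable and the date values are assumptions, and only two of the four date variables are used to keep it short.

      ```stata
      * Toy data; firm and the date values are made up for illustration.
      clear
      input str4 firm str10 fy1beginningdate str10 fy2beginningdate
      "A" "2012-01-01" "2013-01-01"
      "B" ""           "2013-01-01"
      "C" "2012-01-01" ""
      "D" ""           ""
      end

      * missing() is true if ANY argument is missing, so this is
      * equivalent to joining the == "" conditions with |
      drop if missing(fy1beginningdate, fy2beginningdate)

      * Show how many observations remain
      count
      ```

      If observations should go only when all the variables are empty, the individual == "" conditions would be joined with & instead of |.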



      • #4
        Indeed; the syntax is quite wrong. What you typed does not qualify as an expression that can be evaluated as true or false. The error message is not easy to interpret, but what you typed is a long way from correct syntax and Stata is just puzzled. It is probably stopping at

        Code:
         
        drop if fy1beginningdate
        and seeing a string where it expects a numeric result. It then bails out without trying anything else.

        More crucially, is it enough for any of these variables to be missing or must they all be missing?

        Either way, I would approach the problem like this.

        Code:
         
        egen nmissing = rowmiss(fy1beginningdate fy2beginningdate fy3beginningdate fy4beginningdate)
        which may well be the same in your dataset as

        Code:
         
        egen nmissing = rowmiss(fy?beginningdate)
        and then your condition for dropping can be expressed in terms of that variable.
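
        As a rough sketch of what that condition might look like, here are the two cases from the question above (all missing versus any missing). The three-row toy dataset is an assumption for illustration only; the variable names are those from the original post.

        ```stata
        * Toy data with the four string date variables; values are made up.
        clear
        input str10 fy1beginningdate str10 fy2beginningdate str10 fy3beginningdate str10 fy4beginningdate
        "2012-01-01" "2013-01-01" "2014-01-01" "2015-01-01"
        ""           "2013-01-01" "2014-01-01" "2015-01-01"
        ""           ""           ""           ""
        end

        * Count the missing dates in each observation
        egen nmissing = rowmiss(fy?beginningdate)

        * Drop observations in which ALL four dates are missing ...
        drop if nmissing == 4

        * ... or, instead, those in which ANY date is missing:
        * drop if nmissing > 0
        ```

        Keeping nmissing as a variable also makes it easy to tabulate how many observations fall into each case before dropping anything.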

        Incidentally, it sounds as if you have a wide data structure. At some point a reshape long is likely to be advisable.



        • #5
          Thank you so much for all your comments. It's interesting learning much more precise ways of writing Stata syntax. Thanks Nick, the egen is very precise and saves me a lot of lines
          Last edited by Chiemeka Amadi; 05 Apr 2016, 14:42.



          • #6
            Indeed Nick, my data is in wide format for 4 years. However I am still working with it in wide format, because I have so many variables with different stems that it would take a very long time to select each and every one when using the reshape command. So far, I wasn't able to find something like "variableStem*" to select multiple variables for reshape



            • #7
              There's an @ syntax for your case. Look again at the help for reshape.
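
              A minimal sketch of the @ syntax, assuming a hypothetical identifier variable id and made-up dates: the @ marks where the year index sits inside each variable name, so names such as fy1beginningdate can be reshaped even though the index is not at the end. It does not by itself select many stems at once; each stem still has to be listed.

              ```stata
              * Toy wide data; id and the date values are illustrative assumptions.
              clear
              input byte id str10 fy1beginningdate str10 fy2beginningdate
              1 "2012-01-01" "2013-01-01"
              2 "2012-07-01" ""
              end

              * @ stands for the position of the index within the variable name
              reshape long fy@beginningdate, i(id) j(year)

              * In long form a single condition on one variable is enough
              drop if missing(fybeginningdate)
              list
              ```

              After the reshape the four (here two) date variables become one variable, fybeginningdate, with year identifying which fiscal year each row refers to.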
