  • Empty cells check

    I have a dataset I am trying to clean. It has a number of variables that seem to have no content in most of the dataset. I want to check there is definitely no data there before I delete a variable. I seem to remember reading about a data check using >=1, but cannot remember the command that comes before that. Can someone help me?

  • #2
    egen hasone=max(!missing(var))
    drop if !hasone


    hth,
    Jeph



    • #3
      Thank you. I now need to take some staff group details that are stored as strings and turn them into categorical variables. For example, there are six possible staff groups, so the variable 'staffgroup' has string values such as A&C 2, A&C 3, NP 5, NP 6, NP 7 etc. How do I make Stata realise they are already categorised and not true strings? I then want to plot staff group against processing time (another variable, 'processingtime').



      • #4
        Sorry, but I don't understand #3 at all. If no one else does, then you may need to provide a real example.

        On the general question of checking whether variables are all missing, this can be done with findname (SJ).

        Code:
        clear 
        set obs 10
        forval j = 1/20 {
             gen x`j' = `j'
        }
        
        replace x13 = .
        
        findname , all(missing(@))
        x13
        The last line above is a result, not a command to be typed.



        • #5
          Jeph's suggestion in #2 has the problem that it drops observations, not variables. Here is a suggestion that uses first principles and official commands only. After tabulate, r(N) is the number of nonmissing values of the last variable tabulated; this also holds for string variables.

          Code:
          foreach V of varlist _all {
           quietly tabulate `V'
           if `r(N)' == 0 drop `V'
          }
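
          For anyone who wants to look before dropping, here is a minimal sketch of a report-only variant. It uses count with missing(), which accepts both numeric and string variables. The local macro name v is arbitrary, and nothing is dropped.

          Code:
          foreach v of varlist _all {
              * count observations with a nonmissing value in this variable
              quietly count if !missing(`v')
              * report only; the variable is left in the dataset
              if r(N) == 0 display "`v' has no nonmissing values"
          }
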
          As to Jocelyn's post #3, I don't understand it either.



          • #6
            Thank you. On #3, let me try again; maybe my poor Stata knowledge is the problem. I think I had not realised that Stata recognises all 'A&C 2' staff as the same group. Now I can see that I can tabulate the variable, so it must know they belong together. But when I try a graph twoway line of processing time versus staff group, it says 'string variable not allowed'. So my question is: how do I get Stata to draw the graph I want?



            • #7
              Oh...this is it! http://www.ats.ucla.edu/stat/stata/faq/destring.htm
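
              A note on the linked FAQ: destring is meant for string variables that hold numbers (such as "12"), so labels like 'A&C 2' would not convert directly. For text categories like those in #3, encode is the usual official command: it creates a numeric variable with the original strings attached as value labels. A minimal sketch, assuming the variable names staffgroup and processingtime from #3; staffgroup_n is a hypothetical name for the new variable, and the 1/6 range assumes the six groups mentioned there.

              Code:
              * create a labelled numeric version of the string variable
              encode staffgroup, generate(staffgroup_n)
              * show how the integer codes map to the original strings
              label list staffgroup_n
              * box plot of processing time for each staff group
              graph box processingtime, over(staffgroup_n)
              * or a scatter plot with the group labels on the x axis
              twoway scatter processingtime staffgroup_n, xlabel(1/6, valuelabel)

              Since staff group is a category rather than a continuous measure, a box plot or scatter plot is likely to be easier to read than twoway line, which would join points across unrelated groups.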
