  • Do file beginner question


    I am trying to improve my understanding of do-file scripting. For disclosure, I program competently in, e.g., R, Python, and C.

    My eventual aim is to reproduce some of my R wrangling scripts. One thing I do in R is programmatically check the types of variables and raise a flag if they aren't as expected.

    So, I have written this code as my first step on the way (I include my baby programmer comments):

    Code:
    * loop over the variables specified
    foreach v of varlist surname-history {
        * test that variable is numeric, don't stop on fail
        capture confirm numeric variable `v'
        * if the return code doesn't indicate success write out it's a string (if !_rc is if _rc is not true)
        if !_rc {
            disp "this is a string"
        }
        * otherwise write out that it is a number
        else {
            disp "This is a number"
        }
    }
    The first few lines of my data are like:
    1. ALI 2 1 52 46 35
    2. BLAKEMORE 2 1 56 38 40
    3. RAMANI 1 3 42 43 40
    4. ROWLANDS 1 2 47 50 48
    5. DRURY 2 2 50 50 49
    I have checked the types and they are as expected - only surname is a string

    I get 'this is a number' six lines in succession as output. No error message. So, correctly six of my variables are identified as numeric, the return code tests and it prints the message. But, although - I assume - it is failing on the first variable, it doesn't display the message.

    My instinct is that I misunderstand _rc, but I can't see what is wrong.

    I would be really grateful for any help!
    Last edited by Jim Tyson; 14 Aug 2020, 06:52.

  • #2
    I think most Stata beginners would be very proud to have written that code. (Sincerely!)

    I don't see a question here, however, but I have a comment. What's annoying and frustrating, but also quite exciting, is that different software often has quite different styles. (I am only an occasional user of R, but I see questions elsewhere on the equivalent in R of some Stata feature where the answer is often that there isn't any equivalent, and no need for one because R has a different way of thinking about the problem altogether.)

    Here your code could be enhanced to print the variable name too, but I would suggest instead using ds, as in

    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . ds, has(type string)
    make
    
    . ds, has(type numeric)
    price         rep78         trunk         length        displacement  foreign
    mpg           headroom      weight        turn          gear_ratio
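
    To act on that result in a do-file rather than just read it, one option is to pick up the variable names that ds leaves behind in r(varlist). This is only a sketch on the auto data, assuming that "make" is the one string variable you expect; the local macro name strvars is made up for illustration.

    Code:
    sysuse auto, clear
    * ds leaves the names of the matching variables in r(varlist)
    quietly ds, has(type string)
    local strvars `r(varlist)'
    display "string variables: `strvars'"
    * stop with an error if the string variables are not exactly those expected
    assert "`strvars'" == "make"
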


    EDIT: I see a question now. I will respond separately.

    Last edited by Nick Cox; 14 Aug 2020, 07:13.



    • #3
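      Here is your data example and edited code showing two ways to do it, all in one. _rc is positive if there was an error and zero otherwise. Logical negation turns any nonzero value into 0 and turns 0 into 1, so if !_rc is true exactly when the captured command succeeded, which is the opposite of what your first branch assumed.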

      Code:
      clear 
      input str9 surname y x math geog history 
          ALI    2    1    52    46    35
          BLAKEMORE    2    1    56    38    40
          RAMANI    1    3    42    43    40
          ROWLANDS    1    2    47    50    48
          DRURY    2    2    50    50    49
      end
          
      * loop over the variables specified
      foreach v of varlist surname-history {
          * test that variable is numeric, don't stop on fail
          capture confirm numeric variable `v'
          * if the return code indicates failure, write out it's a string (_rc is nonzero, i.e. true, on failure)
          if _rc disp "`v' is a string"
          * otherwise write out that it is a number
          else disp "`v' is a number" 
      }
      
      
      * loop over the variables specified
      foreach v of varlist surname-history {
          capture confirm numeric variable `v'
          display "`v' is a " cond(_rc, "string", "number")
      }



      • #4
        Nick will provide a detailed answer. Here is the short form:

        _rc (return code) is set to 0 (which, in Stata, means "false") if there is no error. So your code is indeed flipped the wrong way.
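
        A quick way to see the flip for yourself is to negate the two values involved directly. The 7 below is simply the return code that confirm gives when the type does not match.

        Code:
        * no error: _rc is 0, so !_rc is 1 (true)
        display !0
        * error: _rc is nonzero, so !_rc is 0 (false)
        display !7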

        Aside from that, note that while very useful, capture is also dangerous. capture eats any error messages, including possible typos in your code. For debugging and testing, a better alternative is often

        Code:
        capture noisily
        In a programming context, you might want to make use of the specific values of _rc that are documented in [P] error.
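
        As a rough sketch of what that could look like (not the only way to do it), the loop below treats return code 7 from confirm as "wrong type" and passes any other nonzero code on as a real error. It uses the auto data rather than your variables, purely as an assumption for illustration.

        Code:
        sysuse auto, clear
        foreach v of varlist make price mpg {
            capture confirm numeric variable `v'
            if _rc == 0 {
                display "`v' is numeric"
            }
            else if _rc == 7 {
                * r(7): something other than a numeric variable was found
                display "`v' is a string"
            }
            else {
                * anything else is unexpected: reissue the error and stop
                error _rc
            }
        }

        Replacing capture with capture noisily in the loop would additionally show the message that confirm itself produces.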


        Edit:

        OK, Nick did not provide a lengthy answer but shows how to make the code work. I have one more thing to add:

        In the future, please use dataex to present your dataset because it contains all the relevant information. Here is a minor tweak of Nick's example data that would puzzle even experienced Stata users (well, at least for a second or two). Can you guess what is wrong here:

        Code:
        . list
        
             +-----------+
             |   surname |
             |-----------|
          1. |       ALI |
          2. | BLAKEMORE |
          3. |    RAMANI |
          4. |  ROWLANDS |
          5. |     DRURY |
             +-----------+
        
        . confirm string variable surname
        'surname' found where string variable expected
        r(7);
        The answer is: nothing is wrong, except, perhaps, my expectation. Here is how the example looks using dataex:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input byte surname
        1
        2
        3
        4
        5
        end
        label values surname surname_lbl
        label def surname_lbl 1 "ALI", modify
        label def surname_lbl 2 "BLAKEMORE", modify
        label def surname_lbl 3 "RAMANI", modify
        label def surname_lbl 4 "ROWLANDS", modify
        label def surname_lbl 5 "DRURY", modify
        At least to the experienced Stata user, the answer is now obvious.
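
        For anyone who wants to check this in code rather than by eye, here is a small sketch that assumes the dataex example above has just been run. The names lbl and surname_str are made up for illustration.

        Code:
        * surname is stored as a byte, so this passes without error
        confirm numeric variable surname
        * the displayed names come from the attached value label
        local lbl : value label surname
        display "surname carries value label: `lbl'"
        * decode creates a real string copy of the labelled values
        decode surname, generate(surname_str)
        confirm string variable surname_str
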
        Last edited by daniel klein; 14 Aug 2020, 07:44. Reason: Now even my short-form answer appeared after Nicks ...



        • #5
          Thank you Daniel and Nick. That cleared things up both code-wise and understanding-wise. The reference to [P] error is very useful: I hadn't found an explanation of the codes myself until this. And thanks for explaining about dataex. I will investigate before attempting the extra credit question.
