
  • if loop vs. if suffix

    I'm confused about the difference between using if to group several statements and using if at the end of each statement. It seems to me they should do the same thing, but they don't.

    For example, sometimes I want to replace the values of several variables in a subset of data defined by some variable. It seems to me I should be able to do this:
    Code:
    sysuse auto, clear
    if foreign==1 {
        replace mpg = 0
        replace displacement = 0
    }
    tabstat mpg displacement, by(foreign)
    Which is elegant but doesn't change the values of mpg and displacement at all. Why not?

    Instead, it seems I have to do this:
    Code:
    replace mpg = 0 if foreign==1
    replace displacement = 0 if foreign==1
    tabstat mpg displacement, by(foreign)
    Which is effective but less readable, and would be a pain in the neck if there were more replace statements or a more complicated condition than if foreign==1.

    What's the story here? Why doesn't the first solution do what I expect, and does it do anything at all?

  • #2
    The if programming command evaluates its expression only once, so when you give it a variable, it checks only the first observation. So

    Code:
    if foreign==1 {
        replace mpg = 0
        replace displacement = 0
    }
    Is equivalent to:
    Code:
    if foreign[1]==1 {
        replace mpg = 0
        replace displacement = 0
    }
    The first observation of foreign in auto.dta is 0, so it doesn't do anything. If you changed it to -if foreign == 0-, all the values of mpg and displacement would be replaced with 0.
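
    For the original goal, one possible approach is to keep the if qualifier on every replace but let a loop write the statements. This is only a sketch: the local macro name cond and the loop variable v are illustrative names, not anything Stata requires.
    Code:
    sysuse auto, clear
    * store the condition once so it is easy to change later
    local cond "foreign == 1"
    * apply the same conditional replace to each listed variable
    foreach v of varlist mpg displacement {
        replace `v' = 0 if `cond'
    }
    tabstat mpg displacement, by(foreign)
    The condition lives in one place, yet it is still evaluated observation by observation because it reaches replace as a qualifier.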



    • #3
      Ali has given excellent advice, and this behaviour is explained under -help ifcmd-. The -if- command should be treated like any other command: you pass it a single, static condition, and it executes (or skips) the contents of the braces. The -if- qualifier, by contrast, is applied dynamically, observation by observation, to the command it is attached to.

      With regard to this behaviour, the -if/else- command operates in exactly the same way as in every other programming language with which I have some familiarity.
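
      As a rough illustration of a "single, static condition", here is a sketch in which the -if- command tests a scalar result rather than a variable. It assumes the auto data and uses r(N), the count that -count- leaves behind; the messages are made up for the example.
      Code:
      sysuse auto, clear
      * count evaluates the qualifier on every observation and stores the total in r(N)
      count if foreign == 1
      * r(N) is a single number, so the -if- command has one unambiguous value to test
      if r(N) > 0 {
          display "foreign cars found: " r(N)
      }
      else {
          display "no foreign cars in the data"
      }
      Here the qualifier does the row-by-row work and the command only decides which branch of the program to run.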



      • #4
        Thanks for clarifying this. This seems like really bizarre behavior to me, and unlike Leonardo, I've never encountered similar behavior in another language. SAS for example does not work like this.



        • #5
          Ali explained all there is to what is happening.

          I find it helpful to think of

          Code:
          if something {
          do other things
          }
          as a scalar logical check.

          On the other hand
          Code:
          do other things if something
          is a vector logical check which is verified row by row on each observation.

          Plus, whenever you refer to a variable (a vector) in a scalar context, Stata simply picks the first observation.
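
          One way to see the scalar check at work, sketched here on the auto data, is that the outcome depends on which observation happens to be first. The displayed messages are made up for the example.
          Code:
          sysuse auto, clear
          * as shipped, foreign[1] is 0, so this body is skipped
          if foreign == 1 {
              display "body ran before sorting"
          }
          * put the foreign cars first, so foreign[1] becomes 1
          gsort -foreign
          * same test, different first observation: this body now runs
          if foreign == 1 {
              display "body ran after sorting"
          }
          * the vector check is unaffected by sort order
          count if foreign == 1
          The count is the same before and after sorting, whereas the if command flips purely because of the sort order.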



          • #6
            Originally posted by paulvonhippel:
            Thanks for clarifying this. This seems like really bizarre behavior to me, and unlike Leonardo, I've never encountered similar behavior in another language. SAS for example does not work like this.
            What I meant was that programming languages like C, Python, and Java use if/else structures to control the logical flow of program execution, so these decision points are expected to be static. Stata works the same way with its implementation of the -if/else- command.

            You're right about SAS being different, though (and I don't know why it didn't come to mind), but I always consider SAS a beast of a different stripe because of its choice to treat steps and procedures as isolated code blocks. In the DATA step, -if/else- logic operates iteratively over the program data vector, which is similar in application to the -if- qualifier in Stata. The analog in SAS to Stata's -if/else- command would be the SAS macro %IF/%ELSE. SAS is non-standard in many ways, though, which is why I prefer Stata.



            • #7
              I have never used SAS. In Matlab and in R the IFs work exactly as "[P] if programming command" works in Stata.



              • #8
                Where Stata is puzzling, until you work out what is going on, is in using the value in the first observation only.

                The meta-issue I take to be that it is not the job of a syntax parser to think about the meaning of some code and warn you that it is unlikely to be a good idea in Stata terms.

                The same issue is apparent in display where using display with a bare variable name yields the value in the first observation, which could be useful in some circumstances.
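
                A small sketch of that display behaviour, assuming the auto data: a bare variable name is read as its first observation, and explicit subscripts pick out any other one.
                Code:
                sysuse auto, clear
                * a bare variable name in display is read as mpg[1]
                display mpg
                display mpg[1]
                * explicit subscripts reach other observations, here the last one
                display mpg[_N]
                * a command that works over all observations, for comparison
                summarize mpg
                The first two display lines show the same value, which is the same shortcut the if command takes.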

                There is an FAQ on this point, although as I've often remarked the supposed question is usually the answer to a different question.

                https://www.stata.com/support/faqs/p...-if-qualifier/

