  • Incorrect code

    I have a dataset with around 10,000 observations:

    id var1 var2 var3
    1 P P 0
    2 1 1 2
    3 P 0 0
    4 A P P
    5 P 2 0
    ....
    ....

    and so on.

    var1, var2 and var3 are string variables.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float id str1(var1 var2 var3)
    1 "P" "P" "0"
    2 "1" "1" "2"
    3 "P" "0" "0"
    4 "A" "P" "P"
    5 "P" "2" "0"
    end


    I want to identify those individuals for whom a "0" comes immediately after a "P", i.e. "P" and "0" are observed in consecutive variables.
    Let's say the new variable is named p0.
    Then the new variable will look like this:

    id var1 var2 var3 p0
    1 P P 0 1
    2 1 1 2 0
    3 P 0 0 1
    4 A P P 0
    5 P 2 0 0

    You can see that individual 1 has a p0 value of 1 because "P" and "0" are observed consecutively in var2 and var3. Similarly, individual 3 has a p0 value of 1 because var1 is "P" and var2 is "0", so "P" and "0" come consecutively. However, individual 5 has a p0 value of 0 because "P" and "0" do not come consecutively: var1 is "P", var2 is "2" and var3 is "0".

    To do this, I wrote this code:

    Code:
    gen p0 = 0
    forval i = 1/2 {
        if p0 == 0 {
            if var`i' == "P" & var`i+1' == "0" {
                replace p0 = 1
            }
    }
    }

    But this code does not identify those individuals.

    I have also tried this:

    Code:
    gen p0 = 0
    
    forval i = 1/2 {
        replace p0 = 1 if var`i' == "P" & var`i+1' == "0" & p0 == 0
    }
    This does not work either.
    Can anyone solve this problem?
    Last edited by Akif Alig; 27 Apr 2024, 00:24.

  • #2
    No loop is needed.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float id str1(var1 var2 var3)
    1 "P" "P" "0"
    2 "1" "1" "2"
    3 "P" "0" "0"
    4 "A" "P" "P"
    5 "P" "2" "0"
    end
    
    
    gen wanted = strpos(var1 + var2 + var3, "P0") > 0
    
    list
    
         +----------------------------------+
         | id   var1   var2   var3   wanted |
         |----------------------------------|
      1. |  1      P      P      0        1 |
      2. |  2      1      1      2        0 |
      3. |  3      P      0      0        1 |
      4. |  4      A      P      P        0 |
      5. |  5      P      2      0        0 |
         +----------------------------------+
    With just three variables, this would work:

    Code:
    gen wanted2 = (var1 == "P" & var2 == "0") | (var2 == "P" & var3 == "0")
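
    With many more variables, writing out every pair gets tedious. A minimal sketch, assuming the variables are adjacent in the dataset and each holds exactly one character (as with str1 here), is to concatenate them first with egen's concat() function. The names allvars and wanted_many are made up for illustration.

    ```stata
    clear
    input float id str1(var1 var2 var3)
    1 "P" "P" "0"
    2 "1" "1" "2"
    3 "P" "0" "0"
    4 "A" "P" "P"
    5 "P" "2" "0"
    end

    * join the string variables row by row (allvars is a hypothetical name)
    egen allvars = concat(var1-var3)

    * flag rows in which "P" is immediately followed by "0"
    gen wanted_many = strpos(allvars, "P0") > 0

    * check against the explicit pairwise version for three variables
    gen wanted2 = (var1 == "P" & var2 == "0") | (var2 == "P" & var3 == "0")
    assert wanted_many == wanted2

    list
    ```

    The caveat is that concatenation forgets where one variable ends and the next begins. If values could be empty or longer than one character, "P0" could appear where the two characters do not come from consecutive variables, and the pairwise comparison would be the safer choice.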
    Your second code block can be made to work too:

    Code:
    gen p0 = 0
    
    forval i = 1/2 {
        local j = `i' + 1
        replace p0 = 1 if var`i' == "P" & var`j' == "0"
    }
    or (NB how to increment local macros on the fly)

    Code:
    gen p0 = 0
    
    forval i = 1/2 {
         replace p0 = 1 if var`i' == "P" & var`= `i' + 1' == "0"
    }
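
    A quick way to see why var`i+1' did not do what was hoped is to display both forms. As far as I understand it, `i+1' is read as a reference to a local macro named i+1, which does not exist and so expands to nothing, whereas `= `i' + 1' asks Stata to evaluate the expression first. This is only a sketch for inspection; it changes no data.

    ```stata
    forval i = 1/2 {
        * the first form should show a bare "var", the second var2 and then var3
        display "i = `i': var`i+1' versus var`= `i' + 1'"
    }
    ```

    If the first form does expand to a bare var, then with var1, var2 and var3 in memory that is an ambiguous abbreviation, which would explain an error rather than a silent wrong answer.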
    The first code block confuses what the if command does and what the if qualifier does, and there are some related errors.

    Notably,

    Code:
    if var`i' == "P"
    is always and only ever interpreted with reference to the first observation, as if it were
    Code:
    if var`i'[1] == "P"
    Otherwise put, the forval loop over variables is not somehow also a loop over observations too. (I believe this can be true in Some Alternative Software, but I've never used that.)
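
    To see the two kinds of if side by side on the example data, here is a small sketch. The if command makes one decision, based on the first observation only, while the if qualifier is checked separately for every observation.

    ```stata
    clear
    input float id str1(var1 var2 var3)
    1 "P" "P" "0"
    2 "1" "1" "2"
    3 "P" "0" "0"
    4 "A" "P" "P"
    5 "P" "2" "0"
    end

    * if command: evaluated once, as if we had typed var1[1] == "P"
    if var1 == "P" {
        display "if command: var1[1] is P, so this block runs once"
    }

    * if qualifier: evaluated for each observation in turn
    count if var1 == "P"
    ```

    Here count reports 3, for ids 1, 3 and 5, even though the if command looked only at id 1.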

    The difference between if and if, as it were, is a long and subtle story, but see


    SJ-23-2 st0721 . When to use the if qualifier and when to use the if command
    . . . . . . . . . . . . . . . . . . . . N. J. Cox and C. B. Schechter
    Q2/23 SJ 23(2):589--594 (no commands)
    discusses generally when you should use either the if
    qualifier or an if command and specifically flags a
    common pitfall in using the if command

    If you don't have access to that, there is a longstanding FAQ

    I have an if or while command in my program that only seems to evaluate the first observation. What’s going on?

    https://www.stata.com/support/faqs/p...-if-qualifier/

    To my mind the FAQ is written backwards, as the title is the answer, not the question most people have (in my experience). That question is more often

    The if command isn't working as I expect. Is this my bug or Stata's?

    and even more often

    The if command isn't working as I expect. Is this a bug in Stata?




    Last edited by Nick Cox; 27 Apr 2024, 02:11.



    • #3
      Thank you so much Nick.
      My code certainly reflects my poor coding capabilities.



      • #4
        Don't say that. Capability is evident. The if and if business has bitten many people. I have to say that StataCorp could document it a little better.

        Here is another way to do it:


        Code:
        gen wanted3 = inlist("P0", var1 + var2, var2 + var3)



        • #5
          Thank you so much for your guidance, Nick.
