  • Exiting a loop after a condition is satisfied

    Hi,

    I have a duplicated value in the variable (column) on which I would like to merge.
    So I would like to loop over the rows, replace the first occurrence by adding a "rev_" prefix, and then exit the loop.
    Here is the code I wrote; it replaces both occurrences. Can you help with this?


    Code:
    foreach var of varlist _all {
    replace `var' = "rev_string" if strmatch(`var' , "string")
    continue, break
    }
    Thanks
    Last edited by lamya kejji; 18 May 2016, 08:43.

  • #2
    I am quite puzzled by your indicating that duplication occurs on one variable (Stata vocabulary: column == variable), since your code loops over multiple variables. If you do want to detect duplicates on a single variable, I'd suggest you take a look at -help duplicates- .


    • #3
      copies   observations   surplus
           1             31         0
           2              2         1
      This is the result when I run the command duplicates report var.
      Last edited by lamya kejji; 18 May 2016, 08:50.


      • #4
        Originally posted by Mike Lacy View Post
        I am quite puzzled by your indicating that duplication occurs on one variable (Stata vocabulary: column == variable), since your code loops over multiple variables. If you do want to detect duplicates on a single variable, I'd suggest you take a look at -help duplicates- .
        I see what you mean, Mike. You are right; the code should be:
        Code:
        foreach val in A {
        replace `val' = "rev_string" if strmatch(`val' , "string")
        }


        • #5
          But it still replaces both observations. I would like to keep both of them and add a prefix only to the first occurrence in the A variable.


          • #6
            A loop here over one variable is harmless but pointless, so let's focus on


            Code:
              
            replace A = "rev_string" if strmatch(A , "string")
            If you want that to work only in the first occurrence of the specified string, you can look for it manually by

            Code:
            list A if strmatch(A, "string")
            and changing the value in just the first observation found.

            Otherwise, see

            http://www.stata-journal.com/sjpdf.h...iclenum=dm0025

            http://www.stata-journal.com/sjpdf.h...lenum=dm0025_1


            • #7
              Nothing in your recent code indicates anything about duplicated observations, which makes me wonder if duplicates are your real problem. And, by the way, your replace statement will not work, since the word -replace- must be followed by the name of a variable, not a value of some variable. (-help replace-)

              That being said: *If* you indeed are wanting to change the value of a variable for which duplicating values occur, I realized that -duplicates- is not necessary. Try this:
              Code:
              // Assume variable of interest is named "key", and that a variable named "sequence" indicates the
              // order (first, second, etc.) of observations with the same value for "key."
              bysort key (sequence) : replace key = "rev_string" if (_n > 1)


              • #8
                Mike Lacy : I think the code in #4 should work, or rather is legal so long as there is a variable A

                But I guess that lamya is not showing us exactly what was typed.


                • #9
                  Yes, Nick, you're right. A might actually be a varlist. I assumed too much about the mnemonic "val." - Mike


                  • #10
                    A        B
                    string   value3
                    value1   value4
                    string   value5
                    value2   value6
                    I am sorry for not being precise. Let's say that A is a column and that the value "string" is duplicated in this column.
                    I would like to rename the first occurrence of "string" by adding the prefix "rev_".
                    Code:
                    foreach val in A {
                    replace `val' = "rev_string" if strmatch(`val' , "string")
                    continue, break
                    }
                    Last edited by lamya kejji; 18 May 2016, 11:35.


                    • #11
                      Originally posted by Nick Cox View Post
                      A loop here over one variable is harmless but pointless, so let's focus on


                      Code:
                      replace A = "rev_string" if strmatch(A , "string")
                      If you want that to work only in the first occurrence of the specified string, you can look for it manually by

                      Code:
                      list A if strmatch(A, "string")
                      and changing the value in just the first observation found.

                      Otherwise, see

                      http://www.stata-journal.com/sjpdf.h...iclenum=dm0025

                      http://www.stata-journal.com/sjpdf.h...lenum=dm0025_1
                      I don't want to look for it manually because I am writing a script that should work on different files. This value is duplicated in every file, but it is not always in the same position.


                      • #12
                        I think these questions have already been answered.

                        You have one variable, so a loop over that variable is harmless but pointless.

                        You don't really want or need a loop over observations, as you can find the first observation satisfying your condition directly.

                        See #6 again.

                        As you don't want to find values manually, the method explained in the linked paper is next in line. There are two links, but the second is just a correction to the first.
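
For readers following along, here is a minimal sketch of finding the first matching observation directly, without a loop. It is not taken from the linked paper; it only illustrates the general idea. The data are the example values from #10, and obsno is a hypothetical helper variable introduced purely for illustration.

```stata
* Example data from #10
clear
input str10 A str10 B
"string" "value3"
"value1" "value4"
"string" "value5"
"value2" "value6"
end

* obsno (hypothetical helper) records each observation's current position
generate long obsno = _n

* r(min) now holds the smallest observation number where A equals "string"
summarize obsno if A == "string", meanonly

* Prefix only that observation; all other observations are left alone
replace A = "rev_" + A if obsno == r(min)
drop obsno
```

Because -replace- already works on every observation that satisfies the -if- condition, restricting the condition to a single observation number is enough; no explicit exit from a loop is needed.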


                        • #13
                          Sorry for the delay, and thank you Nick and Mike.

                          Here is how I solved it in the end:

                          Code:
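                          // x = number of other observations sharing the same value of A
                          duplicates tag A, gen(x)
                          // id numbers the observations within each value of A
                          bys A : gen id = _n
                          // append id to values that occur exactly twice, e.g. "string1" and "string2"
                          replace A = A+string(id) if x ==1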
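
Note that the code above appears to append a number to both copies of a duplicated value, rather than adding "rev_" to the first one only. A possible variant that matches the original request more literally is sketched below, assuming the example data from #10; obsno and first_dup are hypothetical helper variables, and "first" is taken to mean first in the current sort order of the file.

```stata
* Example data from #10
clear
input str10 A str10 B
"string" "value3"
"value1" "value4"
"string" "value5"
"value2" "value6"
end

* Remember the original order of the observations
generate long obsno = _n

* Flag the first observation, in original order, within each duplicated value of A
bysort A (obsno): generate byte first_dup = (_n == 1) & (_N > 1)

* Add the prefix to the flagged observations only
replace A = "rev_" + A if first_dup

* Restore the original order and drop the helpers
sort obsno
drop obsno first_dup
```

This sketch flags before replacing so that A is not modified while it is serving as the by-variable. Unlike the code above, it leaves the second copy of each duplicated value unchanged.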
