  • Problems with referring to rows within the grouping with if-conditions

    Good day to all,

    I am relatively new to Stata and need to write a long code of if-conditions for my research. My goal is to create a new variable (PLBTt3 for example) and populate it with values that are within the group, but outside of the explicit observation.

    I'll share my code with you first, then hopefully it will be more understandable:

    bysort NAME: gen PLBTt3 = 0
    bysort NAME: gen PLBTt2 = 0
    bysort NAME: gen PLBTt1 = 0
    sort NAME year


    by NAME: replace PLBTt3= PLBT[year==1]+1000000 if (PLBT[year==1]<-1000000)&(PLBT[year==2]>1000000)

    by NAME: replace PLBTt2= PLBT[year==2]-1000000 if (PLBT[year==1]<-1000000)&(PLBT[year==2]>1000000)


    .
    .
    .

    I want to tell STATA which observation within the group I want to reference with {year==1}(e.g.). However, Stata does not recognize this "{year==1}" command correctly because, I do not get an error message, but data is "Replaced" even though the conditions are not met.

    Now to my question: Did I miss something? Is {year==1} not the "correct" command to tell STATA the correct row?



    PS. I have been looking for a solution in the forum for a long time, however I had trouble making my search command fit. I also tried to torture ChatGPT for a long time, but it was no use...

    Many thanks in advance,

    Lenny

  • #2
    Addendum: And is there also a way to specify directly in the "replace" code which observation within the grouping I want to "replace"?

    by NAME: replace PLBTt3{year==2} = PLBT[year==1]+1000000 if (PLBT[year==1]<-1000000)&(PLBT[year==2]>1000000)

    thx



    • #3
      If the variable year always takes on the values 1, 2, 3,..., i.e. consecutive integers starting with 1 (and no skips in the series and no repetitions), then you can do this with:
      Code:
      isid NAME year, sort
      by NAME (year): gen PLBTt3 = 0
      by NAME (year): replace PLBTt3= PLBT[1]+1000000 if (PLBT[1]<-1000000)&(PLBT[2]>1000000)
      and the like.
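
      As a minimal sketch of how this behaves, here is the same approach on a made-up dataset. The values of PLBT, the two groups "A" and "B", and the extra line for PLBTt2 are assumptions for illustration only, not the original poster's data:
      Code:
      * Toy data (hypothetical values): two groups, year = 1..4
      clear
      input str1 NAME year PLBT
      "A" 1 -3000000
      "A" 2  2500000
      "A" 3   100000
      "A" 4   200000
      "B" 1  -500000
      "B" 2  4000000
      "B" 3   300000
      "B" 4   400000
      end
      * Check that NAME and year uniquely identify observations, and sort
      isid NAME year, sort
      by NAME (year): gen PLBTt3 = 0
      by NAME (year): gen PLBTt2 = 0
      * Under by:, [1] and [2] are the first and second observations of each group
      by NAME (year): replace PLBTt3 = PLBT[1] + 1000000 if (PLBT[1] < -1000000) & (PLBT[2] > 1000000)
      by NAME (year): replace PLBTt2 = PLBT[2] - 1000000 if (PLBT[1] < -1000000) & (PLBT[2] > 1000000)
      list, sepby(NAME)
      Because the condition only involves PLBT[1] and PLBT[2], it is the same for every observation in a group. In this toy data group A meets it and group B does not, so every row of A is updated and every row of B keeps 0.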

      Quote from #2: "Addendum: And is there also a way to specify directly in the "replace" code which observation within the grouping I want to "replace"?"
      Under the same requirements about the variable year, you could do this as:
      Code:
      by NAME (year): replace PLBTt3 = PLBT[1]+1000000 if (PLBT[1]<-1000000)&(PLBT[2]>1000000) & _n == 2

      Quote from #1: "However, Stata does not recognize this "{year==1}" command correctly because, I do not get an error message, but data is "Replaced" even though the conditions are not met."
      The code you used is legal syntax, so Stata does not give you an error message. It is legal syntax, but it doesn't mean what you think it means. So let's unpack what it means, using
      Code:
      by NAME: replace PLBTt3= PLBT[year==1]+1000000 if (PLBT[year==1]<-1000000)&(PLBT[year==2]>1000000)
      as an example. -year == 1- is a logical expression in Stata. Logical expressions in Stata evaluate to 1 if true and to 0 if false. So in any observation where year is 1, -year == 1- evaluates to 1. In any observation where year is anything but 1, -year == 1- evaluates to 0. Similarly -year == 2- evaluates to 1 in any observation where the value of year is 2, and to 0 in any other observation. So the term PLBT[year==1] will be understood by Stata as PLBT[1] in observations where year is 1, and as PLBT[0] in observations where year is anything else. Now, since Stata is a 1-based subscripting language, PLBT[0] is always missing value. Similarly PLBT[year == 2] will evaluate to PLBT[1] (i.e. the first value of PLBT in the NAME group, not the second, because the subscript is the 1 returned by the logical expression) in any observation where year == 2, and to missing value in any other observation. The other important piece of information you need to remember is that in Stata, missing value is treated as greater than any real number. So missing value > 1000000 will be true, and missing value < -1000000 will be false.

      Putting all of that together, in an observation where year is 1, PLBT[year==2] is missing value, so the second condition is automatically true. The code will therefore replace PLBTt3 by the first observation of PLBT in the NAME group + 1000000 if the first observation of PLBT in the NAME group is less than -1000000, whatever the value of PLBT in year 2, and will leave it unchanged otherwise. In any observation where year is not 1, PLBT[year==1] is missing value, so the first condition is false and PLBTt3 is left unchanged. That is why values get replaced even though the conditions you had in mind are not met.
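
      To see what the subscripts actually pick up, here is a small sketch on one made-up group of four observations. The data values and the helper variables idx1, sub1, sub2 and qualifies are hypothetical names introduced only for this illustration:
      Code:
      * Hypothetical single group with year = 1..4
      clear
      input str1 NAME year PLBT
      "A" 1 -3000000
      "A" 2  2500000
      "A" 3   100000
      "A" 4   200000
      end
      sort NAME year
      * The logical expression itself: 1 where year is 1, otherwise 0
      by NAME: gen idx1 = (year == 1)
      * What PLBT[year==1] and PLBT[year==2] return in each observation
      by NAME: gen sub1 = PLBT[year == 1]
      by NAME: gen sub2 = PLBT[year == 2]
      * The if-condition from #1, evaluated observation by observation
      gen byte qualifies = (sub1 < -1000000) & (sub2 > 1000000)
      list, sepby(NAME)
      In this sketch sub1 is -3000000 only in the year 1 row and missing elsewhere, while sub2 is missing everywhere except the year 2 row, where it holds PLBT[1], not PLBT[2]. Only the year 1 row has qualifies equal to 1, because there the missing sub2 counts as greater than 1000000.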

      If year does not meet the stringent requirements outlined in the first sentence of this post, then the code will need to be more complicated. I would not venture a guess what it would look like without having recourse to example data. If this is your circumstance, please post back showing example data, and use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



      When asking for help with code, always show example data. When showing example data, always use -dataex-.



      • #4
        year == 1 is perfectly legal in Stata as an expression (*) and evaluates to 1 if true and 0 if false for the current observation. A reference to value in observation 0 always returns missing and a reference to value in observation 1 returns what it says, the value in observation 1. However, for what you perhaps want that buys you nothing that helps.

        You don't tell us what is in the variable year.

        It's possible that what you need is (for example)

        Code:
        by NAME: replace PLBTt3= PLBT[1] + 1000000 if PLBT[1]  <= 1000000 &  PLBT[2] > 1000000
        noting that

        in the framework of by: [1] refers to the first observation in each group and so on.

        <- has no meaning in Stata that I know, but is presumably a typo for <=

        You're feeding my prejudice that use of AI wastes more time here than it saves.

        (*) It's not a command, however.

        EDIT Crossed with #3. But Clyde necessarily couldn't test your code and so missed <- which I think is just a typo; otherwise the code in #1 would have failed miserably.
        Last edited by Nick Cox; 26 Apr 2023, 11:53.



        • #5
          First of all, thank you for the quick responses from both of you. I need a little time to work through the inputs.


          @Nick: With the condition <-1000000 I want to express "is less than minus 1 million". Then as a result of the condition, either 1 million is added or subtracted.


          I was able to quickly test Clyde's first proposed code and it seems to work. As correctly assumed, "year" is numbered sequentially from 1-4 (four observations per group). All observations within the group (where the conditions matched) now have the correct value in the variable "PLBTt3". I will try to apply this scheme to all commands. Thanks a lot!



          • #6
            Correction: something like <- 1000000 will indeed be interpreted as < -1000000, but I have to suggest that the minus sign would be better placed without a following space.
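
            A quick way to check this interactively, as a sketch with made-up numbers:
            Code:
            * Stata reads <- as the operator < followed by a negative number
            display (-2000000 <-1000000)
            display (-2000000 < -1000000)
            display (500000 <-1000000)
            The first two lines should both display 1 (true) and the third 0 (false), so the two spellings are equivalent. The version with a space before the minus sign is simply easier to read.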



            • #7
              Good day to all,

              I have again a small question about my if-conditions.
              With the help of your advice, I was able to make great progress in the meantime, but after extensive error analysis, I found that some of the if-sentences in the conditions "interfere" with each other. To fix this error, I need to run some of the if-sentences "simultaneously".

              Code:
              by NAME (year): replace PLBTt3 = PLBTt3+1000000 if (PLBTt3<-1000000 & PLBTt2>1000000)
              by NAME (year): replace PLBTt2 = PLBTt2-1000000 if (PLBTt3<-1000000 & PLBTt2>1000000)
              Do you have any idea how I can modify the above code to have the respective commands run simultaneously based on the same condition?


              Thanks again and have a nice Sunday!



              • #8
                Commands run in sequence. There is no sense in which they can run simultaneously. But what is it that you want?

                Note first that the by: prefix makes no difference to your results. Let's also simplify units by dividing by 1 million. So, you seek

                Code:
                replace PLBTt3 = PLBTt3 +1 if PLBTt3 < -1 & PLBTt2 > 1
                replace PLBTt2 = PLBTt2 -1 if PLBTt3 < -1 & PLBTt2 > 1
                If you explain the logic here, a code solution may be forthcoming. Perhaps what you need is a loop, or a different data structure.

                (Perhaps PLBTt3 and PLBTt2 are evocative names to people in your field who can help more.)



                • #9
                  Thank you for your quick reply!

                  My plan is to shift 1 million from one variable to another, in this specific case from PLBTt2 to PLBTt3.

                  Unfortunately the following command isn't working, but I think it will visualise my goal.

                  Code:
                  by NAME (year): replace PLBTt2 = PLBTt2 - 1000000 & replace PLBTt3 = PLBTt3 + 1000000 if (PLBTt3<-1000000 & PLBTt2>1000000)
                  Is there a code solution you can think of? I think there must be, but I wasn’t able to find one online.



                  • #10
                    No, you can't combine commands like that.

                    The problem sounds like adding 1 million to one and subtracting 1 million from the other. It shouldn't matter in which order that happens.

                    Can you give a worked example with data before and data after if that is not the answer?



                    • #11
                      Actually the order does matter, because after the first command some observations no longer fulfill the if-condition of the second command. As a result, one million is added without being subtracted.
                      A possible solution would be to generate copies of PLBTt2 and PLBTt3 that are not changed by the replace commands and serve only to keep the condition "stable".
                      I hope you completely understand my problem now. I think there must be a more elegant solution than generating those additional variables.

                      Again, thanks for your help!



                      • #12
                        Sorry, no. I can't follow this at all. You didn't give a worked example. I hope someone else can help.



                        • #13
                          Sorry Nick, I am not fluent in English, nor in STATA.
                          My solution presented above is not pretty, but it works. That is the most important thing.

                          Thanks a lot for your help!



                          • #14
                            I think what O.P. is getting at is that the application of the first command changes the value of PLBTt3, so that the -if- qualifier in the second command no longer functions as he wants; he wants the second command to be based on the original value of PLBTt3. Here's one way to do it:
                            Code:
                            clonevar PLBTt3_original = PLBTt3
                            replace PLBTt3 = PLBTt3 +1 if PLBTt3_original < -1 & PLBTt2 > 1
                            replace PLBTt2 = PLBTt2 -1 if PLBTt3_original < -1 & PLBTt2 > 1
                            drop PLBTt3_original
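
                            Another possible way to get the same effect, sketched here on made-up data in units of 1 million, is to evaluate the condition once into an indicator before anything is changed and let both commands use it. The variable name to_shift and the three data rows are assumptions for illustration:
                            Code:
                            * Hypothetical data in millions
                            clear
                            input PLBTt3 PLBTt2
                            -1.5 2.5
                            -1.5 1.2
                            -0.5 3.0
                            end
                            * Evaluate the condition once, before either variable changes
                            gen byte to_shift = PLBTt3 < -1 & PLBTt2 > 1
                            replace PLBTt3 = PLBTt3 + 1 if to_shift
                            replace PLBTt2 = PLBTt2 - 1 if to_shift
                            drop to_shift
                            list
                            Both replace commands then act on exactly the same observations (the first two rows in this toy data), whatever order they run in. Note that a missing PLBTt2 would count as greater than 1; if that matters in the real data, adding & !missing(PLBTt2) to the indicator would exclude such rows.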



                            • #15
                              Exactly! Thank you.
