Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Alternatives to generating variables using min or max as conditionals

    Hi, I understand that the min or max function cannot be served as conditions; only if would work for conditions. However, I am really struggling to make a comparison between observations with the same ID group in different rows.

    Aims:
    I want to generate a new variable that contains one score per ID. To do so, I need to compare A_year and B_year; if B_year is before A_year, then the corresponding score would be effective, until there's a new score happens later (in still in B_year). Ultimately, I need the largest score that happens before A_year and I'll keep one score only. Is it possible to generate such a variable?

    Many thanks for all your help in advance!

    Below is a sample of my data:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(ID A_year B_year score)
     1 1997 1958  1
     1 1997 1976  2
     1 1997 1977  3
     1 1997 1983  4
     2 2013 2014  1
     4 2020 2000  2
     4 2020 2020  3
     5 1968 1965  1
     5 1968 1969  2
     5 1968 1972  3
     5 1968 1976  4
     6 2003 2015  2
     6 2003 2016  4
     7 2003 2019  1
     8 2003 2009  2
    16 1963 1962  1
    16 1963 1964  2
    16 1963 1971  3
    16 1963 1976  4
    18 2001 2002  2
    18 2001 2009  3
    21 1971 1965  1
    21 1971 1969  2
    21 1971 1972  3
    21 1971 1976  4
    25 1995 2014  2
    25 1995 2019  4
    27 1966 1949  1
    27 1966 1953  2
    27 1966 1954  4
    27 1966 1960  5
    27 1966 1961  6
    27 1966 1967  8
    27 1966 1978  9
    28 1935 1940  1
    28 1935 1941  2
    28 1935 1943  3
    33 1951 1948  2
    33 1951 1952  3
    33 1951 1955  5
    33 1951 1956  6
    33 1951 1964  7
    38 1951 1945  1
    38 1951 1946  3
    38 1951 1951  4
    38 1951 1954  5
    38 1951 1955  6
    38 1951 1958  7
    38 1951 1960  8
    38 1951 1961 10
    40 2002 2000  1
    41 1985 1981  1
    41 1985 1989  2
    41 1985 1991  3
    41 1985 2003  4
    41 1985 2005  5
    41 1985 2007  7
    41 1985 2012  8
    41 1985 2014  9
    41 1985 2020 10
    49 1979 1992  1
    49 1979 2001  2
    49 1979 2002  3
    50 1986 1998  2
    50 1986 2010  3
    51 1950 1950  2
    51 1950 1951  4
    51 1950 1953  5
    51 1950 1973  6
    53 1930 1929  2
    53 1930 1931  4
    53 1930 1932  5
    54 1979 1973  2
    54 1979 1975  3
    54 1979 1980  4
    55 1976 1977  1
    57 1984 1976  2
    57 1984 1985  4
    57 1984 1997  5
    58 1974 1962  1
    58 1974 1964  2
    58 1974 1971  3
    58 1974 1976  4
    59 1999 2000  2
    59 1999 2020  3
    60 2007 1992  1
    60 2007 2001  2
    60 2007 2002  3
    63 2001 2015  2
    63 2001 2016  4
    65 2012 2013  1
    69 1959 1945  1
    69 1959 1964  2
    73 1977 1978  2
    73 1977 1979  3
    73 1977 1985  4
    73 1977 1987  5
    73 1977 1990  6
    73 1977 1995  7
    73 1977 2012  8
    end


  • #2
    Code:
    gen upper = A_year - 1
    rangestat (max) wanted = score, by(ID) interval(B_year 0 upper)
    by ID (wanted), sort: replace wanted = wanted[1]
    sort ID B_year
    -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer. It is available from SSC.

    The code relies on the assumption that the value of A_year is constant within groups of observations having the same value of ID. That is true in the example data, and is necessary for the question as posed to even make sense. You should verify that it is true in your full data set before using this code.
    Last edited by Clyde Schechter; 30 Nov 2022, 06:51.

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      Code:
      gen upper = A_year - 1
      rangestat (max) wanted = score, by(ID) interval(B_year 0 upper)
      by ID (wanted), sort: replace wanted = wanted[1]
      sort ID B_year
      -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer. It is available from SSC.

      The code relies on the assumption that the value of A_year is constant within groups of observations having the same value of ID. That is true in the example data, and is necessary for the question as posed to even make sense. You should verify that it is true in your full data set before using this code.
      Dear Clyde Schechter, thank you so much for your help!!! It works perfectly! Thank you!!!

      Comment


      • #4
        Here is an alternative solution, with the slight advantage of using only an inbuilt command:

        Code:
        bysort ID: egen wanted2 = max(cond(B_year<A_year,score,.))

        Comment


        • #5
          Originally posted by Hemanshu Kumar View Post
          Here is an alternative solution, with the slight advantage of using only an inbuilt command:

          Code:
          bysort ID: egen wanted2 = max(cond(B_year<A_year,score,.))
          Thank you very much for your help, Hemanshu Kumar! Seems to work well too!! Thank you!

          Comment


          • #6
            I am reacting belatedly to the statement

            Code:
             I understand that the min or max function cannot be served as conditions; only if would work for conditions
            as I don't follow what is meant there. Nothing stops a call to max() or min() within an if condition.

            Comment


            • #7
              Sorry, the text in #6 should have been quoted, and certainly isn't code. In the interests of readability, here it is again.

              I understand that the min or max function cannot be served as conditions; only if would work for conditions.

              Comment


              • #8
                Originally posted by Nick Cox View Post
                Sorry, the text in #6 should have been quoted, and certainly isn't code. In the interests of readability, here it is again.
                Hi Nick, I saw your post in the link below and that's why I said what you quoted. Sorry if I didn't get it right.
                Now the problem is solved. Thank you!

                https://www.statalist.org/forums/for...s-conditionals

                Comment

                Working...
                X