Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Obtaining all data entries in one most recent row per ID

    Hi all,

    First of all, here is my data:

    Code:
    input long id float(gradeCH_SF_HK1 gradeGP_EF_PP gradeGP_EF_ZK1 gradeEN_ZK)
    599518 . . 5.5 .
    599518 . 5   . .
    599518 . .   . 5
    I have several IDs, I chose to show this one for conciseness and because it represents a good example of what I would like to do.

    Right now, the dataset is as follows (it is not xtset: there is no time variable):
    id gradeCH_SF_HK1 gradeGP_EF_PP gradeGP_EF_ZK1 gradeEN_ZK
    599518 . . 5.5 .
    599518 . 5 . .
    599518 . . . 5

    I would like it to be like this:
    id gradeCH_SF_HK1 gradeGP_EF_PP gradeGP_EF_ZK1 gradeEN_ZK
    599518 . . 5.5 .
    599518 . 5 . .
    599518 . 5 5.5 5

    So that all the information by ID is in the most recent observation (row). After that, I should enable me to run
    Code:
    keep if _n==_N
    with all the data being contained in the last row.

    I naively tried running
    Code:
    bysort id: replace grade*[_N]==grade*[_n-1] if grade[_N]==. & grade*[_n-1]!=.
    but got the error "weights not allowed". In addition, I am not even sure that this is the right command to execute to obtain what I want.


    Any ideas on how to proceed?

    Many thanks!
    Maxence Morlet



  • #2
    if nonmissing values don't vary by id,
    Code:
    foreach v of var grade* {
    bys id (`v'): replace `v' = `v'[1]
    }
    Last edited by Øyvind Snilsberg; 19 Nov 2021, 05:52.

    Comment


    • #3
      Code:
      clear
      input long id float(gradeCH_SF_HK1 gradeGP_EF_PP gradeGP_EF_ZK1 gradeEN_ZK)
      599518 . . 5.5 .
      599518 . 5   . .
      599518 . .   . 5
      end
      
      bys id: gen order = _n
      gen valid_index = .
      foreach var of varlist grade* {
          replace valid_index = !missing(`var')*order
          sum valid_index, meanonly
          if r(max) > 0 {
              bys id (order): replace `var' = `var'[r(max)] if _n == _N
          }
      }
      drop valid_index
      list
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------

      Comment


      • #4
        Thank you very much for your prompt reply!

        How would your code change if nonmissing values were to vary by ID?

        Many thanks!
        Maxence

        Comment


        • #5
          Code:
           
           sort id, stable   foreach v of var grade* {     by id: replace `v' = `v'[_n-1] if missing(`v')  }  by id : keep if _n == _N

          Comment


          • #6
            Many thanks to all three of you!

            Comment


            • #7
              #5 should have been

              Code:
              sort id, stable
              
              foreach v of var grade* {
                    by id: replace `v' = `v'[_n-1] if missing(`v')
              }
              
              by id : keep if _n == _N
              But note also

              Code:
              collapse (lastnm)

              Comment

              Working...
              X