Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Assign terms to graduation dates (with multiple dates)

    Hello All,

    I am working with high school graduation and college enrollment dates. Variable is currently a string and looks like 05may2016. I want to assign terms to each graduation and enrollment date but have thousands of entries. For example, if student graduated May 25, 2016 or June 3, 2016 it would be Spring 2016 graduation term. If student enrolled September 7, 2016 or September 23, 2016 it would be Fall 2016 enrollment date. I am not sure what stata code or steps are needed to complete this.

    I am also trying to match high school graduation term with enrollment term and look at the length of time between each.



  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str9 event_date
    "05may2016"
    "3jun2016"
    "7sep2016"
    "23sep2016"
    end
    
    gen date = daily(event_date, "DMY")
    gen year = year(date)
    gen quarter = quarter(date)
    gen term = cond(quarter <= 2, "Spring ", "Fall ") + string(year)
    This code assumes that everything in the first two quarters of the year is the Spring semester, and everything in the last two is the Fall semester. If there is also a Summer term, or if the dividing line between Spring and Fall is drawn differently, then the code would need to be modified accordingly. It might involve looking at -month(date)- rather than -quarter(date)-.

    In the future, when asking for help with code, please show example data. And use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.
    Last edited by Clyde Schechter; 20 Mar 2023, 16:05.

    Comment


    • #3
      Clyde Schechter Thank you for the information, stata is fairly new to me. This is what results with dataex

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long high_school_grad_date
      20248
      20248
      20248
      20248
      end
      format %td high_school_grad_date
      I am working with a long data set that has multiple rows per student ID

      Comment


      • #4
        You misspoke in #1 when you said your date variable was a string. It isn't! That's why -dataex- is so important: it reveals these details that are often essential for getting the code right.

        Since your date variable actually isn't a string, the first line of code I suggested is not needed. All you need is:
        Code:
        gen year = year(high_school_grad_date)
        gen quarter = quarter(high_school_grad_date)
        gen term = cond(quarter <= 2, "Spring ", "Fall ") + string(year)

        Comment


        • #5
          Clyde Schechter

          Thanks for the clarification. From your earlier post, there is a difference between Spring, Summer and Fall. I am trying to differentiate by month vs quarter.
          Last edited by Alejandra Gomez; 20 Mar 2023, 16:55.

          Comment


          • #6
            OK. Let me assume that January through June are Spring, July and August, Summer, and September through December, Fall.
            Code:
            gen year = year(high_school_grad_date)
            gen month = month(high_school_grad_date)
            gen term = "Spring " if month <= 6
            replace term = "Summer " if inlist(month, 7, 8)
            replace term = "Fall " if month >= 9
            replace term = term + string(year)

            Comment


            • #7
              Clyde Schechter Thank you so much for this code! It worked great.

              I have another variable that I am trying to use the same code on. I would like degree_date to match graduation_date to be able to perform the same code. The code I am using is

              Code:
              foreach var of varlist degree_date {
              replace `var' = date(string(`var' ,"%08.0f"),"YMD")
              format %td `var' 
              }
              It is returning some and marking others as missing, where there is data. I am not sure where my mistake is occurring or why it is generating missing.

              This is the example for degree_date

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input float degree_date
                     .
                     .
                     .
                     .
                     .
                     .
                     .
                     .
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              20200612
              end
              This is the example for graduation date

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input long graduation_date
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
              22078
              22078
                  .
                  .
                  .
                  .
                  .
                  .
              22426
              22426
                  .
              21168
              21168
              21897
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
              21710
              21715
                  .
                  .
                  .
                  .
                  .
              22261
              22280
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
              21691
              21686
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
                  .
              end
              format %td graduation_date

              Comment


              • #8
                Well the example data you show does not exhibit the problem you describe. Running your code with that data, the only missing values generated correspond to missing values of the original degree_date variable.

                I do have a vague hunch what may be the problem in the larger data set. I notice that you used the display format -%08.0f- in your invocation of the -string()- function. Now, if every entry in the degree_date variable is truly an 8 digit number, YYYYMMDD, then that leading 0 should be unnecessary. Everything will automatically be 8 digits long (as I'm sure you have no degree_dates going back to the year 999 or earlier). So the request fort 0-padding should be unnecessary. But if you do have a date, where a leading 0 on the month or day has been omitted. Say 1/1/2020 is coded as 202011 or something like that, then indeed -daily()- will not be able to parse that and generate a valid date.

                Another possibility is that some of the dates have been miscoded, entered as YYYYDDMM. When Stata then encounters 20202506, it is not going to think "Oh, this is an exception to the "YMD" instruction I was given." It is going to simply say "this is not a valid YMD date" and return a missing value.

                To find and inspect the values of degree_date that are causing the problem, you can run:
                Code:
                gen probe = daily(string(degree_date, "%08.0f"), "YMD")
                browse degree_date if missing(probe) & !missing(degree_date)
                Note that both of the above are potentially very serious data problems. If the problem is 0 dropping, then if we have something like 2020123 as the invalid form, there is no way to know if this is supposed to be 3 December 2020 or 23 January 2020. And if the problem is a date misentered in YDM format, then you have to worry that some other entry that was accepted by Stata, say 20200605, although interpreted as YMD (5 June 2020), might actually have also been a miscode for 6 May 2020. In fact this would cast doubt on all of the observations that Stata accepted as YMD if their day of the month is in the 1 through 12 range.

                Those are the two most likely explanations. If neither of them is truly the case in your data, then you definitely need to post back with example data that produces the problem so that I can try to troubleshoot.
                Last edited by Clyde Schechter; 23 Mar 2023, 14:57.

                Comment


                • #9
                  Clyde Schechter thanks for your help.

                  My data set contains college enrollment data by term, multiple instances per student ID. If a student enrolled 10 terms in college, there will be 10 date rows per ID. Graduation term date is in a separate column (graduation_date). The graduated variable is marked N or Y if a student graduated college. I created a variable (degree) to note N = 0 and Y = 1. Then ran code
                  Code:
                  bys your_unique_identifier: egen college_grad=max(degree)
                  to create college_grad and carry over that 1 to each observation by student ID.

                  I want to now carry over graduation_date by each observation where college_grad is 1. I did this with code
                  Code:
                  bys your_unique_identifier: egen degree_date=max(graduation_date)
                  . It generates 15,150 missing values (dataset contains 27,384 observations)

                  I tried removing the leading 0 from
                  Code:
                  foreach var of varlist degree_date {
                  replace `var' = date(string(`var' ,"%8.0f"),"YMD")
                  format %td `var'
                  }
                  and it did generate observations but I am missing 1,265 from my variable. ID # 1111001794 is an example below. This student has a Y for graduated and a graduation date but it did not return a date when I used this code.

                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input str11 your_unique_identifier str1 graduated long graduation_date float(degree college_grad degree_date)
                  "1111001546_" "N"        . 0 0     .
                  "1111001615_" "N"        . 0 0     .
                  "1111001615_" "N"        . 0 0     .
                  "1111001615_" "N"        . 0 0     .
                  "1111001615_" "N"        . 0 0     .
                  "1111001615_" "N"        . 0 0     .
                  "1111001776_" "N"        . 0 0     .
                  "1111001776_" "N"        . 0 0     .
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "Y" 20200612 1 1 22078
                  "1111001780_" "Y" 20200612 1 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001780_" "N"        . 0 1 22078
                  "1111001785_" "Y" 20210526 1 1 22426
                  "1111001785_" "Y" 20210526 1 1 22426
                  "1111001787_" "N"        . 0 1 21169
                  "1111001787_" "Y" 20171215 1 1 21169
                  "1111001787_" "Y" 20171215 1 1 21169
                  "1111001789_" "Y" 20191214 1 1 21897
                  "1111001790_" "N"        . 0 0     .
                  "1111001791_" "N"        . 0 0     .
                  "1111001791_" "N"        . 0 0     .
                  "1111001791_" "N"        . 0 0     .
                  "1111001791_" "N"        . 0 0     .
                  "1111001791_" "N"        . 0 0     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "Y" 20190610 1 1     .
                  "1111001794_" "Y" 20190615 1 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "N"        . 0 1     .
                  "1111001794_" "Y" 20201212 1 1     .
                  "1111001794_" "Y" 20201231 1 1     .
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001800_" "Y" 20190522 1 1 21691
                  "1111001800_" "Y" 20190517 1 1 21691
                  "1111001800_" "N"        . 0 1 21691
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  "1111001804_" "N"        . 0 0     .
                  end
                  format %td degree_date

                  Comment


                  • #10
                    Hmm! I have to admit that I fail to see what is going wrong here. I can reproduce your result. But I do not see why the code does not produce the desired result. For some reason the command that replaces degree_date inside the loop does not convert all of the values. Now, the loop is unnecessary since you are only iterating it once: your varlist contains only one variable. But even stripping away the loop and trying again, the same values of degree_date go, mysteriously, unconverted.

                    Sometimes when one is stuck on code that is failing and can't see the problem after a concerted effort, it is better to delete it and start writing from scratch. Sometimes even if you retype what you believe is exactly the same code, it then works. In this case, given the lack of necessity for the loop, I didn't rewrite exactly the same code. Anyway, the following does work (and the last line verifies that every college graduate gets a value for degree_date:
                    Code:
                    * Example generated by -dataex-. For more info, type help dataex
                    clear
                    input str11 your_unique_identifier str1 graduated long graduation_date float degree
                    "1111001546_" "N"        . 0
                    "1111001615_" "N"        . 0
                    "1111001615_" "N"        . 0
                    "1111001615_" "N"        . 0
                    "1111001615_" "N"        . 0
                    "1111001615_" "N"        . 0
                    "1111001776_" "N"        . 0
                    "1111001776_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "Y" 20200612 1
                    "1111001780_" "Y" 20200612 1
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001780_" "N"        . 0
                    "1111001785_" "Y" 20210526 1
                    "1111001785_" "Y" 20210526 1
                    "1111001787_" "N"        . 0
                    "1111001787_" "Y" 20171215 1
                    "1111001787_" "Y" 20171215 1
                    "1111001789_" "Y" 20191214 1
                    "1111001790_" "N"        . 0
                    "1111001791_" "N"        . 0
                    "1111001791_" "N"        . 0
                    "1111001791_" "N"        . 0
                    "1111001791_" "N"        . 0
                    "1111001791_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "Y" 20190610 1
                    "1111001794_" "Y" 20190615 1
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "N"        . 0
                    "1111001794_" "Y" 20201212 1
                    "1111001794_" "Y" 20201231 1
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "N"        . 0
                    "1111001800_" "Y" 20190522 1
                    "1111001800_" "Y" 20190517 1
                    "1111001800_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    "1111001804_" "N"        . 0
                    end
                    
                    by your_unique_identifier, sort: egen college_grad = max(degree)
                    gen graduation_date_sif = date(string(graduation_date, "%8.0f"), "YMD")
                    format graduation_date_sif %td
                    drop graduation_date
                    by your_unique_identifier: egen degree_date = max(graduation_date_sif)
                    format degree_date %td
                    assert !missing(degree_date) if college_grad
                    N.B. The _sif suffix I used stands for Stata internal format.

                    Comment


                    • #11
                      Clyde Schechter Thank you so much! This worked perfectly! I could not figure out what was going wrong and was just stuck on that code. The loop works for my other variables, thank you again. Being stuck on code is frustrating and you have helped out a great deal.

                      Comment


                      • #12
                        Clyde Schechter This has been super helpful. I have a follow up, I am trying to recreate this same code on a second spreadsheet and upon browsing my data, notice that graduation_date_sif for some instances do not match degree_date, it doesn't pull the same data. If the graduation date is in 2018 or 2019, it will generate a 2022 degree date, throwing off my data. Would you be able to help troubleshoot? I notice the first instance in row 221, which is larger than the dataex shows

                        Code:
                        * Example generated by -dataex-. For more info, type help dataex
                        clear
                        input double randomid str1 graduated long graduation_date float(degree college_grad graduation_date_sif degree_date)
                        2103300231 "N"        . 0 0     .     .
                        2103300237 "N"        . 0 0     .     .
                        2103300243 "N"        . 0 0     .     .
                        2103300255 "N"        . 0 0     .     .
                        2103300255 "N"        . 0 0     .     .
                        2103300255 "N"        . 0 0     .     .
                        2103300255 "N"        . 0 0     .     .
                        2103300255 "N"        . 0 0     .     .
                        2103300255 "N"        . 0 0     .     .
                        2103300255 "N"        . 0 0     .     .
                        2103300264 "N"        . 0 0     .     .
                        2103300264 "N"        . 0 0     .     .
                        2103300264 "N"        . 0 0     .     .
                        2103300264 "N"        . 0 0     .     .
                        2103300264 "N"        . 0 0     .     .
                        2103300264 "N"        . 0 0     .     .
                        2103300264 "N"        . 0 0     .     .
                        2103300267 "N"        . 0 0     .     .
                        2103300267 "N"        . 0 0     .     .
                        2103300267 "N"        . 0 0     .     .
                        2103300267 "N"        . 0 0     .     .
                        2103300267 "N"        . 0 0     .     .
                        2103300267 "N"        . 0 0     .     .
                        2103300267 "N"        . 0 0     .     .
                        2103300267 "N"        . 0 0     .     .
                        2103300282 "N"        . 0 0     .     .
                        2103300291 "N"        . 0 0     .     .
                        2103300321 "N"        . 0 0     .     .
                        2103300336 "N"        . 0 0     .     .
                        2103300336 "N"        . 0 0     .     .
                        2103300426 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300477 "N"        . 0 0     .     .
                        2103300495 "N"        . 0 0     .     .
                        2103300495 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300552 "N"        . 0 0     .     .
                        2103300561 "N"        . 0 0     .     .
                        2103300561 "N"        . 0 0     .     .
                        2103300561 "N"        . 0 0     .     .
                        2103300561 "N"        . 0 0     .     .
                        2103300561 "N"        . 0 0     .     .
                        2103300561 "N"        . 0 0     .     .
                        2103300585 "N"        . 0 0     .     .
                        2103300585 "N"        . 0 0     .     .
                        2103300585 "N"        . 0 0     .     .
                        2103300585 "N"        . 0 0     .     .
                        2103300591 "N"        . 0 0     .     .
                        2103300594 "N"        . 0 0     .     .
                        2103300600 "N"        . 0 0     .     .
                        2103300603 "N"        . 0 0     .     .
                        2103300855 "N"        . 0 0     .     .
                        2103300861 "N"        . 0 0     .     .
                        2103300861 "N"        . 0 0     .     .
                        2103300861 "N"        . 0 0     .     .
                        2103300861 "N"        . 0 0     .     .
                        2103300861 "N"        . 0 0     .     .
                        2103300861 "N"        . 0 0     .     .
                        2103300861 "N"        . 0 0     .     .
                        2103300861 "N"        . 0 0     .     .
                        3333004638 "N"        . 0 0     .     .
                        3333004845 "N"        . 0 0     .     .
                        3333004845 "N"        . 0 0     .     .
                        3333004845 "N"        . 0 0     .     .
                        3333004845 "N"        . 0 0     .     .
                        3333004845 "N"        . 0 0     .     .
                        3333005328 "N"        . 0 0     .     .
                        3333005328 "N"        . 0 0     .     .
                        3333005340 "Y" 20200612 1 1 22078 22078
                        3333005340 "Y" 20200612 1 1 22078 22078
                        3333005340 "N"        . 0 1     . 22078
                        3333005340 "N"        . 0 1     . 22078
                        3333005340 "N"        . 0 1     . 22078
                        3333005340 "N"        . 0 1     . 22078
                        3333005340 "N"        . 0 1     . 22078
                        3333005340 "N"        . 0 1     . 22078
                        3333005340 "N"        . 0 1     . 22078
                        3333005340 "N"        . 0 1     . 22078
                        3333005340 "N"        . 0 1     . 22078
                        end
                        format %td graduation_date_sif
                        format %td degree_date

                        Comment


                        • #13
                          -dataex- allows the use of -if- qualifiers and -in- qualifiers. Please post back with a new data example that includes observations that exhibit the problem. I can't troubleshoot code that appears to work fine with the example data given. Also show the exact code you are using with it.

                          Comment


                          • #14
                            Clyde Schechter randomid 3333005442 below is an example

                            Code:
                            * Example generated by -dataex-. For more info, type help dataex
                            clear
                            input double randomid str1 graduated long graduation_date float(degree college_grad graduation_date_sif degree_date)
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005415 "N"     . 0 0     .     .
                            3333005418 "N"     . 0 0     .     .
                            3333005418 "N"     . 0 0     .     .
                            3333005418 "N"     . 0 0     .     .
                            3333005418 "N"     . 0 0     .     .
                            3333005418 "N"     . 0 0     .     .
                            3333005418 "N"     . 0 0     .     .
                            3333005418 "N"     . 0 0     .     .
                            3333005418 "N"     . 0 0     .     .
                            3333005442 "Y" 21328 1 1 21328 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "Y" 21328 1 1 21328 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "Y" 22786 1 1 22786 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "Y" 22074 1 1 22074 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "N"     . 0 1     . 22786
                            3333005442 "Y" 22778 1 1 22778 22786
                            3333005442 "Y" 22079 1 1 22079 22786
                            3333005442 "Y" 21328 1 1 21328 22786
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "Y" 21914 1 1 21914 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "Y" 21168 1 1 21168 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "Y" 21168 1 1 21168 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "Y" 22870 1 1 22870 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "Y" 21168 1 1 21168 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "Y" 22888 1 1 22888 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005445 "N"     . 0 1     . 22888
                            3333005448 "N"     . 0 0     .     .
                            3333005448 "N"     . 0 0     .     .
                            3333005448 "N"     . 0 0     .     .
                            3333005448 "N"     . 0 0     .     .
                            3333005448 "N"     . 0 0     .     .
                            3333005448 "N"     . 0 0     .     .
                            3333005448 "N"     . 0 0     .     .
                            3333005448 "N"     . 0 0     .     .
                            3333005451 "N"     . 0 0     .     .
                            3333005460 "N"     . 0 0     .     .
                            3333005475 "N"     . 0 0     .     .
                            3333005475 "N"     . 0 0     .     .
                            3333005475 "N"     . 0 0     .     .
                            3333005475 "N"     . 0 0     .     .
                            3333005475 "N"     . 0 0     .     .
                            3333005478 "Y" 22043 1 1 22043 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "Y" 22043 1 1 22043 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "N"     . 0 1     . 22043
                            3333005478 "Y" 20965 1 1 20965 22043
                            end
                            format %td graduation_date
                            format %td graduation_date_sif
                            format %td degree_date
                            Another set of observations

                            Code:
                            * Example generated by -dataex-. For more info, type help dataex
                            clear
                            input double randomid str1 graduated long graduation_date float(degree college_grad graduation_date_sif degree_date)
                            3333005481 "N"     . 0 0     .     .
                            3333005481 "N"     . 0 0     .     .
                            3333005514 "N"     . 0 0     .     .
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "Y" 22867 1 1 22867 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "Y" 22869 1 1 22869 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "Y" 21692 1 1 21692 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005523 "N"     . 0 1     . 22869
                            3333005574 "N"     . 0 0     .     .
                            3333005613 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005631 "N"     . 0 0     .     .
                            3333005637 "Y" 22044 1 1 22044 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "Y" 22043 1 1 22043 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005637 "N"     . 0 1     . 22044
                            3333005655 "N"     . 0 0     .     .
                            3333005655 "N"     . 0 0     .     .
                            3333005664 "N"     . 0 0     .     .
                            3333005664 "N"     . 0 0     .     .
                            3333005664 "N"     . 0 0     .     .
                            3333005664 "N"     . 0 0     .     .
                            3333005664 "N"     . 0 0     .     .
                            3333005664 "N"     . 0 0     .     .
                            3333005664 "N"     . 0 0     .     .
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "Y" 22049 1 1 22049 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005781 "Y" 21771 1 1 21771 22049
                            3333005781 "Y" 22049 1 1 22049 22049
                            3333005781 "N"     . 0 1     . 22049
                            3333005817 "N"     . 0 0     .     .
                            3333005817 "N"     . 0 0     .     .
                            3333005817 "N"     . 0 0     .     .
                            3333005817 "N"     . 0 0     .     .
                            3333005817 "N"     . 0 0     .     .
                            3333005817 "N"     . 0 0     .     .
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "Y" 21342 1 1 21342 22280
                            3333006099 "Y" 22280 1 1 22280 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            3333006099 "N"     . 0 1     . 22280
                            end
                            format %td graduation_date
                            format %td graduation_date_sif
                            format %td degree_date
                            Last edited by Alejandra Gomez; 26 Mar 2023, 17:30.

                            Comment


                            • #15
                              I don't see the problem in this data. In #1 you said that if there are multiple values of graduation_date, use the latest. And that is what you have here. To use the example of randomid 3333005442, that student has 5 graduation dates: 24may2018, 8jun2020, 13jun2020, 13may2022, and 21may2022. Of those, the latest is 21may2022, and that is the one value shown in degree_date. If this is not what you wanted, then you need to more clearly explain what you do need. In addition to explaining in words, it would be a good idea if you would show exactly what you want the result to be for 3333005442.
                              Last edited by Clyde Schechter; 26 Mar 2023, 20:43.

                              Comment

                              Working...
                              X