  • Help with matching exam units

    Hi,
    I am relatively new to Stata and would appreciate some help writing a few lines of code to solve a problem with my data.
    I have a database of students' grades from many different schools.

    Each observation has the following relevant variables:
    • School_id
    • Student_id
    • Year
    • Subject_id
    • Exam_id
    • Exam_units (1-5)
    • New_exam (binary variable, 1 if the Exam_id is new in the given school, and didn't feature in previous years)
    • Grade
    Here is a small sample of my data (a dot marks a missing value):
    Exam_id  subject      School_id  year  Exam_units  grade     New_exam  Student_id
    911551   history      148056     2001  .           23.7      0         93
    35103    math         440248     1997  1           .         1         45
    2211     math         140186     1998  2           17.80488  0         7
    43002    biology      260307     2002  .           -0.89474  0         34
    913061   history      270298     1995  1           .         .         97
    35103    math         260398     1999  1           -4.19231  0         73
    913051   citizenship  440024     2000  1           5.581395  0         5
    905031   literature   440479     1998  1           7         0         75
    11101    literature   140087     1998  1           2.700315  0         50
    905031   literature   344572     2001  1           7.142857  0         72
    904441   literature   144683     2001  2           9.585714  0         86
    911551   history      648071     2001  .           10.62044  0         13
    908653   english      540294     1995  5           .         .         49
    35204    math         640102     1997  2           .         1         87
    845202   computers    140731     1995  2           .         .         77
    911501   history      247031     1998  .           14.45     0         25
    35103    math         480020     1997  1           .         1         17
    913061   history      490037     1999  1           16.3834   0         74
    35302    math         660126     1999  3           1.363636  0         98
    913051   citizenship  140269     1996  1           5.253968  0         73
    Each school has many different exams, subjects and students.

    Each observation receives the same grade as all other observations with the same Year, Exam_id and School. In other words, the grade is a class average, and is not unique to each student.
    In some years, new Exam_ids were introduced and replaced old Exam_ids. For these observations, I would like to do the following:
    1. Create a binary variable sameunits, which equals 1 if the new Exam_id replaced a different Exam_id that the same school used in the previous year with the same Subject_id and the same Exam_units. For example: if in 1995 school no. 72 taught exam no. 82, a 5-unit exam in the subject "Math", and in 1996, the following year, the Exam_id in the school changed (indicated by New_exam==1) so that it now equals 91, while the subject is still "Math" and the exam still has 5 units, then sameunits would equal 1 in the relevant 1996 observations.
    2. In these cases, I would like to switch the grade in exam 91 in year t (for example, in School_id 72 in 1996) with the class-wide grade from the old Exam_id (82) in year t-1, again only if the number of units is the same for both (5, in this case).
    Note that a new exam does not always have a similar exam with the same number of units in the year before. So New_exam can equal 1 even if the exam is entirely new and did not replace any similar old exam.

    I have tried various approaches and sorting methods but have yet to find a solution for this problem.
    Thank you very much.

  • #2
    I'm not sure I understand what you want in 1. It might be something like this:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long exam_id str12 subject long school_id int year byte exam_units float grade byte(new_exam student_id)
    911551 "history "     148056 2001 .     23.7 0 93
     35103 "math "        440248 1997 1        . 1 45
      2211 "math "        140186 1998 2 17.80488 0  7
     43002 "biology "     260307 2002 .  -.89474 0 34
    913061 "history "     270298 1995 1        . . 97
     35103 "math "        260398 1999 1 -4.19231 0 73
    913051 "citizenship " 440024 2000 1 5.581395 0  5
    905031 "literature "  440479 1998 1        7 0 75
     11101 "literature "  140087 1998 1 2.700315 0 50
    905031 "literature "  344572 2001 1 7.142857 0 72
    904441 "literature "  144683 2001 2 9.585714 0 86
    911551 "history "     648071 2001 . 10.62044 0 13
    908653 "english "     540294 1995 5        . . 49
     35204 "math "        640102 1997 2        . 1 87
    845202 "computers "   140731 1995 2        . . 77
    911501 "history "     247031 1998 .    14.45 0 25
     35103 "math "        480020 1997 1        . 1 17
    913061 "history "     490037 1999 1  16.3834 0 74
     35302 "math "        660126 1999 3 1.363636 0 98
    913051 "citizenship " 140269 1996 1 5.253968 0 73
    end
    
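    * Flag a new exam when the preceding observation in the same
    * school/subject/units group is from the year immediately before.
    * Note: [_n-1] is the adjacent observation, so this presumes one
    * observation per group and year.
    by school_id subject exam_units (year), sort: gen same_units ///
        = (new_exam == 1) & year == year[_n-1] + 1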
    In any case, your example data provides no opportunity to test this code since you never even show the same school twice, let alone the more complicated conditions that you are trying to identify with this new same_units variable. If this is not what you wanted, please post back with better example data that at least contains a couple of examples where you will want same_units = 1 and a couple where you will not.

    Also, to make the example data more useful, use the -dataex- command to do it, as I have done here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it.

    -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    As for your second request, I do not understand what "switch the grade" means here. Do you want to replace the grade in the later year with the grade in the earlier year? Or the other way around? Or do you want to swap them?
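
    One possible reading of both requests can be sketched as follows. This is a minimal sketch and has not been run on the real data. The five toy observations are hypothetical and only mirror the school 72 example from the question, and the names prev_exam_id, prev_grade and grade_adj are made up for illustration. It assumes that each school has at most one exam per subject, unit count and year, and that "switch the grade" means using the previous year's class grade in place of the new one. Because the data hold one observation per student, the previous year's exam is looked up with a merge rather than with [_n-1].

```stata
* Hypothetical toy data: school 72 replaces 5-unit math exam 82 with exam 91
* in 1996, and also introduces a 3-unit math exam 60 with no predecessor.
clear
input long exam_id str12 subject long school_id int year byte exam_units float grade byte(new_exam student_id)
82 "math" 72 1995 5 80 0 1
82 "math" 72 1995 5 80 0 2
91 "math" 72 1996 5 85 1 1
91 "math" 72 1996 5 85 1 2
60 "math" 72 1996 3 70 1 3
end

* Build a lookup with one row per school/subject/units/year.
preserve
keep school_id subject exam_units year exam_id grade
drop if missing(exam_units)
duplicates drop
* Assumption: one exam per school/subject/units/year.
* -isid- stops with an error if that does not hold.
isid school_id subject exam_units year
rename (exam_id grade) (prev_exam_id prev_grade)
replace year = year + 1      // year t-1 now lines up with year t
tempfile prev
save `prev'
restore

* Attach last year's exam and class grade to every student-level observation.
merge m:1 school_id subject exam_units year using `prev', keep(master match) nogenerate

* Request 1: a new exam that follows a different exam with the same subject and units.
gen byte same_units = new_exam == 1 & !missing(prev_exam_id) & prev_exam_id != exam_id

* Request 2 (assumed meaning): use last year's class grade where same_units is 1.
* A new variable is created so that the original grade is left untouched.
gen float grade_adj = cond(same_units, prev_grade, grade)

list school_id year exam_id exam_units new_exam same_units grade grade_adj, sepby(year)
```

    If the grades should instead be swapped in both years, or the older grade overwritten with the newer one, only the last -gen- line would need to change.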
