  • "Copying" data inside a variable depending on two other variables

    Dear Statalist-Members,

    I'm currently working on a fairly simple forecast of the number of students who are within the prescribed period of study, separated by course of studies. The course ID serves as the identifier.
    I have data from the winter semester 2015/2016 (in the data: 20152) to the winter semester 2021/22 (in the data: 20212). To forecast the number of students in the summer semester 2022 for courses with a prescribed period of study of 6 semesters,
    I add up the numbers of students who began studying in each of the last 6 semesters per course (winter semester 2021/22 + summer semester 2021 + winter semester 2020/2021 + ... you get the idea) and multiply the sum by an "attrition rate". This works just fine!
    To forecast the number for the following semester, the winter semester 2022/23, I need to replace the missing data of the summer semester 2022 with the data of the summer semester 2021, and this is where I have been stuck for days.
    How can I "copy" the "number of students", depending on the course ID and the semester, into the number of students (or a copy of that variable) for a different semester, by course_ID?
    Phrased differently: how can I tell Stata "Hey Stata, please copy "number of students" of "semester" = 20211 by "course_ID" and write the numbers into "number of students" for "semester" = 20222 by "course_ID""?
    The only way I found was to copy the necessary data into Excel, change the semester through find-and-replace, save the result as a new dataset and merge the two datasets. There has to be a better way!

    This is what the data look like (the numbers are made up for data privacy reasons):
    course_ID   semester   number_of_students
    123         20152      374
    246         20152      324
    123         20161      224
    246         20161      577
    ...         ...        ...
    123         20212      233
    246         20212      455
    123         20221      223
    246         20221      445
    123         20222
    246         20222
    ...         ...
    ...         ...
    Thank you all in advance,
    best regards

    Jaqueline
    Last edited by Jaqueline Brossart; 08 Jun 2022, 09:07.

  • #2
    Welcome to Statalist.

    In future, please follow the FAQ (http://www.statalist.org/forums/help) and use the dataex command to provide sample data in code form (rather than as a table), so that other users can recreate the data and work on it right away.
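
    As a rough illustration of that advice, here is a minimal sketch of how dataex might be called. The four rows are a hypothetical miniature built from the values in the question's table; the variable names follow the question. dataex ships with recent Stata releases; on older installations it may first need to be installed from SSC, which is an assumption about your setup.

```stata
* Hypothetical miniature of the poster's data, for illustration only
clear
input course_ID semester number_of_students
123 20221 223
246 20221 445
123 20222 .
246 20222 .
end

* Print the first four observations as copy-and-paste-ready input code
dataex course_ID semester number_of_students in 1/4
```

    The output of dataex can be pasted into a post between code delimiters, and other users can then run it to recreate the example data.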

    I hope this works. The first chunk only creates the data set; feel free to ignore it and jump to the line starting with "bysort":

    Code:
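    * Example data (6-digit semester codes here, unlike the 5-digit codes in the question)
    clear
    input course_id semester nos
    123 202201 555
    246 202201 444
    123 202202 .
    246 202202 .
    end
    
    * Within each course, sorted by semester, fill 202202 from the directly preceding 202201 row
    bysort course_id (semester): replace nos = nos[_n-1] if semester == 202202 & semester[_n-1] == 202201
    
    list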
    Results:

    Code:
         +---------------------------+
         | course~d   semester   nos |
         |---------------------------|
      1. |      123     202201   555 |
      2. |      123     202202   555 |
      3. |      246     202201   444 |
      4. |      246     202202   444 |
         +---------------------------+
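
    The approach above copies from the row directly before the target semester. If the source semester is not adjacent (the question asks for 20211 to be copied into 20222), one possible variant is sketched below. It is only a sketch under stated assumptions: the 5-digit semester codes and variable names follow the question, the values for semester 20211 (310 and 520) are made up, and the locals source and target and the helper variable src are hypothetical names introduced for illustration.

```stata
* Hypothetical example data; the 20211 values are invented for illustration
clear
input course_ID semester number_of_students
123 20211 310
246 20211 520
123 20212 233
246 20212 455
123 20221 223
246 20221 445
123 20222 .
246 20222 .
end

* Source and target semesters (assumed codes, taken from the question)
local source 20211
local target 20222

* Per course, pick out the source-semester value and spread it to all rows
bysort course_ID: egen src = max(cond(semester == `source', number_of_students, .))

* Fill the target semester only where the value is still missing
replace number_of_students = src if semester == `target' & missing(number_of_students)
drop src

list, sepby(course_ID)
```

    Because cond() returns missing for every row outside the source semester and max() ignores missing values, src ends up holding the source-semester value on every row of a course. A course without a source-semester row keeps its missing value in the target semester.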
    Last edited by Ken Chui; 08 Jun 2022, 09:16.

    • #3
      Hi Ken,

      thank you so much for your quick and precise answer, it works out fine!
