  • Using a loop to assign a value from a variable to a new variable using a condition

    Hi all,

    I have the following data set on courses and games being played in them:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1(courseida courseidb) float(users_playing_both sumspecific_course1 sumspecific_course2 sumspecific_course3 sumspecific_course4 sumspecific_course5)
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    "1" "2"  0 331 126 3535 3521 434
    "1" "3"  0 331 126 3535 3521 434
    "1" "4"  0 331 126 3535 3521 434
    "1" "5"  2 331 126 3535 3521 434
    "2" "3"  2 331 126 3535 3521 434
    "2" "4"  2 331 126 3535 3521 434
    "2" "5"  0 331 126 3535 3521 434
    "3" "4" 30 331 126 3535 3521 434
    "3" "5"  1 331 126 3535 3521 434
    "4" "5"  0 331 126 3535 3521 434
    end
    The sumspecific_courseX variables hold the total number of users playing in each of the 5 courses in the data set, while the users_playing_both variable holds the number of users who have played in both course a and course b. courseida and courseidb cover all the possible unique pairs of courses. My goal is to use the sumspecific_courseX variables to create a data set that looks like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1(courseida courseidb) float(users_playing_both users_playing_course_a users_playing_course_b sumspecific_course1 sumspecific_course2 sumspecific_course3 sumspecific_course4 sumspecific_course5)
    "1" "2"  0  331  126 331 126 3535 3521 434
    "1" "3"  0  331 3535 331 126 3535 3521 434
    "1" "4"  0  331 3521 331 126 3535 3521 434
    "1" "5"  2  331  434 331 126 3535 3521 434
    "2" "3"  2  126 3535 331 126 3535 3521 434
    "2" "4"  2  126 3521 331 126 3535 3521 434
    "2" "5"  0  126  434 331 126 3535 3521 434
    "3" "4" 30 3535 3521 331 126 3535 3521 434
    "3" "5"  1 3535  434 331 126 3535 3521 434
    "4" "5"  0 3521  434 331 126 3535 3521 434
    "1" "2"  0  331  126 331 126 3535 3521 434
    "1" "3"  0  331 3535 331 126 3535 3521 434
    "1" "4"  0  331 3521 331 126 3535 3521 434
    "1" "5"  2  331  434 331 126 3535 3521 434
    end
    I would appreciate any advice on how to create a loop that achieves this since I am trying to run the code on a data set with more than 5 courses.

    Thank you in advance.
    Last edited by Asteris Dougalis; 28 Aug 2022, 07:58.

  • #2
    I don't understand the data you have, nor what you want to do with it.

    The example starting data you show consists of 10 copies of the first ten observations. What's that about?

    The data you show as your goal is just the exact same ten observations, followed by an extra copy of each of the first 4. What's that about?

    There doesn't seem to be any assigning of any value of anything to anything else going on. The mystery is why you have so many copies of exactly the same data to start with and why you want to keep a single copy of some of them but a second copy of a few others selected in some less than obvious way.


    • #3
      Hi Clyde. Apologies for the lack of clarity.

      The data I currently have (first example) starts with the pairings of courses. Each number represents a courseid. There are 5 unique courseids (1-5), and the "courseida" and "courseidb" variables represent the possible combinations of courseids. "users_playing_both" is the number of users who have played in both "courseida" and "courseidb". For example, no users have played in both course 1 and course 2 (as seen in the first row), while 30 users have played in both course 3 and course 4 (as seen in the 8th row).

      The remaining variables show the total number of unique users who have played each course. "sumspecific_course1" refers to the number of users who have played course 1, "sumspecific_course2" refers to the number of users who have played course 2, and so on. My goal is to use the "sumspecific_courseX" variables to create two new variables: "number_of_users_who_played_course_a" and "number_of_users_who_played_course_b". So in the first row, where "courseida" is 1 and "courseidb" is 2, I want "number_of_users_who_played_course_a" to be equal to "sumspecific_course1" and "number_of_users_who_played_course_b" to be equal to "sumspecific_course2".

      The ultimate goal of this exercise is to create "markets" of courses by seeing how often users who play a certain course also play another course in the same "market". This would naturally be done using GIS (and I have not included some relevant data here, like longitude and latitude), but the purpose of my question is to help develop the data set that will later be analyzed using GIS. I hope my explanation has been helpful.


      • #4
        OK. That's a bit clearer. I still don't understand why you have so many duplicate observations in the data, and to simplify the coding, I'm going to remove them all first.

        Code:
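        * Drop the repeated copies of each course pair
        duplicates drop

        * The course ids are strings; make them numeric so they can be compared with _j
        destring courseid*, replace
        * One observation per pair and course; _j holds the course number
        reshape long sumspecific_course, i(courseida courseidb) j(_j)
        * Keep only the two courses that form the pair
        keep if inlist(_j, courseida, courseidb)
        gen number_played_course_a = sumspecific_course if courseida == _j
        gen number_played_course_b = sumspecific_course if courseidb == _j
        * Back to one observation per pair, keeping the non-missing value of each new variable
        collapse (firstnm) number_played_* (first) users_playing_both, by(courseida courseidb)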
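
        Since the original question asked for a loop, here is a minimal sketch of a loop-based alternative to the reshape approach above. It is only an illustration, not code posted in the thread. It assumes the course ids are numeric (for example after -destring-) and that a variable sumspecific_courseK exists for every course id K. The names users_playing_course_a and users_playing_course_b are taken from the goal data set in #1; the locals a_ids, b_ids, and ids are made up for this example.

        ```stata
        * Small self-contained example: the ten distinct course pairs from #1,
        * with the course ids entered as numbers rather than strings
        clear
        input byte(courseida courseidb) float(users_playing_both sumspecific_course1 sumspecific_course2 sumspecific_course3 sumspecific_course4 sumspecific_course5)
        1 2  0 331 126 3535 3521 434
        1 3  0 331 126 3535 3521 434
        1 4  0 331 126 3535 3521 434
        1 5  2 331 126 3535 3521 434
        2 3  2 331 126 3535 3521 434
        2 4  2 331 126 3535 3521 434
        2 5  0 331 126 3535 3521 434
        3 4 30 331 126 3535 3521 434
        3 5  1 331 126 3535 3521 434
        4 5  0 331 126 3535 3521 434
        end

        * Start with missing values and fill them in one course at a time
        gen users_playing_course_a = .
        gen users_playing_course_b = .

        * Collect every course id that appears on either side of a pair
        levelsof courseida, local(a_ids)
        levelsof courseidb, local(b_ids)
        local ids : list a_ids | b_ids

        * For course k, copy sumspecific_coursek into the matching rows
        * (assumes sumspecific_coursek exists for each k in the list)
        foreach k of local ids {
            replace users_playing_course_a = sumspecific_course`k' if courseida == `k'
            replace users_playing_course_b = sumspecific_course`k' if courseidb == `k'
        }

        list courseida courseidb users_playing_both users_playing_course_a users_playing_course_b, noobs
        ```

        Unlike the reshape and collapse route, this leaves the data set in its wide layout and keeps the sumspecific_course* variables, which may be closer to the goal data set shown in #1. It does not remove duplicate observations; run -duplicates drop- first if those are unwanted.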


        • #5
          Thank you! That worked.
