  • Creating a loop that loops around a set of values of a variable

    Hello all,

    I'm trying to create a loop that assigns a unique id (starting at 1, ending at N) to each consecutive set of tau values 4 through 6, so that all rows inside a set share the same id, regardless of the name in those rows. Example data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3 name byte(tau id)
    "AAA" 4 1
    "AAA" 5 1
    "AAA" 6 1
    "AAA" 4 2
    "AAA" 5 2
    "AAA" 6 2
    "AAA" 4 3
    "AAA" 5 3
    "AAA" 6 3
    "BBB" 4 4
    "BBB" 5 4
    "BBB" 6 4
    "CCC" 4 5
    "CCC" 5 5
    "CCC" 6 5
    "CCC" 4 6
    "CCC" 5 6
    "CCC" 6 6
    end

    and so on. The data are sorted as shown. The variable id here was assigned manually for the sake of the example; the real dataset has about 350,000 observations.

    I've tried
    Code:
    foreach
    Code:
    forvalues
    and
    Code:
    levels
    But none of them seems to cycle through each set of 4-6 independently; instead they assign the same ID number to all sets (in my case, 1). I've tried the methods in https://www.stata.com/support/faqs/d...-with-foreach/, but to no avail. I think I'm overlooking something rather easy. Could anyone point me in the right direction?

    Thanks in advance,

    Bas de Kruijf

  • #2
    I don't understand what you mean by "each set of 4 through 6," but I'm going to guess that you want a sub_id within id, which you could create with:
    Code:
    bysort id: gen int sub_id = _n

    • #3
      I'm not sure if I understand the problem entirely, but if you can exploit the sort order as in your data example (i.e. values of tau are always 4 5 6 4 5 6 etc), the solution could be as simple as:
      Code:
      gen wanted = sum(tau == 4)
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str3 name byte(tau id)
      "AAA" 4 1
      "AAA" 5 1
      "AAA" 6 1
      "AAA" 4 2
      "AAA" 5 2
      "AAA" 6 2
      "AAA" 4 3
      "AAA" 5 3
      "AAA" 6 3
      "BBB" 4 4
      "BBB" 5 4
      "BBB" 6 4
      "CCC" 4 5
      "CCC" 5 5
      "CCC" 6 5
      "CCC" 4 6
      "CCC" 5 6
      "CCC" 6 6
      end
      
      gen wanted = sum(tau == 4)
      list, sepby(id)
           +--------------------------+
           | name   tau   id   wanted |
           |--------------------------|
        1. |  AAA     4    1        1 |
        2. |  AAA     5    1        1 |
        3. |  AAA     6    1        1 |
           |--------------------------|
        4. |  AAA     4    2        2 |
        5. |  AAA     5    2        2 |
        6. |  AAA     6    2        2 |
           |--------------------------|
        7. |  AAA     4    3        3 |
        8. |  AAA     5    3        3 |
        9. |  AAA     6    3        3 |
           |--------------------------|
       10. |  BBB     4    4        4 |
       11. |  BBB     5    4        4 |
       12. |  BBB     6    4        4 |
           |--------------------------|
       13. |  CCC     4    5        5 |
       14. |  CCC     5    5        5 |
       15. |  CCC     6    5        5 |
           |--------------------------|
       16. |  CCC     4    6        6 |
       17. |  CCC     5    6        6 |
       18. |  CCC     6    6        6 |
           +--------------------------+
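The one-liner works because sum() returns a running sum: the expression tau == 4 is 1 on the first row of each set and 0 elsewhere, so the total steps up by one at every new set. As a hedged sketch only, if the first value of a set were not always 4, one possible variant is to start a new set whenever tau fails to increase from the previous row. The variable names newset and wanted2 are hypothetical and not from the thread, and the approach still assumes the data are in their original order.

```stata
* Sketch under assumptions: newset and wanted2 are illustrative names only
clear
input str3 name byte tau
"AAA" 4
"AAA" 5
"AAA" 6
"AAA" 4
"AAA" 5
"AAA" 6
"BBB" 4
"BBB" 5
"BBB" 6
end

* Flag the first row of each set: observation 1, or any row where
* tau does not increase relative to the previous row
gen byte newset = (_n == 1) | (tau <= tau[_n-1])

* Running sum of the flags gives one id per set
gen long wanted2 = sum(newset)
list, sepby(wanted2)
```

Because both versions depend on row order, it may be worth storing the original order first (for example with gen long obs = _n) so it can be restored after any later sort.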

      • #4
        Hi Mike,

        The variable tau repeats in cycles; in this example it cycles through 4, 5, 6. I want to assign a unique id to each cycle of tau, regardless of the other variables. The id in the example data was entered manually. I cannot use
        Code:
        bysort id
        simply because none of the observations have an ID.

        Thanks for the input.

        • #5
          Thanks Wouter! Works perfectly for this purpose. Much easier than building loops.

          Consider this thread solved.
