  • Create Dataset in Stata using a Loop

    I'm having trouble finding guidance on this. I want to create a dataset with a variable called "location" that contains the values 1, 3, 4, 6, 7 and another variable called "action" that contains the values 1, 2, 3, 4, 5. I want the dataset to contain every combination of these, but I do not want to enter them manually. Any ideas?

    location action
    1 1
    1 2
    1 3
    1 4
    1 5
    3 1
    3 2
    ...

  • #2
    Code:
    clear
    input x y
    1 1
    3 2
    4 3
    6 4
    7 5
    end
    fillin x y


    • #3
      Paul Dickman, I don't think this is what I want. My final dataset from the example I provided should have 25 observations, containing every possible combination of the values in the two variables. And I was hoping to do this in a loop instead of typing manually. I simplified the example I included: my actual data has closer to 100 values (non-consecutive) for variable 1, which would make it very cumbersome to type the way you did.


      • #4
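        Code:
        clear
        set obs 5
        gen location = .
        * Fill location with the non-consecutive values, one per observation
        local i 1
        foreach num of numlist 1 3 4 6 7 {
            replace location = `num' in `i'
            local ++i
        }
        * Make 5 copies of each observation, then number the copies within location
        expand 5
        bys location: gen action = _n
        Result: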

        Code:
        . l
        
             +-------------------+
             | location   action |
             |-------------------|
          1. |        1        1 |
          2. |        1        2 |
          3. |        1        3 |
          4. |        1        4 |
          5. |        1        5 |
             |-------------------|
          6. |        3        1 |
          7. |        3        2 |
          8. |        3        3 |
          9. |        3        4 |
         10. |        3        5 |
             |-------------------|
         11. |        4        1 |
         12. |        4        2 |
         13. |        4        3 |
         14. |        4        4 |
         15. |        4        5 |
             |-------------------|
         16. |        6        1 |
         17. |        6        2 |
         18. |        6        3 |
         19. |        6        4 |
         20. |        6        5 |
             |-------------------|
         21. |        7        1 |
         22. |        7        2 |
         23. |        7        3 |
         24. |        7        4 |
         25. |        7        5 |
             +-------------------+
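
        For anyone who wants the loop spelled out, here is a minimal sketch, not taken from the posts above, that builds the same 25 rows with two nested loops. The macro names locations, actions, n, and row are made up for illustration. It assumes both value lists fit in a local macro, and it does not rely on action being 1, 2, ..., 5.

        ```stata
        clear
        * Hypothetical macros holding the two value lists
        local locations 1 3 4 6 7
        local actions   1 2 3 4 5

        * One observation per location x action pair
        local nloc : word count `locations'
        local nact : word count `actions'
        local n = `nloc' * `nact'
        set obs `n'

        generate location = .
        generate action   = .

        * Walk through every pair and write it into the next free row
        local row = 0
        foreach loc of local locations {
            foreach act of local actions {
                local ++row
                quietly replace location = `loc' in `row'
                quietly replace action   = `act' in `row'
            }
        }
        ```

        With 100 locations this does 500 pairs of replace calls, so the expand approach in this post is likely faster; the nested loop is mainly useful when the action values are not simply 1 to n.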


        • #5
          Rebecca Ivester

          So from post #3 we learn that in fact you have a dataset containing 100 values for the variable location. The following example starts with 100 values of location and produces what you want, using fillin as presented in post #2. Note that the list commands print only the beginning and end of each dataset to save space in this post.
          Code:
          . keep location
          
          . generate action = _n in 1/5
          (95 missing values generated)
          
          . list if _n<13 | _n>97, separator(0)
          
               +-------------------+
               | location   action |
               |-------------------|
            1. |        1        1 |
            2. |        2        2 |
            3. |        5        3 |
            4. |        7        4 |
            5. |       10        5 |
            6. |       13        . |
            7. |       16        . |
            8. |       17        . |
            9. |       19        . |
           10. |       22        . |
           11. |       25        . |
           12. |       28        . |
           98. |      210        . |
           99. |      212        . |
          100. |      214        . |
               +-------------------+
          
          . fillin location action
          
          . drop _fillin
          
          . drop if action==.
          (100 observations deleted)
          
          . list if _n<13 | _n>487, sepby(location)
          
               +-------------------+
               | location   action |
               |-------------------|
            1. |        1        1 |
            2. |        1        2 |
            3. |        1        3 |
            4. |        1        4 |
            5. |        1        5 |
               |-------------------|
            6. |        2        1 |
            7. |        2        2 |
            8. |        2        3 |
            9. |        2        4 |
           10. |        2        5 |
               |-------------------|
           11. |        5        1 |
           12. |        5        2 |
               |-------------------|
          488. |      210        3 |
          489. |      210        4 |
          490. |      210        5 |
               |-------------------|
          491. |      212        1 |
          492. |      212        2 |
          493. |      212        3 |
          494. |      212        4 |
          495. |      212        5 |
               |-------------------|
          496. |      214        1 |
          497. |      214        2 |
          498. |      214        3 |
          499. |      214        4 |
          500. |      214        5 |
               +-------------------+
          
          .
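
          The log above starts from data already in memory, so here is a minimal do-file sketch of the same steps that can be run from scratch. The eight location values typed in below are only a stand-in for the real data. The sketch assumes each location appears once and that the dataset has at least as many observations as there are action values.

          ```stata
          clear
          * Stand-in for the real dataset: 8 non-consecutive location values
          input location
          1
          2
          5
          7
          10
          13
          16
          17
          end

          * Put the action values 1..5 in the first five rows; the rest stay missing
          generate action = _n in 1/5

          * fillin adds every location x action pairing, counting missing as a value
          fillin location action
          drop _fillin

          * Remove the pairings that involve the missing-action placeholder
          drop if missing(action)

          sort location action
          assert _N == 8 * 5
          ```

          The final assert stops the do-file with an error if the number of rows is not locations times actions, which is a cheap check before using the result.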


          • #6
            Originally posted by Rebecca Ivester
            Paul Dickman, I don't think this is what I want. My final dataset from the example I provided should have 25 observations, containing every possible combination of the values in the two variables. And I was hoping to do this in a loop instead of typing manually.
            When I run my code I get 25 observations, one for each combination of the two variables. I'm using Stata 16.1 Windows, but I find it difficult to imagine the code would give different results with other versions (assuming version 6 or later) or operating systems.

            I know you said you wanted to do this in a loop, but why not allow -fillin- to do the loop?

            Code:
            . clear
            
            . input x y
            
                         x          y
              1. 1 1
              2. 3 2
              3. 4 3
              4. 6 4
              5. 7 5
              6. end
            
            . fillin x y
            
            . list
            
                 +-----------------+
                 | x   y   _fillin |
                 |-----------------|
              1. | 1   1         0 |
              2. | 1   2         1 |
              3. | 1   3         1 |
              4. | 1   4         1 |
              5. | 1   5         1 |
                 |-----------------|
              6. | 3   1         1 |
              7. | 3   2         0 |
              8. | 3   3         1 |
              9. | 3   4         1 |
             10. | 3   5         1 |
                 |-----------------|
             11. | 4   1         1 |
             12. | 4   2         1 |
             13. | 4   3         0 |
             14. | 4   4         1 |
             15. | 4   5         1 |
                 |-----------------|
             16. | 6   1         1 |
             17. | 6   2         1 |
             18. | 6   3         1 |
             19. | 6   4         0 |
             20. | 6   5         1 |
                 |-----------------|
             21. | 7   1         1 |
             22. | 7   2         1 |
             23. | 7   3         1 |
             24. | 7   4         1 |
             25. | 7   5         0 |
                 +-----------------+
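
            A different technique, one that none of the posts above use, is -cross-, which pairs every observation in memory with every observation of a dataset on disk. A minimal sketch, assuming the two value lists are held as separate datasets; the tempfile name actions is arbitrary:

            ```stata
            * Save the action values as a temporary dataset
            clear
            input action
            1
            2
            3
            4
            5
            end
            tempfile actions
            save `actions'

            * Load the location values, then form all pairwise combinations
            clear
            input location
            1
            3
            4
            6
            7
            end
            cross using `actions'

            sort location action
            ```

            Unlike the fillin route, this leaves no _fillin variable to drop, and the two lists may have different lengths without any padding.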
