
  • -egen anymatch- or -egen tag- does not support -by-: is there an alternative command?

    Hello, I have data in which students were asked which tools they used in each class they took during the semester. I would like to calculate (1) the percentage of students using each tool, regardless of class; and (2) the number of distinct tools a given student used across all classes. The data are in long form, as shown below (a brief example). iid is the individual ID, classid is the class ID, and toolid is the order in which a tool was listed (the first tool used in the class, the second tool, and so on). tool records which of the six available tools the student chose; tools are coded 1-6.

    I tried to use -egen anymatch-. For question (1), I thought I could do the following:
    Code:
    bysort iid: egen tool1=anymatch(tool), values(1)
    My idea was to create an indicator variable, tool1, showing whether a student has used tool 1, but -egen anymatch- does not support -by-.
    For question (2), I thought I could do the following:
    Code:
    bysort iid: egen numtool=tag(tool)
    My idea was to count the distinct tools a given student used, but -egen tag- does not support -by- either. I googled but did not find a solution. Does anyone know how to use these commands by group, or an alternative way to answer the two questions above? Thank you so much!

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float iid byte(classid toolid tool)
    1 1 1 1
    1 1 2 3
    1 1 3 .
    1 1 4 .
    1 1 5 .
    1 2 1 2
    1 2 2 .
    1 2 3 .
    1 2 4 .
    1 2 5 .
    1 3 1 6
    1 3 2 1
    1 3 3 .
    1 3 4 .
    1 3 5 .
    2 1 1 1
    2 1 2 2
    2 1 3 .
    2 1 4 .
    2 1 5 .
    2 2 1 2
    2 2 2 .
    2 2 3 .
    2 2 4 .
    2 2 5 .
    2 3 1 5
    2 3 2 3
    2 3 3 .
    2 3 4 .
    2 3 5 .
    3 1 1 2
    3 1 2 1
    3 1 3 3
    3 1 4 .
    3 1 5 .
    3 2 1 2
    3 2 2 .
    3 2 3 .
    3 2 4 .
    3 2 5 .
    3 3 1 1
    3 3 2 .
    3 3 3 .
    3 3 4 .
    3 3 5 .
    4 1 1 3
    4 1 2 6
    4 1 3 .
    4 1 4 .
    4 1 5 .
    4 2 1 2
    4 2 2 .
    4 2 3 4
    4 2 4 .
    4 2 5 .
    4 3 1 1
    4 3 2 .
    4 3 3 .
    4 3 4 .
    4 3 5 .
    5 1 1 3
    5 1 2 2
    5 1 3 .
    5 1 4 .
    5 1 5 .
    5 2 1 4
    5 2 2 .
    5 2 3 5
    5 2 4 .
    5 2 5 .
    5 3 1 3
    5 3 2 .
    5 3 3 .
    5 3 4 .
    5 3 5 .
    6 1 1 3
    6 1 2 .
    6 1 3 .
    6 1 4 .
    6 1 5 .
    6 2 1 5
    6 2 2 .
    6 2 3 6
    6 2 4 .
    6 2 5 .
    6 3 1 1
    6 3 2 .
    6 3 3 .
    6 3 4 .
    6 3 5 .
    7 1 1 6
    7 1 2 .
    7 1 3 .
    7 1 4 .
    7 1 5 .
    7 2 1 6
    7 2 2 .
    7 2 3 .
    7 2 4 .
    7 2 5 .
    end

  • #2
    1) Percentage of students using each tool, taking tool 1 as an example:

    Code:
    . egen tool1 = total(tool==1), by(iid)
    
    . replace tool1 = 1 if tool1>1
    (30 real changes made)
    
    . egen studenttag = tag(iid)
    
    . summ tool1 if studenttag
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
           tool1 |          7    .7142857      .48795          0          1
    2) The number of tools a given student used:
    Code:
    . egen toolstud = tag(iid tool)
    
    . egen numtools = total(toolstud), by(iid)
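
    As a quick check of the second result, the sketch below rebuilds numtools on a trimmed subset of the -dataex- example (students 1, 5 and 7, with most all-missing rows dropped) and lists one row per student. It relies on the default behaviour of -egen, tag()-, which does not tag observations with missing values unless the missing option is given, so a missing tool is not counted as a tool.

    ```stata
    * Trimmed subset of the -dataex- example (students 1, 5 and 7 only).
    clear
    input float iid byte(classid toolid tool)
    1 1 1 1
    1 1 2 3
    1 1 3 .
    1 2 1 2
    1 3 1 6
    1 3 2 1
    5 1 1 3
    5 1 2 2
    5 2 1 4
    5 2 3 5
    5 3 1 3
    7 1 1 6
    7 1 2 .
    7 2 1 6
    end

    * Tag one observation per distinct student-tool pair (missing tools are not tagged).
    egen toolstud = tag(iid tool)

    * Sum the tags within student: the number of distinct tools used.
    egen numtools = total(toolstud), by(iid)

    * Tag one observation per student and show each student once.
    egen studenttag = tag(iid)
    list iid numtools if studenttag, noobs
    ```

    Here studenttag is used only to print each student once; numtools itself is constant within iid.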



    • #3
      As footnotes to the helpful post of @Joro Kolev:

      On problem 2) see also the longer discussion in https://www.stata-journal.com/articl...article=dm0042
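
      The same count can also be built by hand with -by:-, which may be easier to adapt. This is a minimal sketch on a trimmed subset of the example data (students 1, 5 and 7), not a quotation from the article; the variable names firsttool and numtools2 are made up for illustration.

      ```stata
      * Trimmed subset of the -dataex- example (students 1, 5 and 7 only).
      clear
      input float iid byte(classid toolid tool)
      1 1 1 1
      1 1 2 3
      1 1 3 .
      1 2 1 2
      1 3 1 6
      1 3 2 1
      5 1 1 3
      5 1 2 2
      5 2 1 4
      5 2 3 5
      5 3 1 3
      7 1 1 6
      7 1 2 .
      7 2 1 6
      end

      * Flag the first observation of each student-tool pair, skipping missing tools.
      bysort iid tool: gen byte firsttool = (_n == 1) & !missing(tool)

      * Sum the flags within student to get the number of distinct tools.
      egen numtools2 = total(firsttool), by(iid)
      ```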

      On problem 1) an alternative is

      Code:
      egen tool1 = max(tool == 1), by(iid)
      and a corresponding discussion can be found at https://www.stata.com/support/faqs/d...ble-recording/
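
      The max() idiom works because tool == 1 evaluates to 1 when true and 0 otherwise (including when tool is missing), so the maximum within iid is 1 if the student ever reported the tool. A possible extension to all six tools is sketched below on a trimmed subset of the example data (students 1, 5 and 7); the names used1-used6 and onestudent are made up for illustration.

      ```stata
      * Trimmed subset of the -dataex- example (students 1, 5 and 7 only).
      clear
      input float iid byte(classid toolid tool)
      1 1 1 1
      1 1 2 3
      1 1 3 .
      1 2 1 2
      1 3 1 6
      1 3 2 1
      5 1 1 3
      5 1 2 2
      5 2 1 4
      5 2 3 5
      5 3 1 3
      7 1 1 6
      7 1 2 .
      7 2 1 6
      end

      * One 0/1 indicator per tool: 1 if the student ever reported that tool.
      forvalues t = 1/6 {
          egen byte used`t' = max(tool == `t'), by(iid)
      }

      * Tag one observation per student so that each student counts once.
      egen byte onestudent = tag(iid)

      * The mean of each 0/1 indicator is the proportion of students using that tool.
      summarize used* if onestudent
      ```

      Multiplying each reported mean by 100 gives the percentage asked for in question (1).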








      • #4
        Originally posted by Joro Kolev
        Hi Joro: Thank you so much! It is very helpful!



        • #5
          Originally posted by Nick Cox

          Hi Nick: Thank you so much for providing the additional materials! Those are very helpful! Now I understand these questions much better.

