  • Type Mismatch Error with tabcount

    The data contain several unique classes "class_id" (byte, values 1-20) with students "student_id" (byte, values 1-30, 1-29, 1-24, etc.).

    Resp1 - Resp3 contain byte values of student_id.

    Here is a snapshot of the data.
    class_id student_id Resp1 Resp2 Resp3
    1 1 9 7 5
    1 2 14 4 22
    1 3 1 . 22
    1 28 30 1 1
    1 29 2 22 4
    1 30 9 . 18
    2 1 3 15 13
    2 2 17 25 1
    2 3 3 17 9
    I would like to count how often each student_id occurs across Resp1-Resp3 within each class_id.
    My goal in using tabcount is eventually to create a variable that holds, for each observation, the frequency with which that student_id occurs within its class_id.

    Essentially, I want what the command below does, but across all Resp variables, with student_ids that have zero occurrences included and missing values excluded from the display.
    bysort student_id (class_id): tab PN1, matcell(matrix`Class')
    svmat matrix`Class'


    My thought is that tabcount should be able to handle this. But when I run the following, I receive a type mismatch error, r(109). None of the variables is a string. Could it be that I'm misusing a macro? (I also don't quite understand what arguments c() and v() are supposed to take.)

    vl create Resp_list = (Resp1 Resp2 Resp3)
    foreach var in $Resp_list{
    di "Var is: `var'"
    bysort class_id (student_id): tabcount `var', c(`var') zero
    }


    Any help is appreciated!

  • #2
    tabcount is from SSC. You've tried to use a c() option, but that option expects a series of equalities or inequalities, and a variable name is neither. So, you have got tabcount confused. I am the author of tabcount and I am no clearer on why you are trying to use it.

    Please show what output you expect for

    Code:
    bysort class_id (student_id): tabcount Resp1, c(Resp1) zero


    It seems you want a series of tables, one for each class_id and each of Resp1 Resp2 Resp3. How many tables will that be, and what will you do with them?


    • #3
      Thanks for the response.

      I now understand that c() must take an inequality. I worked out the v() option instead.

      I would like to use tabcount to answer the question: within each class_id, how many times does each student_id occur across Resp1-Resp3?

      tabulate does not display frequencies for observations with 0 occurrences.

      So I did this (below), which I can tweak in a larger forvar loop across the Resp1-Resp3 variables. I'll probably use the matrix option to store the results or push the matrix to a variable.

      forvar in $Resp_list{
      foreach level in `class_id'{
      sum student_id if class_number==`l'
      local id_max=r(max)
      display `l'
      tabcount `var' if class_id==`l', v(1/`id_max') zero
      }
      }


      I've been able to create the desired output, more or less.


      • #4
        Sounds good. I don't know (or have forgotten) where forvar comes from. Perhaps that's a typo; there seem to be others in the code.

        I would rewrite this as


        Code:
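        * Assumption: the local macro class_id is never defined in the thread;
        * here it is filled with the distinct values of class_id.
        levelsof class_id, local(class_id)
        foreach var in Resp1 Resp2 Resp3 {

            foreach level in `class_id' {
                  display `level'
                  * meanonly suppresses the output but still leaves r(max) behind
                  summarize student_id if class_id == `level', meanonly
                  tabcount `var' if class_id == `level', v(1/`r(max)') zero
            }

        }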
        But it seems that your real code worked "more or less".
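
The goal stated in #1, a variable holding how often each student_id is named across Resp1-Resp3 within its class_id, could also be approached with official commands only. The following is a minimal sketch, not the method used in the thread. It assumes that class_id and student_id together identify observations uniquely. The names which, n_named and counts are made up for illustration, and the data are just the rows of the snapshot in #1, so the counts reflect only those rows.

```stata
* Sketch only: the snapshot from #1 stands in for the real data.
clear
input class_id student_id Resp1 Resp2 Resp3
1 1 9 7 5
1 2 14 4 22
1 3 1 . 22
1 28 30 1 1
1 29 2 22 4
1 30 9 . 18
2 1 3 15 13
2 2 17 25 1
2 3 3 17 9
end

* Build a table of counts in a copy of the data, then merge it back.
preserve
reshape long Resp, i(class_id student_id) j(which)  // one row per response
drop if missing(Resp)                               // missing responses do not count
contract class_id Resp, freq(n_named)               // frequency of each named id per class
rename Resp student_id                              // the named id becomes the merge key
tempfile counts
save `counts'
restore

* Ids that were never named get no match, so their count is set to zero.
merge 1:1 class_id student_id using `counts', keep(master match) nogenerate
replace n_named = 0 if missing(n_named)

sort class_id student_id
list class_id student_id n_named, sepby(class_id)
```

Because the counts end up in a variable rather than in displayed tables, no loop over classes or Resp variables is needed, and ids that are named but have no row of their own are dropped by keep(master match).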
