
  • The problem with count if syntax regarding encoded string variables

    From the dataset hospdd provided by STATA (https://www.stata-press.com/data/r17/hospdd.dta), I want to count the number of observations that satisfy two conditions: month = May and procedure = New.

    I tried several variants, but none of them worked:

    Code:
    . sort hospital month
    
    . count if month="May"& procedure = "New"
    =exp not allowed
    r(101);
    
    . count if month = "May"& procedure = "New"
    =exp not allowed
    r(101);
    
    . count if (month = "May"& procedure = "New")
    month="May" invalid name
    r(198);
    
    . count if (month == "May"& procedure = ="New")
    type mismatch
    r(109);
    
    . count if month == "May"& procedure = ="New"
    type mismatch
    r(109);
    
    . count if (month == "May") & (procedure =="New")
    type mismatch
    r(109);
    
    . count if month == "May" & procedure =="New"
    type mismatch
    r(109);
    
    . count if month == May & procedure == New
    May not found
    r(111);
    
    . count if month ==May & procedure ==New
    May not found
    r(111);
    Could you please help me to sort it out?

    In addition, STATA provides the following description of this dataset:

    Code:
    A health provider is interested in studying the effect of a new hospital admissions procedure on the satisfaction of patients. The provider has monthly data on patients from January to July. The new admissions procedure was implemented in April by hospitals that were under new management. Of the 46 hospitals in the study, 18 implemented the new procedure.
    Could you please help me with the code to list the 18 hospitals that implemented the new procedure?

  • #2
    With 262 posts on Statalist, you should by now know the difference between a string variable and a numeric variable with value labels. To test for equality, you need a double equals sign (==).

    Code:
    lab list
    Res.:

    Code:
    . lab list
    size:
               1 Low
               2 Medium
               3 High
               4 Very high
    mnth:
               1 January
               2 February
               3 March
               4 April
               5 May
               6 June
               7 July
    pol:
               0 Old
               1 New
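A quick way to check which kind of variable you are dealing with is sketched below. It assumes the same hospdd.dta file and uses only the standard describe, ds, and tabulate commands; it is an illustration, not part of the original reply.

```stata
use https://www.stata-press.com/data/r17/hospdd.dta, clear

* storage type and attached value label for the two variables
describe month procedure

* variables that are genuinely string (none expected in this dataset)
ds, has(type string)

* numeric variables that carry a value label
ds, has(vallabel)

* show the underlying numbers instead of the label text
tabulate month, nolabel
```

If a variable appears under has(vallabel) rather than has(type string), comparisons such as month == "May" will fail with a type mismatch.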



    • #3
      See https://www.statalist.org/forums/help#spelling

      The problem here is that the variables concerned are numeric with value labels. There is a syntax for invoking value labels, but I find it easiest to look at the value labels and do things like

      Code:
      count if month == 5 & procedure == 1
      Here is a fuller script to show the point.

      Code:
      . use https://www.stata-press.com/data/r17/hospdd.dta, clear
      (Artificial hospital admission procedure data)
      
      . d
      
      Contains data from https://www.stata-press.com/data/r17/hospdd.dta
       Observations:         7,368                  Artificial hospital admission procedure data
          Variables:             5                  7 Mar 2021 19:52
      --------------------------------------------------------------------------------------------------------------
      Variable      Storage   Display    Value
          name         type    format    label      Variable label
      --------------------------------------------------------------------------------------------------------------
      hospital        byte    %9.0g                 Hospital ID
      frequency       byte    %9.0g      size       Hospital visit frequency
      month           byte    %8.0g      mnth       Month
      procedure       byte    %9.0g      pol        Admission procedure
      satis           float   %9.0g                 Patient satisfaction score
      --------------------------------------------------------------------------------------------------------------
      Sorted by: hospital
      
      . tab month procedure, nola
      
                 |  Admission procedure
           Month |         0          1 |     Total
      -----------+----------------------+----------
               1 |     1,842          0 |     1,842 
               2 |       921          0 |       921 
               3 |       921          0 |       921 
               4 |       538        383 |       921 
               5 |       538        383 |       921 
               6 |       538        383 |       921 
               7 |       538        383 |       921 
      -----------+----------------------+----------
           Total |     5,836      1,532 |     7,368 
      
      . label list
      size:
                 1 Low
                 2 Medium
                 3 High
                 4 Very high
      mnth:
                 1 January
                 2 February
                 3 March
                 4 April
                 5 May
                 6 June
                 7 July
      pol:
                 0 Old
                 1 New
      
      . count if procedure == 1 & month == 5
        383
      
      . count if procedure =="New":pol & month == "May":mnth
        383
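A third route, not shown in the script above, is to copy the label text into new string variables with decode and then compare against quoted strings. This is only a sketch: month_str and procedure_str are names made up for the example, and for this task the numeric comparison is simpler.

```stata
use https://www.stata-press.com/data/r17/hospdd.dta, clear

* numeric comparison, as in the script above (reported there as 383)
count if month == 5 & procedure == 1

* decode writes each value's label text into a new string variable
* (month_str and procedure_str are hypothetical names)
decode month, generate(month_str)
decode procedure, generate(procedure_str)

* string comparison is now legal because both sides are strings
count if month_str == "May" & procedure_str == "New"
```

The second count should agree with the first, since each label maps to exactly one numeric value in mnth and pol.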

      Watch out that

      == is the operator to test for equality, not = or = =

      month == May works if and only if month and May are both numeric variables or both string variables (there are further exceptional cases, not relevant here)

      month == "May" works if and only if month is a string variable (or string scalar)
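The second question in #1, listing the hospitals that implemented the new procedure, can be approached along the following lines. This is a minimal sketch, assuming that a hospital counts as an implementer if procedure equals 1 in any of its observations; ever_new and first are names made up for the example.

```stata
use https://www.stata-press.com/data/r17/hospdd.dta, clear

* 1 for every observation of a hospital that ever has procedure == 1 (New)
* (ever_new is a hypothetical name)
egen byte ever_new = max(procedure), by(hospital)

* tag exactly one observation per hospital so each is counted once
* (first is a hypothetical name)
egen byte first = tag(hospital)

* number of implementing hospitals, then their IDs
count if first & ever_new
list hospital if first & ever_new, noobs

* shorter alternative: the distinct hospital IDs with procedure == 1
levelsof hospital if procedure == 1
```

If the data match the description quoted in #1, the count should be 18.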




      • #4
        Dear Nick Cox and Andrew Musau

        Thank you so much for your kind help. It is now clear to me that I made two mistakes: I misread the type of the variables, and I used a single equals sign (=) where == was needed.

        Nick Cox: yes, sorry, Stata not STATA
