
  • Deducing the mode of a categorical variable

    Hello,

    I am using panel data from Waves 1 to 3 of the UK Millennium Cohort Study and have been struggling to deduce the mode of a categorical variable. I have tried researching how to do this online, but it appears Stata does not have a direct mode command. I have made some progress with a workaround, but I am encountering the error message ‘type mismatch’. Does anyone have any suggestions?

    The categorical variable I am interested in is ‘frequency of alcohol consumption’, for which I have three observations per individual, i.e. drinking frequency for person X in 2001 (APALDR00), 2004 (BPALDR00), and 2006 (CPALDR00).

    My end goal is to generate a new variable equal to the mode of each individual’s frequency of drinking. For example, if individual X drank '1-2 times a month' in 2001, '2-3 times a week' in 2004, and '1-2 times a month' in 2006, I aim for the generated variable to read ‘1-2 times a month’. Alternatively, if they do not drink, the generated variable should read 'Never'.

    Moreover, if there is a missing value, or no mode available, then I would like the variable to read the most frequent entry. For instance, if person Y drank '1-2 times a month' in 2001 and '2-3 times a week' in 2004, with a missing value in 2006, I would like the variable to read ‘2-3 times a week’. Or, if person J drank '1-2 times a month' in 2001, 'every day' in 2004, and 'less than once a month' in 2006, then the generated variable should read 'less than once a month'.

    Please find attached the relevant data set, log file, and my Do-File.

    Thanks in advance!


  • #2
    Take a look at -help egen-, where you will find a mode(varname) function described. You might also need to use some other -egen- functions (e.g., max()) to handle the special situations you describe.
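
    Because the three waves sit in separate variables, and as far as I know official -egen- has no row-wise mode, mode() has to be applied by person after reshaping to long form (one row per person per wave). Below is a minimal sketch on made-up data, not the actual MCS file: the identifier id, the numeric codes, and the names drink1-drink3 and drink_mode are all hypothetical. The fallback to the most recent non-missing wave is only one reading of the two tie examples in post #1; use the minmode or maxmode option of mode() instead if a different rule is intended.

    ```stata
    * Toy data in the same wide layout; id and the numeric codes are made up
    clear
    input id APALDR00 BPALDR00 CPALDR00
    1 3 2 3
    2 3 2 .
    3 3 1 4
    end

    * Give the wave variables a common stub so reshape can stack them
    rename (APALDR00 BPALDR00 CPALDR00) (drink1 drink2 drink3)
    reshape long drink, i(id) j(wave)

    * Mode within person; missings are ignored, ties yield missing (with a warning)
    bysort id: egen drink_mode = mode(drink)

    * Back to one row per person
    reshape wide drink, i(id) j(wave)

    * Assumed tie-break: fall back to the most recent non-missing wave
    replace drink_mode = cond(!missing(drink3), drink3, cond(!missing(drink2), drink2, drink1)) if missing(drink_mode)

    list
    ```

    If the wave variables carry a value label, it can be attached to the new variable afterwards with -label values- so that the result reads as text such as '1-2 times a month' rather than as a code.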

    For the future, I'd recommend against posting attachments here, per item 12.5 of the Statalist FAQ for new members.



    • #3
      A search in Stata for mode brings up many false positives but the most helpful code for you is the mode() function of egen.

      Note also the community-contributed commands modes (Stata Journal) and hsmode (SSC).


      Code:
      . clear
      
      . set obs 10
      number of observations (_N) was 0, now 10
      
      . gen whatever = "A"
      
      . replace whatever = "B" in 6/8
      (3 real changes made)
      
      . replace whatever = "C" in 9/10
      (2 real changes made)
      
      . tab whatever
      
         whatever |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                A |          5       50.00       50.00
                B |          3       30.00       80.00
                C |          2       20.00      100.00
      ------------+-----------------------------------
            Total |         10      100.00
      
      . modes whatever
      
      ----------------------
       whatever |      Freq.
      ----------+-----------
              A |          5
      ----------------------
      
      . hsmode whatever
      string variables not allowed in varlist;
      whatever is a string variable
      r(109);
      
      . egen mode = mode(whatever)
      
      . tab mode
      
             mode |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                A |         10      100.00      100.00
      ------------+-----------------------------------
            Total |         10      100.00
      .

      Please see FAQ Advice #12 on attachments (essentially only .png is encouraged).


      I am encountering the error message ‘type mismatch’.
      Perhaps the command you used is in your log file, but typically people won't read attachments.
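
      The command that produced 'type mismatch' is not shown in this thread, so the following is only a guess at one common cause: comparing a numeric variable that carries value labels with a string. The value label name freq, its codes, and the variable drink_str are invented for illustration.

      ```stata
      * Toy numeric variable with a made-up value label
      clear
      input id APALDR00
      1 3
      2 1
      end
      label define freq 1 "Every day" 3 "1-2 times a month"
      label values APALDR00 freq

      * Check storage types first: a labelled variable is still numeric
      describe APALDR00

      * This line would stop with "type mismatch", r(109):
      * count if APALDR00 == "Every day"

      * Compare against the numeric code, or look the code up from the label
      count if APALDR00 == 1
      count if APALDR00 == "Every day":freq

      * Or make a string copy and compare strings with strings
      decode APALDR00, generate(drink_str)
      count if drink_str == "Every day"
      ```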
      Last edited by Nick Cox; 13 Jan 2023, 10:15.



      • #4
        Will do, thanks both

