  • How to find mode of a variable based on certain conditions?

    Hi,

    Please consider the following example data:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 id float year str3 pcd float(school flag1 yearschool)
    "111" 2011 "AAC" 123 1  2
    "111" 2012 "AAA" 124 1  4
    "111" 2013 "AAA" 124 1  5
    "112" 2010 "AAA" 123 1  1
    "112" 2011 "AAA" 123 1  2
    "112" 2012 "AAB" 123 1  3
    "113" 2014 "AAB" 126 1  7
    "113" 2015 "AAB" 127 1  8
    "113" 2016 "AAC" 128 1 10
    "113" 2017 "AAA" 128 1 12
    "115" 2011 "AAC" 123 0  2
    "115" 2012 "AAB" 124 0  4
    "115" 2013 "AAC" 128 0  6
    "116" 2016 "AAC" 127 0  9
    "116" 2017 "AAA" 127 0 11
    "117" 2011 "AAB" 123 0  2
    "117" 2015 "AAB" 127 0  8
    end
    
    local good if flag1==0
    local bad if flag1==1

    I want to find the mode of pcd subject to the following condition: for each id in "bad" and each of its values of yearschool, I want to find the observations in "good" that have the same value of yearschool, and then take the mode of pcd over those observations.

    So for each id in "bad", we will have a modal pcd based on the ids in "good".
    I have been iterating on this but haven't managed to combine the value of yearschool with each of the macros to solve this.

    Would appreciate any help, thanks!

  • #2
    The mode is just the most common value -- except that with a categorical variable (in this case string) ties are especially likely.

    The mode() function of egen supports string variables -- a detail that goes back at least to 1998 and to commands documented in Stata Technical Bulletin 50.
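
    As a minimal sketch of that route, using the example data from #1: mode() can be combined with by:, here over groups of yearschool and flag1. The variable name modepcd is made up for illustration. The minmode option is one way to settle ties (it takes the lowest value, which for a string variable should mean the first in sort order); without minmode, maxmode or nummode(), egen returns missing for any group that has more than one mode.

    ```stata
    clear
    input str3 id float year str3 pcd float(school flag1 yearschool)
    "111" 2011 "AAC" 123 1  2
    "111" 2012 "AAA" 124 1  4
    "111" 2013 "AAA" 124 1  5
    "112" 2010 "AAA" 123 1  1
    "112" 2011 "AAA" 123 1  2
    "112" 2012 "AAB" 123 1  3
    "113" 2014 "AAB" 126 1  7
    "113" 2015 "AAB" 127 1  8
    "113" 2016 "AAC" 128 1 10
    "113" 2017 "AAA" 128 1 12
    "115" 2011 "AAC" 123 0  2
    "115" 2012 "AAB" 124 0  4
    "115" 2013 "AAC" 128 0  6
    "116" 2016 "AAC" 127 0  9
    "116" 2017 "AAA" 127 0 11
    "117" 2011 "AAB" 123 0  2
    "117" 2015 "AAB" 127 0  8
    end

    * modal pcd within each yearschool-by-flag1 group;
    * minmode picks the lowest value when several values tie for the mode
    bysort yearschool flag1 : egen modepcd = mode(pcd), minmode

    * modepcd is an illustrative name, not part of the original data
    sort yearschool flag1 id
    list id yearschool flag1 pcd modepcd, sepby(yearschool) noobs
    ```

    This gives one mode per group, at the cost of hiding the ties.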

    It is perhaps as useful, or more so, to approach this from first principles. Calculating frequencies leads directly to calculating the maximum frequency, except that we also need to check whether the same maximum is shared by other values. That does happen in the data example, where some groups have tied modes.

    I can't see that local macros are needed here, although using them would not be a problem either.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 id float year str3 pcd float(school flag1 yearschool)
    "111" 2011 "AAC" 123 1  2
    "111" 2012 "AAA" 124 1  4
    "111" 2013 "AAA" 124 1  5
    "112" 2010 "AAA" 123 1  1
    "112" 2011 "AAA" 123 1  2
    "112" 2012 "AAB" 123 1  3
    "113" 2014 "AAB" 126 1  7
    "113" 2015 "AAB" 127 1  8
    "113" 2016 "AAC" 128 1 10
    "113" 2017 "AAA" 128 1 12
    "115" 2011 "AAC" 123 0  2
    "115" 2012 "AAB" 124 0  4
    "115" 2013 "AAC" 128 0  6
    "116" 2016 "AAC" 127 0  9
    "116" 2017 "AAA" 127 0 11
    "117" 2011 "AAB" 123 0  2
    "117" 2015 "AAB" 127 0  8
    end
    
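    * frequency of each pcd within each yearschool-by-flag1 group
    bysort yearschool flag1 pcd : gen freq = _N
    * sorting on freq within each group puts the highest frequency last
    bysort yearschool flag1 (freq) : gen modefreq = freq[_N]
    * every value that attains the maximum is a mode, so ties are kept
    gen mode = pcd if freq == modefreq
    * tag one observation per distinct combination so that each mode is listed once
    egen tag = tag(yearschool flag1 pcd)
    list yearschool flag1 mode modefreq if tag & mode != "" , sepby(yearschool flag1) noobs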
    
      +------------------------------------+
      | yearsc~l   flag1   mode   modefreq |
      |------------------------------------|
      |        1       1    AAA          1 |
      |------------------------------------|
      |        2       0    AAB          1 |
      |        2       0    AAC          1 |
      |------------------------------------|
      |        2       1    AAA          1 |
      |        2       1    AAC          1 |
      |------------------------------------|
      |        3       1    AAB          1 |
      |------------------------------------|
      |        4       0    AAB          1 |
      |------------------------------------|
      |        4       1    AAA          1 |
      |------------------------------------|
      |        5       1    AAA          1 |
      |------------------------------------|
      |        6       0    AAC          1 |
      |------------------------------------|
      |        7       1    AAB          1 |
      |------------------------------------|
      |        8       0    AAB          1 |
      |------------------------------------|
      |        8       1    AAB          1 |
      |------------------------------------|
      |        9       0    AAC          1 |
      |------------------------------------|
      |       10       1    AAC          1 |
      |------------------------------------|
      |       11       0    AAA          1 |
      |------------------------------------|
      |       12       1    AAA          1 |
      +------------------------------------+
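
    The listing shows modes separately for flag1 == 0 and flag1 == 1. If the aim in #1 is to attach to each "bad" observation the modal pcd among the "good" observations with the same yearschool, one possible continuation in the same first-principles spirit is sketched here. The names goodfreq, negfreq and goodmode are made up, and breaking ties by taking the first pcd in sort order is an arbitrary assumption, not something the question specifies.

    ```stata
    clear
    input str3 id float year str3 pcd float(school flag1 yearschool)
    "111" 2011 "AAC" 123 1  2
    "111" 2012 "AAA" 124 1  4
    "111" 2013 "AAA" 124 1  5
    "112" 2010 "AAA" 123 1  1
    "112" 2011 "AAA" 123 1  2
    "112" 2012 "AAB" 123 1  3
    "113" 2014 "AAB" 126 1  7
    "113" 2015 "AAB" 127 1  8
    "113" 2016 "AAC" 128 1 10
    "113" 2017 "AAA" 128 1 12
    "115" 2011 "AAC" 123 0  2
    "115" 2012 "AAB" 124 0  4
    "115" 2013 "AAC" 128 0  6
    "116" 2016 "AAC" 127 0  9
    "116" 2017 "AAA" 127 0 11
    "117" 2011 "AAB" 123 0  2
    "117" 2015 "AAB" 127 0  8
    end

    * how often each pcd occurs among "good" observations (flag1 == 0) in each yearschool
    bysort yearschool pcd : egen goodfreq = total(flag1 == 0)

    * negate so that the highest frequency sorts first; ties then fall to the first pcd
    gen negfreq = -goodfreq
    bysort yearschool (negfreq pcd) : gen goodmode = pcd[1] if goodfreq[1] > 0

    * show the result for the "bad" observations only
    sort id year
    list id year yearschool pcd goodmode if flag1 == 1, sepby(id) noobs
    ```

    Where no "good" observation shares the value of yearschool, goodmode is left empty.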
