Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Counts based on 2 variables

    I have tried this a number of ways and somehow not coming up with the solution - hoping you can help! I want to create the variable "child_ct" (in red) to count the number of children for whom a mother has completed surveys. A mother can complete multiple surveys per child, but I'm just looking for the count of children. I know I can do this if I reshape the data to wide format, but I'm wondering if there is a way to do it in long format?

    Here's a dummy data set:
    Unique_ID MotherName ChildName childsvy_num childsvy_ct child_ct
    1001 Cara Leo 1 2 2
    1001 Cara Leo 2 2 2
    1002 Cara Ezra 1 1 2
    1003 Teresa Jonah 1 1 1
    1004 Marie Molly 1 1 1
    1005 Genny Jeremy 1 1 3
    1006 Genny Kate 1 1 3
    1007 Genny George 1 2 3
    1007 Genny George 2 2 3
    Thanks in advance!
    Kathy

  • #2
    Indeed. This is a much discussed problem and here is one solution.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int unique_id str6(mothername childname) byte(childsvy_num childsvy_ct child_ct)
    1001 "Cara"   "Leo"    1 2 2
    1001 "Cara"   "Leo"    2 2 2
    1002 "Cara"   "Ezra"   1 1 2
    1003 "Teresa" "Jonah"  1 1 1
    1004 "Marie"  "Molly"  1 1 1
    1005 "Genny"  "Jeremy" 1 1 3
    1006 "Genny"  "Kate"   1 1 3
    1007 "Genny"  "George" 1 2 3
    1007 "Genny"  "George" 2 2 3
    end
    
    egen tag = tag(mothername childname)
    egen wanted = total(tag), by(mothername)
    
    list, sepby(mothername)
    
         +--------------------------------------------------------------------------------+
         | unique~d   mother~e   childn~e   childs~m   childs~t   child_ct   tag   wanted |
         |--------------------------------------------------------------------------------|
      1. |     1001       Cara        Leo          1          2          2     1        2 |
      2. |     1001       Cara        Leo          2          2          2     0        2 |
      3. |     1002       Cara       Ezra          1          1          2     1        2 |
         |--------------------------------------------------------------------------------|
      4. |     1003     Teresa      Jonah          1          1          1     1        1 |
         |--------------------------------------------------------------------------------|
      5. |     1004      Marie      Molly          1          1          1     1        1 |
         |--------------------------------------------------------------------------------|
      6. |     1005      Genny     Jeremy          1          1          3     1        3 |
      7. |     1006      Genny       Kate          1          1          3     1        3 |
      8. |     1007      Genny     George          1          2          3     1        3 |
      9. |     1007      Genny     George          2          2          3     0        3 |
         +--------------------------------------------------------------------------------+
    For more discussion, see https://www.stata-journal.com/sjpdf....iclenum=dm0042 especially p.563

    Comment


    • #3
      Perfection! Thank you so much for the lesson - it was the "by(mothername)" piece that I was missing. Cheers.

      Comment

      Working...
      X