Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to add up values for same ID in a column with duplicates

    Hello,

    I am relatively new to using STATA and I am having trouble with adding up data in a column when there are duplicates. For instance, I want to add up all the values for HHID 1013000201 (200000 +50000 + 30000). Thank you for any advice!



    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 HHID double total_asset
    "1013000201"      200000
    "1013000201"       50000
    "1013000201"       30000
    "1013000204"      150000
    "1013000204"       80000
    "1013000204"       20000
    "1013000204"       80000
    "1013000204"       10000
    "1013000204"       50000
    "1013000204"       60000
    "1013000204"      300000
    "1013000204"       70000
    "1013000206"     2500000
    "1013000206"     4500000
    "1013000206"      340000
    "1013000206"      150000
    "1013000206"      180000
    "1013000206"     1500000
    "1013000206"       30000
    "1013000206"      150000
    "1013000206"       70000
    "1013000206"      250000
    "1013000210"      300000
    "1013000210"     6000000
    "1013000210"       10000
    "1013000210"      500000
    "1013000210"       40000
    "1013000210"     2000000
    "1013000210"       40000
    "1013000210"      850000
    "1013000213"       70000
    "1013000213"       30000
    "101300021302"     15000
    "101300021302"     15000
    "101300021302"     60000
    "101300021302"     30000
    "1021000102"    90000000
    "1021000102"     6000000
    "1021000102"      150000
    "1021000102"      800000
    "1021000102"     1000000
    "1021000102"     1000000
    "1021000102"    38000000
    "1021000102"     2300000
    "1021000102"     2000000
    "1021000102"     1200000
    "1021000102"      900000
    "1021000102"      900000
    "1021000108"    50000000
    "1021000108"    50000000
    "1021000108"      350000
    "1021000108"      400000
    "1021000108"      250000
    "1021000108"      150000
    "1021000108"     6000000
    "1021000108"       30000
    "1021000108"      300000
    "1021000109"    15000000
    "1021000109"     4000000
    "1021000109"      120000
    "1021000109"       20000
    "1021000109"      190000
    "1021000109"       40000
    "1021000109"        7000
    "1021000109"      200000
    "1021000109"       80000
    "1021000109"      100000
    "1021000110"    20000000
    "1021000110"      500000
    "1021000110"       25000
    "1021000110"      300000
    "1021000110"       10000
    "1021000110"       40000
    "1021000110"      320000
    "1021000111"     5000000
    "1021000111"    17000000
    "1021000111"    65000000
    "1021000111"      785000
    "1021000111"      220000
    "1021000111"      200000
    "1021000111"       25000
    "1021000111"       90000
    "1021000111"      190000
    "1021000111"      800000
    "1021000111"      100000
    "1021000113"    15000000
    "1021000113"    55000000
    "1021000113"   1.700e+08
    "1021000113"     1000000
    "1021000113"       15000
    "1021000113"       70000
    "1021000113"      100000
    "1021000113"       40000
    "1021000113"      198000
    "1021000113"      350000
    "1021000201"     2000000
    "1021000201"       30000
    "1021000201"       15000
    "1021000201"      200000
    "1021000201"      110000
    end
    ------------------ copy up to and including the previous line ------------------

    Listed 100 out of 16326 observations
    Use the count() option to list more



  • #2
    Try:
    Code:
    bysort HHID: egen hh_total=total(total_asset)
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1

    Comment


    • #3
      Hi Carole,

      Thank you! It worked! I am trying to find the average for each of the HHID variable. For instance, for HHID 1013000201, I want the average between 44, 17, 11, 9, ,6, 28.



      my code is

      bysort HHID: egen hh_age_average =mean(HHID)


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str12 HHID byte hh_age
      "1013000201"   44
      "1013000201"   17
      "1013000201"   11
      "1013000201"    9
      "1013000201"    6
      "1013000201"   28
      "1013000204"    0
      "1013000204"   32
      "1013000204"    6
      "1013000204"    2
      "1013000204"   34
      "1013000206"   29
      "1013000210"   38
      "1013000213"   31
      "101300021302"  5
      "101300021302" 28
      "101300021302"  0
      "101300021302" 35
      "1021000102"   19
      "1021000102"   39
      "1021000102"   19
      "1021000102"   37
      "1021000102"    2
      "1021000102"   10
      "1021000108"   70
      "1021000108"   23
      "1021000108"   25
      "1021000108"   17
      "1021000108"   59
      "1021000108"   20
      "1021000109"    1
      "1021000109"   19
      "1021000109"   16
      "1021000109"   12
      "1021000109"   22
      "1021000109"   39
      "1021000110"   23
      "1021000110"   53
      "1021000110"   19
      "1021000110"   25
      "1021000110"   44
      "1021000110"   21
      "1021000111"   15
      "1021000111"   47
      "1021000111"    5
      "1021000111"   16
      "1021000111"   15
      "1021000111"   25
      "1021000111"   13
      "1021000111"   17
      "1021000111"   18
      "1021000111"   20
      "1021000113"   17
      "1021000113"   12
      "1021000113"   32
      "1021000113"   47
      "1021000113"   43
      "1021000113"   10
      "1021000113"   20
      "1021000113"    4
      "1021000113"   14
      "1021000201"   38
      "1021000201"    9
      "1021000201"   14
      "1021000201"   41
      "1021000201"   16
      "1021000202"   43
      "1021000202"   14
      "1021000202"   20
      "1021000203"   12
      "1021000203"    7
      "1021000203"    2
      "1021000203"    4
      "1021000203"   37
      "1021000203"   58
      "1021000203"   14
      "1021000203"   20
      "1021000203"   27
      "1021000203"   17
      "102100020304" 24
      "102100020304"  3
      "102100020304" 27
      "102100020304"  6
      "1021000207"   41
      "1021000207"   17
      "1021000207"   13
      "1021000207"   38
      "1021000209"   59
      "1021000210"   35
      "1021000212"    4
      "1021000212"   15
      "1021000212"   48
      "1021000212"   17
      "1021000213"   20
      "1021000213"    3
      "1021000213"   21
      "1021000213"   25
      "1021000213"   50
      "1021000213"    5
      "1021000213"   25
      end

      Comment


      • #4
        You don’t want the mean of HHID, but the mean of hh_age:

        Code:
         bysort HHID: egen hh_age_average =mean(hh_age)
        Stata/MP 14.1 (64-bit x86-64)
        Revision 19 May 2016
        Win 8.1

        Comment


        • #5
          Try:
          Code:
          bysort HHID: egen hh_age_average =mean(hh_age)
          . li if HHID == "1013000201"
          
               +--------------------------------+
               |       HHID   hh_age   hh_age~e |
               |--------------------------------|
            1. | 1013000201       44   19.16667 |
            2. | 1013000201       17   19.16667 |
            3. | 1013000201       11   19.16667 |
            4. | 1013000201        9   19.16667 |
            5. | 1013000201        6   19.16667 |
               |--------------------------------|
            6. | 1013000201       28   19.16667 |
               +--------------------------------+
          Martyn

          Comment

          Working...
          X