Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Sum by ID in one data file

    Hello everyone,

    I have dataset like this,

    ID value1 value2 value3 ....
    1 10 20 30
    1 20 20 30
    1 40 20 30
    2 10 50 80
    2 30 50 80
    3 40 30 90
    3 50 30 90
    . . . .
    . . . .
    . . . .

    As you can see, only the value of var1 is different, and the values of var2 and var3 are all the same.
    I want to make my data file like this.

    ID value1 value2 value3 ....
    1 70 20 30
    2 40 50 80
    3 90 30 90

    value1 is the sum of value1 per ID, and value2 and value3 are fixed values.


    I tried to use the duplicates and the egen function, but it did not work and I have no idea anymore.
    Is there anyone you can give me a hint about?

    Thanks in advance.

    HS Jang from South Korea

  • #2
    Maybe something like the following.
    Code:
    bysort ID: replace value1 = sum(value1)
    by ID: keep if _n == _N
    There are alternatives that are probably safer, but from what you show, those two lines of code should give you what you're asking for.

    Comment


    • #3
      Welcome to the Stata Forum/ Statalist,

      I’m not with ‘my’ Stata at this moment, but you can use - egen - with the option - total -, then thr option - tag -, finally - keep if tag - command.
      Best regards,

      Marcos

      Comment


      • #4
        Thanks Joseph, it works! Perfect! Also thank you Almeida, it gave me some insights.

        Comment


        • #5
          Here is the alternative to Joseph's solution in the lines that Marcos proposes:

          Code:
          egen newvalue1 = total(value1), by(ID)
          
          egen tag = tag(ID)
          
          keep if tag

          Comment


          • #6
            Collapse works well for this.
            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input byte(id value1 value2 value3)
            1 10 20 30
            1 20 20 30
            1 40 20 30
            2 10 50 80
            2 30 50 80
            3 40 30 90
            3 50 30 90
            end
            collapse (sum) value1 (firstnm) value2 value3, by(id)
            list, clean noobs
            Code:
            . list, clean noobs
            
                id   value1   value2   value3  
                 1       70       20       30  
                 2       40       50       80  
                 3       90       30       90

            Comment


            • #7
              Thank you Joro and William. It gave me some insights!

              Comment

              Working...
              X