Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • summing across observations for given date


    Hello everyone, I have the following panel data set:
    id date x1 newvar
    1 t=1 4 7
    1 t=2 7 21
    1 t=3 3 16
    2 t=1 1 7
    2 t=2 8 21
    2 t=3 10 16
    3 t=1 2 7
    3 t=2 6 21
    3 t=3 3 16


    I would like to do create a new variable (newvar), which reflects the following:

    For every date I would like to sum all values of x1 of the corresponding individuals (id).

    I tried with loops but this would create a new variable for each single date with the respective value. I would prefer a one variable solution, due to the high number of dates in the sample.

    Could someone please give me a solution, how to create a variable which sums across observations for a given date? Thank you very much in advance.

    Kind regards, Katrin
    Last edited by Katrin Wagner; 12 Feb 2017, 16:11.

  • #2
    Thanks for posting a data example. Take a look at the FAQ, there's a package called dataex which makes it easier to for others to get your data example. Anyhow, this code should do it:

    Code:
    bys date: egen newvar = sum(x1)

    Comment


    • #3
      Katrin:
      welcome to the list.
      I'm not entirely clear with your request.
      Anyway, you ma want to try:
      Code:
      bysort date: egen newvar=total(x1)
      Kind regards,
      Carlo
      (Stata 19.0)

      Comment


      • #4
        Note that #2 and #3 are really the same answer. The egen function name sum() went undocumented in Stata 9, but the syntax still works. The function total() has the same role.

        Comment


        • #5
          Nick:
          now I see, thanks.
          I started out with Stata 9.2.
          Kind regards,
          Carlo
          (Stata 19.0)

          Comment


          • #6
            Dear all,

            Katrin said
            For every date I would like to sum all values of x1 of the corresponding individuals (id).
            Should the code be like this in this case?

            Code:
            bysort id date: egen newvar=total(x1)

            Comment


            • #7
              Originally posted by Dias Rafaj View Post
              Dear all,

              Katrin said

              Should the code be like this in this case?

              Code:
              bysort id date: egen newvar=total(x1)
              No, the question is to "sum across observations for given date", not to "sum across observations for given date and (each) id".
              Ho-Chuan (River) Huang
              Stata 19.0, MP(4)

              Comment


              • #8
                Dias: River is right. Also, try your syntax. You will see that you just get the original values returned.

                Comment


                • #9
                  Dear all,

                  thank you very much for the warm welcome and your advice.
                  The bys commands with worked quiet well with sum() as well as with total() so far.

                  While cross checking with my calculations in excel, I rather think that stata is not exactly calculating what I am looking for.
                  The question from the beginning (#1) was in the context of the reconstruction of a mathematical formula in stata.

                  Here is a illustration of parts of the formula, which is related to my question.
                  • (Kit -125.5) is represented by x1 in my previous question (#1)
                  • N is the number of all IDs
                  • and t represents the date in numbers (from date -224 to date +5)
                  Click image for larger version

Name:	Bildschirmfoto 2017-02-13 um 12.30.29.png
Views:	2
Size:	26.2 KB
ID:	1374053


                  There might be an easier way to code this doubled sum in stata. I tried to find a solution with separate sum commands, but the result is slightly different from what it should be.
                  Is there a possibility to adjust the bys command to cover both sums and to code this formula in a simple way?

                  Thank you very much in advance!

                  Kind Regards,
                  Katrin

                  PS: thanks a lot also for the hint with the package in the FAQ section - this is great, I will try to improve my data illustration!






                  Attached Files

                  Comment


                  • #10
                    It looks like the inner sum is just a mean of x1 which is then squared. The outer sum appears to be a sum over a rolling window of time. If that's correct, you can use rangestat (from SSC) to perform that part. Here's an example using fake data:

                    Code:
                    * create 100 firms, each with obs for 500 time periods
                    clear
                    set seed 24234
                    set obs 100
                    gen long id = _n
                    expand 500
                    bysort id: gen t = _n
                    gen x1 = runiform()
                    
                    * mean per time period (squared)
                    bysort t (id): egen double m = mean(x1)
                    replace m = m * m
                    
                    * sum over a rolling window of time
                    by t: keep if _n == 1
                    rangestat (sum) m (count) m, interval(t -244 5)
                    replace m_sum = . if m_count < 250

                    Comment


                    • #11
                      Thank you very much for your help, Robert. I will test your code and try the rangestat command!
                      You are right, the inner sum is the mean of x1, but only the mean of every id on one day.

                      Comment


                      • #12
                        Robert Picard I note you kindly responded to Katrin Wagner with the syntax for coding the greek sigma -summation for variables. Am struggling to code a similar kind of summation for my variables. Please help out if you may. I attach my data using dataex

                        Code:
                        * Example generated by -dataex-. For more info, type help dataex
                        clear
                        input int(date investor_id) str6 qdate byte(K Dnkt Dnkt_1 Dmkt1) double(meanRawbar meanRawbart1) byte(Nkt Nkt_1)
                        22159 108 "2020q3" 4 1 0 0       .8      .75 2  2
                        22097 108 "2020q3" 4 1 1 1       .8      .75 1  1
                        22159  19 "2020q3" 4 1 1 1       .8      .75 2  2
                        22097  15 "2020q3" 4 0 1 1       .8      .75 1  4
                        22159  15 "2020q3" 4 1 1 1       .8      .75 1  3
                        21934  19 "2020q1" 6 0 1 1       .6 .5714286 5  7
                        21930  58 "2020q1" 6 0 1 1       .6 .5714286 5 18
                        21930 108 "2020q1" 6 1 1 1       .6 .5714286 4  7
                        21930  74 "2020q1" 6 0 0 0       .6 .5714286 4 14
                        21937 108 "2020q1" 6 1 0 0       .6 .5714286 1  1
                        21930 108 "2020q1" 6 0 1 1       .6 .5714286 2 11
                        21935  15 "2020q1" 6 1 1 1       .6 .5714286 4  7
                        21936  19 "2020q1" 6 1 0 0       .6 .5714286 1  9
                        21927 108 "2020q1" 6 0 1 1       .6 .5714286 5  7
                        21930  71 "2020q1" 6 0 1 1       .6 .5714286 5  7
                        21945  15 "2020q1" 6 1 1 1       .6 .5714286 5  7
                        21935  19 "2020q1" 6 1 0 0       .6 .5714286 2 11
                        21935 108 "2020q1" 6 1 1 1       .6 .5714286 2  3
                        21935  15 "2020q1" 6 1 0 0       .6 .5714286 2  3
                        21930  58 "2020q1" 6 1 0 0       .6 .5714286 4 11
                        17268  46 "2007q2" 8 1 1 1 .6470588     .625 3  4
                        17283  49 "2007q2" 8 0 . . .6470588     .625 1  .
                        17342  45 "2007q2" 8 1 0 0 .6470588     .625 2  6
                        17344 120 "2007q2" 8 1 . . .6470588     .625 1  .
                        17261  46 "2007q2" 8 1 . . .6470588     .625 5  .
                        17290 100 "2007q2" 8 0 . . .6470588     .625 5  .
                        17332  73 "2007q2" 8 1 . . .6470588     .625 5  .
                        17261 108 "2007q2" 8 1 0 0 .6470588     .625 2  6
                        17295  14 "2007q2" 8 0 1 1 .6470588     .625 3  4
                        17309  87 "2007q2" 8 1 0 0 .6470588     .625 3  4
                        
                        end
                        format %tddd-Mon-YY date
                        I also attach the equation as am not able to post the image in this stata editor. Will be very grateful. Here is the image of equation 2.docx

                        Comment

                        Working...
                        X