Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Constructing of weighted average variable by id

    I have a panel data set with data from parent companies and their subsidiaries. A parent company and its subsidiaries have a shared IDnumber and a individual IDnumber.
    I want to contruct a variable that measures a weigthed average of the taxdifferences in the countries where the multinational operates:

    C.i = 1/(1-T.i) * (sum of each subsidiary (profit.k/(1-T.k) (T.i-T.k))/sum (profit.k/(1-T.k) (See the Photo)

    I believe it must be some kind of "if" command but i can't really figure it out. maybe something like the folowing:

    gen C = 1/(1-Corperatetax) * sum(profit/(1-corperatetax.i?) (corperatetax - corperatax if (?) , by shared_ID)) ??

    I really dont know how to make Work and hope someone can help me,

    Kind regards Marius

  • #2
    The photo is unreadable to me. Photos have one primary purpose here, to show a personal picture if you wish.

    For such problems, post data examples showing clearly your variable names, data structure. Show a worked example with simple numbers.

    http://www.statalist.org/forums/help#stata says all this and explains why and explains how to do better. Please have a careful look.

    Comment


    • #3
      To get the weighted average, you can use a series of gen and egen commands with the bysort prefix. There are ways to get the same with fewer lines, but this example shows you the steps. (I've created some data, and in this particular example, the weighted average is the same as the mean of price b/c the frequency is constant within groups.)


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int(price freq) float group
       4082 20 0
      10372 20 0
       5189 20 0
       4453 20 0
       5788 20 0
       7827 10 1
       4816 10 1
       3799 10 1
       4749 10 1
       4099 10 1
      end
      
      gen priceXfreq= price*freq
      bysort group: egen N= total(freq)
      bysort group: egen sum=total(priceXfreq)
      
      gen weighted_avg=sum / N






      Stata/MP 14.1 (64-bit x86-64)
      Revision 19 May 2016
      Win 8.1

      Comment


      • #4
        Carole gives excellent advice as usual. One variable there is dispensable, however:

        Code:
        bysort group: egen N = total(freq)
        by group: egen sum = total(price * freq)
        gen weighted_avg = sum / N

        In fact, with a small risk of misunderstanding you can do this:

        Code:
        bysort group: egen N = total(freq)
        by group: egen weighted_avg = total(price * freq)
        replace weighted_avg = weighted_avg / N

        Comment


        • #5
          Thank you very much for your answer. I understand that I need to specify exactly what I want:

          I have a dataset that looks like this:

          tax group ID turnover c
          0.40 0 1 4082 DK
          0.20 0 2 5189 GB
          0.30 0 3 5788 DE
          0.30 1 4 4816 DE
          0.40 1 5 4749 DK
          0.20 1 6 4099 GB

          now I need a variable that measures the incentive to shift profits for a given firm within groups.

          For ID=1 it should be equal to: 5189*(0.4-0.2)+5788*(0.4-0.3)=1616.6
          For ID=2 it should be equal to: 4082*(0.2-0.4)+5788*(0.2-0.3)=-1395.2

          Im thinking something like this. i being the specific firm. j being another firm in the group (I know the syntax with "turnover.j" is not working, but just to show you what i am thinking):

          bysort group: egen incentive = total(turnover.j*(tax.i-tax.j)

          how do I keep tax.i constant whilst summing over the group?

          Attached Files

          Comment


          • #6
            Thanks for the detail. This matches your examples. There may well be a better way to do it. This replaces the dreaded loop over observations with a loop over different members of each group.

            Discussion of technique at http://www.stata-journal.com/sjpdf.h...iclenum=dm0055
            http://www.stata-journal.com/article...article=dm0075

            Code:
            clear
            input tax group ID turnover str2 c
            0.40 0 1 4082 DK
            0.20 0 2 5189 GB
            0.30 0 3 5788 DE
            0.30 1 4 4816 DE
            0.40 1 5 4749 DK
            0.20 1 6 4099 GB
            end 
            bysort group (ID) : gen which = _n
            su which, meanonly
            
            gen wanted = .
            
            qui forval j = 1/`r(max)' {
               egen owntax = total(tax * (which == `j')), by(group)  
               egen work = total((owntax - tax) * turnover), by(group)
               replace wanted = work if which == `j'
               drop own* work
            }
            
            list, sepby(group)
            
                 +------------------------------------------------------+
                 | tax   group   ID   turnover    c   which      wanted |
                 |------------------------------------------------------|
              1. |  .4       0    1       4082   DK       1      1616.6 |
              2. |  .2       0    2       5189   GB       2     -1395.2 |
              3. |  .3       0    3       5788   DE       3    110.7001 |
                 |------------------------------------------------------|
              4. |  .3       1    4       4816   DE       1   -64.99993 |
              5. |  .4       1    5       4749   DK       2      1301.4 |
              6. |  .2       1    6       4099   GB       3     -1431.4 |
                 +------------------------------------------------------+
            Last edited by Nick Cox; 23 Mar 2016, 07:55.

            Comment


            • #7
              Thank you very much for this solution, it works perfectly. Can i condition this on a dummy?
              I want two new variables. The same calculation with observations where dummy=1 and dummy=0.
              wanted1 (dummy=1)is for example:

              For ID=1 : 5788*(0.4-0.3)=578,8

              wanted2 (dummy=0)is for example:

              For ID=1 : 5189*(0.4-0.2)+6342*(0.4-0.5)=403


              tax group ID turnover c Dummy
              .4 0 1 4082 GB 0
              .2 0 2 5189 DE 0
              .5 0 3 6342 SE 0
              .3 0 4 5788 DK 1
              .3 1 5 4816 GB 0
              .4 1 6 4749 DE 0
              .2 1 7 4099 DK 1

              Kind regards

              Comment


              • #8
                Surely. But as urged in #2 and exemplified by Carole and myself in answers, please read and act on the FAQ Advice to give us a data example using CODE delimiters that we can run straight off.

                Comment


                • #9
                  Absent an example, I am guessing this is the kind of thing you want.

                  Code:
                  clear
                  input tax group ID turnover str2 c dummy 
                  .4 0 1 4082 GB 0
                  .2 0 2 5189 DE 0
                  .5 0 3 6342 SE 0
                  .3 0 4 5788 DK 1
                  .3 1 5 4816 GB 0
                  .4 1 6 4749 DE 0
                  .2 1 7 4099 DK 1
                  end 
                  egen group2 = group(group dummy) 
                  bysort group2 (ID) : gen which = _n
                  su which, meanonly
                  
                  gen wanted = .
                  
                  qui forval j = 1/`r(max)' {
                     egen owntax = total(tax * (which == `j')), by(group2)  
                     egen work = total((owntax - tax) * turnover), by(group2)
                     replace wanted = work if which == `j'
                     drop own* work
                  }
                  
                  list, sepby(group2)
                  
                  
                       +----------------------------------------------------------------------+
                       | tax   group   ID   turnover    c   dummy   group2   which     wanted |
                       |----------------------------------------------------------------------|
                    1. |  .4       0    1       4082   GB       0        1       1   403.6001 |
                    2. |  .2       0    2       5189   DE       0        1       2      -2719 |
                    3. |  .5       0    3       6342   SE       0        1       3     1964.9 |
                       |----------------------------------------------------------------------|
                    4. |  .3       0    4       5788   DK       1        2       1          0 |
                       |----------------------------------------------------------------------|
                    5. |  .3       1    5       4816   GB       0        3       1     -474.9 |
                    6. |  .4       1    6       4749   DE       0        3       2      481.6 |
                       |----------------------------------------------------------------------|
                    7. |  .2       1    7       4099   DK       1        4       1          0 |
                       +----------------------------------------------------------------------+

                  Comment


                  • #10
                    exactly what i meant. thanking very much

                    Comment

                    Working...
                    X