Constructing of weighted average variable by id

Marius Ipsen

Join Date: Mar 2016

Posts: 9
#1

Constructing of weighted average variable by id

22 Mar 2016, 06:42

I have a panel data set with data from parent companies and their subsidiaries. A parent company and its subsidiaries have a shared IDnumber and a individual IDnumber.
I want to contruct a variable that measures a weigthed average of the taxdifferences in the countries where the multinational operates:

C.i = 1/(1-T.i) * (sum of each subsidiary (profit.k/(1-T.k) (T.i-T.k))/sum (profit.k/(1-T.k) (See the Photo)

I believe it must be some kind of "if" command but i can't really figure it out. maybe something like the folowing:

gen C = 1/(1-Corperatetax) * sum(profit/(1-corperatetax.i?) (corperatetax - corperatax if (?) , by shared_ID)) ??

I really dont know how to make Work and hope someone can help me,

Kind regards Marius

1 Photo
Tags: data, panel, panel data
Nick Cox

Join Date: Mar 2014

Posts: 35727
#2

22 Mar 2016, 07:05

The photo is unreadable to me. Photos have one primary purpose here, to show a personal picture if you wish.

For such problems, post data examples showing clearly your variable names, data structure. Show a worked example with simple numbers.

http://www.statalist.org/forums/help#stata says all this and explains why and explains how to do better. Please have a careful look.
Comment
Carole J. Wilson

Join Date: Jan 2015

Posts: 932
#3

22 Mar 2016, 07:12

To get the weighted average, you can use a series of gen and egen commands with the bysort prefix. There are ways to get the same with fewer lines, but this example shows you the steps. (I've created some data, and in this particular example, the weighted average is the same as the mean of price b/c the frequency is constant within groups.)

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input int(price freq) float group 4082 20 0 10372 20 0 5189 20 0 4453 20 0 5788 20 0 7827 10 1 4816 10 1 3799 10 1 4749 10 1 4099 10 1 end gen priceXfreq= price*freq bysort group: egen N= total(freq) bysort group: egen sum=total(priceXfreq) gen weighted_avg=sum / N

Stata/MP 14.1 (64-bit x86-64)
Revision 19 May 2016
Win 8.1
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35727
#4

22 Mar 2016, 09:54

Carole gives excellent advice as usual. One variable there is dispensable, however:

Code:

bysort group: egen N = total(freq) by group: egen sum = total(price * freq) gen weighted_avg = sum / N

In fact, with a small risk of misunderstanding you can do this:

Code:

bysort group: egen N = total(freq) by group: egen weighted_avg = total(price * freq) replace weighted_avg = weighted_avg / N
1 like
Comment
Marius Ipsen

Join Date: Mar 2016

Posts: 9
#5

23 Mar 2016, 06:51

Thank you very much for your answer. I understand that I need to specify exactly what I want:

I have a dataset that looks like this:

tax group ID turnover c
0.40 0 1 4082 DK
0.20 0 2 5189 GB
0.30 0 3 5788 DE
0.30 1 4 4816 DE
0.40 1 5 4749 DK
0.20 1 6 4099 GB

now I need a variable that measures the incentive to shift profits for a given firm within groups.

For ID=1 it should be equal to: 5189*(0.4-0.2)+5788*(0.4-0.3)=1616.6
For ID=2 it should be equal to: 4082*(0.2-0.4)+5788*(0.2-0.3)=-1395.2

Im thinking something like this. i being the specific firm. j being another firm in the group (I know the syntax with "turnover.j" is not working, but just to show you what i am thinking):

bysort group: egen incentive = total(turnover.j*(tax.i-tax.j)

how do I keep tax.i constant whilst summing over the group?

Attached Files

example.dta (3.9 KB, 1 view)
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35727

23 Mar 2016, 07:35

Thanks for the detail. This matches your examples. There may well be a better way to do it. This replaces the dreaded loop over observations with a loop over different members of each group.

Discussion of technique at http://www.stata-journal.com/sjpdf.h...iclenum=dm0055
http://www.stata-journal.com/article...article=dm0075

Code:

clear
input tax group ID turnover str2 c
0.40 0 1 4082 DK
0.20 0 2 5189 GB
0.30 0 3 5788 DE
0.30 1 4 4816 DE
0.40 1 5 4749 DK
0.20 1 6 4099 GB
end 
bysort group (ID) : gen which = _n
su which, meanonly

gen wanted = .

qui forval j = 1/`r(max)' {
   egen owntax = total(tax * (which == `j')), by(group)  
   egen work = total((owntax - tax) * turnover), by(group)
   replace wanted = work if which == `j'
   drop own* work
}

list, sepby(group)

     +------------------------------------------------------+
     | tax   group   ID   turnover    c   which      wanted |
     |------------------------------------------------------|
  1. |  .4       0    1       4082   DK       1      1616.6 |
  2. |  .2       0    2       5189   GB       2     -1395.2 |
  3. |  .3       0    3       5788   DE       3    110.7001 |
     |------------------------------------------------------|
  4. |  .3       1    4       4816   DE       1   -64.99993 |
  5. |  .4       1    5       4749   DK       2      1301.4 |
  6. |  .2       1    6       4099   GB       3     -1431.4 |
     +------------------------------------------------------+

Last edited by Nick Cox; 23 Mar 2016, 07:55.

Comment

Marius Ipsen

Join Date: Mar 2016

Posts: 9
#7

02 Apr 2016, 03:41

Thank you very much for this solution, it works perfectly. Can i condition this on a dummy?
I want two new variables. The same calculation with observations where dummy=1 and dummy=0.
wanted1 (dummy=1)is for example:

For ID=1 : 5788*(0.4-0.3)=578,8

wanted2 (dummy=0)is for example:

For ID=1 : 5189*(0.4-0.2)+6342*(0.4-0.5)=403

tax group ID turnover c Dummy
.4 0 1 4082 GB 0
.2 0 2 5189 DE 0
.5 0 3 6342 SE 0
.3 0 4 5788 DK 1
.3 1 5 4816 GB 0
.4 1 6 4749 DE 0
.2 1 7 4099 DK 1

Kind regards
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35727
#8

02 Apr 2016, 04:01

Surely. But as urged in #2 and exemplified by Carole and myself in answers, please read and act on the FAQ Advice to give us a data example using CODE delimiters that we can run straight off.
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35727

02 Apr 2016, 07:17

Absent an example, I am guessing this is the kind of thing you want.

Code:

clear
input tax group ID turnover str2 c dummy 
.4 0 1 4082 GB 0
.2 0 2 5189 DE 0
.5 0 3 6342 SE 0
.3 0 4 5788 DK 1
.3 1 5 4816 GB 0
.4 1 6 4749 DE 0
.2 1 7 4099 DK 1
end 
egen group2 = group(group dummy) 
bysort group2 (ID) : gen which = _n
su which, meanonly

gen wanted = .

qui forval j = 1/`r(max)' {
   egen owntax = total(tax * (which == `j')), by(group2)  
   egen work = total((owntax - tax) * turnover), by(group2)
   replace wanted = work if which == `j'
   drop own* work
}

list, sepby(group2)


     +----------------------------------------------------------------------+
     | tax   group   ID   turnover    c   dummy   group2   which     wanted |
     |----------------------------------------------------------------------|
  1. |  .4       0    1       4082   GB       0        1       1   403.6001 |
  2. |  .2       0    2       5189   DE       0        1       2      -2719 |
  3. |  .5       0    3       6342   SE       0        1       3     1964.9 |
     |----------------------------------------------------------------------|
  4. |  .3       0    4       5788   DK       1        2       1          0 |
     |----------------------------------------------------------------------|
  5. |  .3       1    5       4816   GB       0        3       1     -474.9 |
  6. |  .4       1    6       4749   DE       0        3       2      481.6 |
     |----------------------------------------------------------------------|
  7. |  .2       1    7       4099   DK       1        4       1          0 |
     +----------------------------------------------------------------------+

Comment

Marius Ipsen

Join Date: Mar 2016

Posts: 9
#10

22 Apr 2016, 02:50

exactly what i meant. thanking very much
Comment

Announcement