summation of multiple variables with different subscripts

Christoph Zust

Join Date: Jul 2021

Posts: 3
#1

01 Jul 2021, 08:53

Hello everybody

This is my very first post, and I did my best to follow the advice on posting to Statalist. Please excuse any inaccuracies that remain.

I am currently trying to construct the following measure of import exposure between two periods for my BA thesis (based on Autor, Dorn & Hanson (2013): "The China Syndrome: Local Labor Market Effects of Import Competition in the United States"):

"L" denotes the number of employees (given in full-time equivalents)

"Delta(Imp)" denotes the difference in imports from China to Switzerland between two periods

Subscript "i" denotes a region; "j" an industry, "t" a period

--> for example L_ijt is the number of employees in region i, industry j at the beginning of period t. Following this logic, "Delta(Imp)" is on the industry level.
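
Written out, the measure implied by these definitions (and by the code further down) would be the following. This is a reconstruction under the assumption that it mirrors the exposure measure in Autor, Dorn & Hanson (2013); the label \Delta IPW_{it} is borrowed from that paper and does not appear in this post.

```latex
\Delta IPW_{it} = \sum_{j} \frac{L_{ijt}}{L_{jt}} \cdot \frac{\Delta Imp_{jt}}{L_{it}}
```

Each summand corresponds to one region-industry observation, and the sum over industries j yields one value per region i and period t.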

My data set looks as follows:

Code:

input float year int(comzone nace2) float(l_1 l_2 l_3) long tradeflow float d_trade05
2005 1 5812 502.5827 768.1605 275862.34 1 26727504
2005 1 2059 135.52043 4980.311 275862.34 0 0
2005 1 2593 73.693985 2244.5874 275862.34 0 3555463
2005 1 5811 507.7866 2068.2964 275862.34 0 -12758331
2005 1 1101 30.48767 397.5343 275862.34 1 53194
2005 1 1712 171.6538 1999.2538 275862.34 1 0
2005 1 2573 21.25865 8300.515 275862.34 0 10659572
2005 1 9004 764.4638 2234.022 275862.34 1 0
2005 1 1071 295.9666 8137.673 275862.34 1 0
2005 1 2894 9.184786 6182.562 275862.34 1 28994800
2005 1 3220 9.937362 485.2864 275862.34 0 98052
2005 1 5310 2974.012 34733.26 275862.34 0 151012
2005 1 1623 541.0589 27948.59 275862.34 0 21660954
2005 1 2822 292.46902 8554.805 275862.34 1 14217534
2005 1 2016 3.310062 3043.33 275862.34 1 12876245
end
label values tradeflow tradeflow
label def tradeflow 0 "Export", modify
label def tradeflow 1 "Import", modify

where:
comzone denotes regions (# = 106)

nace2 denotes a 4-digit industry code

l_1 is constructed in a way that it equals L_ij

l_2 equals L_jt

l_3 equals L_it

d_trade05 is Delta(Imp_jt)

So far I have tried this:

Code:

gen numerator = 0
levelsof nace2, local(nace2)
foreach l of local nace2 {
    replace numerator = (l_1 * d_trade05) / (l_2 * l_3) if nace2 == `l' & tradeflow == 1
}

To my understanding, this creates all the individual terms. My main problem now is how to sum them so that I get the final measure of trade exposure, which is at the regional level.

I have thought about continuing like that:

Code:

gen imp_exp05 = 0
forvalues i = 1/106 {
    replace imp_exp05 = total(numerator) if comzone == `i'
}

But -total- does not work with -replace-. Does anybody have an idea how to solve this problem? Is there perhaps an easier solution, or have I made a mistake in general?

Thank you very much for your help in advance.
Christoph
Clyde Schechter

Join Date: Apr 2014

Posts: 30122
#2

01 Jul 2021, 10:04

See -help egen- and look at the -total()- function. -total- works with -egen-, not with -gen- or -replace-. Moreover, you do not use it in a loop over values of region like that. It's a single command:

Code:

by comzone year, sort: egen imp_exp = total(numerator)
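
To see what that command does, here is a minimal sketch on made-up data (the four observations below are hypothetical and not taken from the data set in #1). It shows that -egen- with -total()- under -by- writes the group sum into every observation of the group:

```stata
* Hypothetical toy data: two regions, one year, two terms per region.
clear
input byte comzone int year float numerator
1 2005 1
1 2005 2
2 2005 5
2 2005 0.5
end

* Sum numerator within each comzone-year group; no loop over regions is needed.
by comzone year, sort: egen imp_exp = total(numerator)

* imp_exp is 3 in both rows of comzone 1 and 5.5 in both rows of comzone 2.
list, sepby(comzone)
```

Note that -total()- treats missing values of its argument as 0, so a group with missing terms still gets a sum rather than a missing value.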

I've deliberately named the variable imp_exp, rather than imp_exp05, because this will compute the result for every year.

(And, yes, in case you're wondering, the calculation of numerator itself could also have been simplified by the use of a -by- instead of a loop. But that's water over the dam now.)
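
As a rough sketch of one such simplification: because each term uses only values from its own observation, a single -generate- appears to be enough here, without either a loop or -by-. The three rows below are copied from the example data in #1; the use of -cond()- and of double storage are choices made for this illustration, not something taken from the original code.

```stata
* Three observations from the example data in #1 (tradeflow: 0 = Export, 1 = Import).
clear
input float year int(comzone nace2) float(l_1 l_2 l_3) long tradeflow float d_trade05
2005 1 5812 502.5827 768.1605 275862.34 1 26727504
2005 1 2059 135.52043 4980.311 275862.34 0 0
2005 1 1101 30.48767 397.5343 275862.34 1 53194
end

* One term per observation. Rows that are not imports (tradeflow != 1) get 0,
* which is what the original loop left in place for them.
gen double numerator = cond(tradeflow == 1, (l_1 * d_trade05) / (l_2 * l_3), 0)

list nace2 tradeflow numerator
```

One small difference from the loop: -levelsof- skips missing values of nace2, so an import row with a missing industry code would stay at 0 in the loop version but would get a computed term here.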

I recommend you read -help by-, and then click on the blue link near the top of that page and read the entire chapter on -by- in the PDF manuals that are installed with your Stata. -by- is a "bread and butter" command in Stata and a real workhorse. While there are limits to what it can apply to, when it can be used, it is simpler and much more efficient than the kind of looping you are trying to use. In fact, in Stata, whenever you feel the urge to write a loop, you should always stop and ask yourself "could I do this using -by- instead?"
Christoph Zust

Join Date: Jul 2021

Posts: 3
#3

01 Jul 2021, 12:50

Dear Clyde

Thank you very much for your answer! It all worked now. I was aware of the -by- command but I did not know that it can be used for a summation like this. Also, I was not sure whether I was on the right track with my approach to construct the measure for trade exposure in general. Therefore, another big thank you for taking the time and thinking through my question.

Best regards
Christoph