Conditional loop analysis with sums in panel data

Lutfi Ozturker

Join Date: Apr 2017
Posts: 40

Conditional loop analysis with sums in panel data

28 Jun 2022, 13:30

I have an unbalanced data of

company id (i)	country code (c)	year (y)	ratio-1 (r1)	ratio-2 (r2)	ratio-3 (r3)	ratio-4 (r4)	ratio-5 (r5)
1 ≤ i ≤ 1587	1 ≤ c ≤ 18	2005 ≤ y ≤ 2021	0 ≤ r1 ≤1	0 ≤ r2 ≤1	0 ≤ r3 ≤1	0 ≤ r4 ≤1	0 ≤ r5 ≤1

where 1587 companies from 18 different countries report 5 different financial ratios annually from 2005 to 2021.

As such, there are 26979 rows (1587 * 17) in the data (1587 companies and 17 years) in which each company is assigned to a single country in all years.

I want to generate a new variable (Xa,y) for each and every row. Since I couldn't properly post the formula for Xa,y I attached it in a pdf file Xa,y.pdf

Eventually, all companies should have a unique "Xa,y" for each year and 17 (number of years) different "Xa.y"s in total.

I appreciate your help with the code for Xa,y.

Best,

Lütfi

Tags: conditional summation, loop analysis, panel data

Ken Chui

Join Date: Aug 2014
Posts: 1063

29 Jun 2022, 07:53

This may be a viable approach. Please do double check.

Code:

clear
set obs 3
gen country = _n
expand 5
bysort country: gen company = _n
expand 17
bysort country company: gen year = 2004 + _n
foreach x in 1 2 3 4 5{
    gen r`x' = runiform()
}
save temp00, replace

*-------------------------------------------------------------------------------
* Reshape all the 5 index into a column
reshape long r, i(country company year) j(index)
save temp01, replace
* Rename a couple variables and save a copy
rename company company_merged
rename r r_merged
save temp02, replace

* Join using joinby to create all possible company pairs within country, within
* year, within index
use temp01
joinby country year index using temp02
* Delete cases for identical companies (because a != b)
drop if company == company_merged

* Generate the innermost part of the results
gen dr_sq = (r - r_merged)^2 / 5
* Sum up by index level (k)
collapse (sum) dr_sq, by(country company company_merged year)

* Create the secondary summation term
gen sum_root = 1 - sqrt(dr_sq)
* Sum up by company pairs
collapse (sum) sum_root, by(country company year)

* Create the total count of companies in each country each year
egen uniq_cou_com = tag(country company year)
egen total_com = total(uniq_cou_com), by(country year)

* Generate the x(a,y)
gen wanted = 1/(total_com - 1) * sum_root

* Save the results
keep country company year wanted
save temp03, replace

* Merge back to the original data
use temp00, clear
merge 1:1 country company year using temp03

* Clean up
erase temp01.dta
erase temp02.dta
erase temp03.dta

Comment

Lutfi Ozturker

Join Date: Apr 2017

Posts: 40
#3

30 Jun 2022, 06:46

Many thanks for your reply.
I tried several times but I always met with a memory problem that I couldn't solve.
Without diving into the capacity of my Stata and computer, would you please suggest a memory-friendly approach (e.g., using the "forval" function amongst others) to the question?
Anyway, your code above is appreciated a lot.
Comment
Lutfi Ozturker

Join Date: Apr 2017

Posts: 40
#4

30 Jun 2022, 17:31

In particular, when I type
"joinby country year index using temp02"
I receive the following error message:
"op. sys. refuses to provide memory r(909);"
Could there be a memory-friendly (alternative) code to derive the required new variable (Xa,y) Xa,y.pdf from this dataset?
Attached Files

Xa,y.pdf (212.8 KB, 3 views)
Comment
Lutfi Ozturker

Join Date: Apr 2017

Posts: 40
#5

01 Jul 2022, 11:37

Nevertheless, it worked well when I divided the data into different files.
Many thanks again, Ken.
Comment

Announcement

Conditional loop analysis with sums in panel data

Comment

Comment

Comment

Comment