Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Conditional loop analysis with sums in panel data

    I have an unbalanced data of

    company id
    (i)
    country code
    (c)
    year
    (y)
    ratio-1
    (r1)
    ratio-2
    (r2)
    ratio-3
    (r3)
    ratio-4
    (r4)
    ratio-5
    (r5)
    1 ≤ i ≤ 1587 1 ≤ c ≤ 18 2005 ≤ y ≤ 2021 0 ≤ r1 ≤1 0 ≤ r2 ≤1 0 ≤ r3 ≤1 0 ≤ r4 ≤1 0 ≤ r5 ≤1


    where 1587 companies from 18 different countries report 5 different financial ratios annually from 2005 to 2021.

    As such, there are 26979 rows (1587 * 17) in the data (1587 companies and 17 years) in which each company is assigned to a single country in all years.

    I want to generate a new variable (Xa,y) for each and every row. Since I couldn't properly post the formula for Xa,y I attached it in a pdf file Xa,y.pdf

    Eventually, all companies should have a unique "Xa,y" for each year and 17 (number of years) different "Xa.y"s in total.

    I appreciate your help with the code for Xa,y.

    Best,

    Lütfi

  • #2
    This may be a viable approach. Please do double check.

    Code:
    clear
    set obs 3
    gen country = _n
    expand 5
    bysort country: gen company = _n
    expand 17
    bysort country company: gen year = 2004 + _n
    foreach x in 1 2 3 4 5{
        gen r`x' = runiform()
    }
    save temp00, replace
    
    *-------------------------------------------------------------------------------
    * Reshape all the 5 index into a column
    reshape long r, i(country company year) j(index)
    save temp01, replace
    * Rename a couple variables and save a copy
    rename company company_merged
    rename r r_merged
    save temp02, replace
    
    * Join using joinby to create all possible company pairs within country, within
    * year, within index
    use temp01
    joinby country year index using temp02
    * Delete cases for identical companies (because a != b)
    drop if company == company_merged
    
    * Generate the innermost part of the results
    gen dr_sq = (r - r_merged)^2 / 5
    * Sum up by index level (k)
    collapse (sum) dr_sq, by(country company company_merged year)
    
    * Create the secondary summation term
    gen sum_root = 1 - sqrt(dr_sq)
    * Sum up by company pairs
    collapse (sum) sum_root, by(country company year)
    
    * Create the total count of companies in each country each year
    egen uniq_cou_com = tag(country company year)
    egen total_com = total(uniq_cou_com), by(country year)
    
    * Generate the x(a,y)
    gen wanted = 1/(total_com - 1) * sum_root
    
    * Save the results
    keep country company year wanted
    save temp03, replace
    
    * Merge back to the original data
    use temp00, clear
    merge 1:1 country company year using temp03
    
    * Clean up
    erase temp01.dta
    erase temp02.dta
    erase temp03.dta

    Comment


    • #3
      Many thanks for your reply.
      I tried several times but I always met with a memory problem that I couldn't solve.
      Without diving into the capacity of my Stata and computer, would you please suggest a memory-friendly approach (e.g., using the "forval" function amongst others) to the question?
      Anyway, your code above is appreciated a lot.

      Comment


      • #4
        In particular, when I type
        "joinby country year index using temp02"
        I receive the following error message:
        "op. sys. refuses to provide memory r(909);"
        Could there be a memory-friendly (alternative) code to derive the required new variable (Xa,y) Xa,y.pdf from this dataset?
        Attached Files

        Comment


        • #5
          Nevertheless, it worked well when I divided the data into different files.
          Many thanks again, Ken.

          Comment

          Working...
          X