Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating advanced variable in Stata (panel data)

    Hi all,
    I am working with a panel data set (Orbis by Bureau van Dijk) containing information on millions of firms (mostly data from financial statements such as profits, fixed assets, turnover etc.).
    Furthermore, the data contains information about the ultimate owner of each company, which allows me to group firms by their group ultimate owner (GUO). Each GOU owns potentially many different companies in different countries.
    I am looking to generate a variable involving summations of variables within the GUO. Specifically, I want to generate the following variable, where B_{k,t} denotes estimated profit of firm k at time t.
    Click image for larger version

Name:	c_i.PNG
Views:	1
Size:	13.6 KB
ID:	1485547


    The tax variable is the corporate tax rate facing firm i at time t.

    I have really been trying on my own, even generating the different parts individually. But the part regarding keeping tax rate i at time t constant while iterating over the tax rates in other countries at time t is causing me trouble. I am refering to the difference between the tax rates in the nominator.

    Thank you in advance. This might not be an easy task - at least I don't think so.

    Kind regards,
    Jonas

  • #2
    You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex. With this, someone might code this for you - it is the kind of challenge that attracts folks.

    The easiest way to do a summation dropping a specif value (like i) is to use egen to calculate the total and then subtract the current value. However, this doesn't work easily because you have tax it - tax kt. It might be that reshaping the data wide would make this easier.

    Comment


    • #3
      Thank you Phil, I'll try again. Below i have included a data example (only for only 2 ultimate owner groups, hence GUO_id 1 and 2). I have only included the variables of interest in the example.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int year float(company_id GUO_id trueprof_i tax_i)
      2003 25132 1  1808   .35
      2003 41371 1  3483 .3433
      2003 48784 1  1009 .3433
      2003 15393 1 11236 .3958
      2004 41371 1  3875 .3433
      2004 48784 1  1297 .3433
      2004 25132 1  1368   .35
      2004 15393 1  6259 .3829
      2005 15393 1  -122 .3831
      2005 18227 1  5298 .3831
      2005 48784 1   917 .3383
      2005 25132 1  1050   .35
      2005 41371 1  4182 .3383
      2006 25132 1   956   .35
      2006 48784 1   892 .3333
      2006 18227 1  6415 .3834
      2006 41371 1  5329 .3333
      2007 15393 1  -174 .3836
      2007 25132 1  1313  .325
      2007 41371 1  5773 .3333
      2007 18227 1  7267 .3836
      2007 48784 1   168 .3333
      2007 17446 1  3462 .3836
      2008 15393 1    69 .2951
      2008 17446 1  4279 .2951
      2008 48784 1   157 .3333
      2008 18227 1  8203 .2951
      2008 41371 1  3579 .3333
      2009 17446 1     . .2944
      2009 15389 1 30384 .2944
      2009 15441 1  2492 .2944
      2009 18227 1  7434 .2944
      2009 41371 1  4171 .3333
      2009 48784 1    90 .3333
      2010 41371 1  3190 .3333
      2010 15456 1  -229 .2941
      2010 17446 1     . .2941
      2010 15393 1  1052 .2941
      2010 15441 1  2317 .2941
      2011 15393 1  1433 .2937
      2011 18227 1  8062 .2937
      2011 15441 1  2465 .2937
      2011 25132 1  1379    .3
      2011 41371 1  3315 .3333
      2011 15389 1 21005 .2937
      2011 17446 1     . .2937
      2012 15441 1  2530 .2948
      2012 15393 1   886 .2948
      2012 15389 1 22514 .2948
      2004 11234 2  9459   .28
      2006 87872 2   812   .19
      2007 11234 2 20749   .24
      2008 11234 2 18499   .21
      2009 11234 2 14703    .2
      2010 87872 2  2416   .19
      2010 11234 2 20633   .19
      2011 87872 2  2087   .19
      2011 11234 2 22310   .19
      2012 11234 2 22745   .19
      end
      format %ty year

      I am not experienced with Stata, so my code is likely not the smartest way. As I explained above, i have been trying to generate the individual parts and then creating the C-variable. This is a long proces as one GUO owns 684 firms located in different European countries in one specific year.

      Code:
      xtset company_id year, yearly
      
      forv k=1/684{
      qbys GUO_id year: gen num`k'_1 = (trueprof_i[`k']/(1-tax_i[`k']))*(tax_i-tax_i[`k'])
      }
      forv k=1/684{
      qbys GUO_id year: gen denom`k'_1 = trueprof_i[`k']/(1-tax_i[`k'])
      }
      egen numerator = rowtotal(num*)
      egen denominator = rowtotal(denom*)
      g C_var = (trueprof_i/(1-tax_i))*(numerator/denominator)
      Hopefully, someone can see if im doing something wrong (compared to the mathematical expression in my first post) and ideally help me optimize this (not so good looking) code!
      Last edited by Jonas Offerlin; 27 Feb 2019, 13:19.

      Comment

      Working...
      X