Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating Proportion with Positive and Negative Values

    I am working with yearly panel data on firms in several different countries. These firms include a mix of financial sector firms and many non-financial firms. Currently, I am trying to aggregate this firm-level data to the country-level to examine country-level trends. My ultimate goal is to create a ratio of the financial sector's profits as a percentage of all of the profits for a country year so I can compare how profitable countries' financial sectors are.

    This has proven a bit more difficult than I thought. I first created a dummy variable for financial sector firms (called "fin_sec"). I then used egen to sum all of the pre-tax income (pi) by year and country, used the same command to sum just financial sector income, and then divided the two. My code looked like this:

    egen all_profits = total(pi), missing by (year country)
    egen fin_profits = total(pi) if fin_sec==1, missing by (year country)
    gen fin_share_profits = fin_profits/all_profits

    This mostly seemed to work, but it created some strange outliers that point to problems. Basically, in years where the financial sector does uniformly bad (or good I suppose) things can get out of wack. Moreover, if net profits and financial profits are both negative, they create a positive ratio. Consider the case of the Netherlands in 2008 (the same year of the financial crisis). Net income for all firms summed (in millions) to -276 dollars. The financial sector, which suffered badly that year, summed -39586 making the financial share of profits 143 (!). This gives the impression the financial sector did extremely (i.e impossibly) well, but clearly this is not correct.

    So, my question is, with panel data, how do correctly sum up a variable with positive and negative values (income) and create a proportion?

    I've considered moving averages, but it doesn't address this more core problem as shown in the Dutch case. My only other thought was to take absolute values to sum everything up, but then I would lose the critical distinction of whether profits are positive or negative.

  • #2
    Just an aside about your code:
    Code:
    egen fin_profits = total(pi) if fin_sec==1, missing by (year country)
    does not do what I suspect you think it does. While it sums up only the financial sector profits within each year-country combination, it also only places the results in those observations where fin_sec == 1. You probably need to have this in all of your observations for subsequent analysis. To get the total of financial sector profits, but have it placed in every observation, the code would be:

    Code:
    egen fin_profits = total(cond(fin_sec == 1, pi, .)) , missing by (year country)
    Your question is really a substantive economics/econometrics/finance question, not a statistical one. It is beyond my expertise and I can't advise you on it. I hope somebody else will.

    Comment


    • #3
      Thank you Clyde. You're right about the code. I've created a separate country-level dta file and then just copy the aggregated values over. It's a bit of a pain, but it doesn't take too long. Thank you for the updated code though.

      Comment

      Working...
      X