How to create a variable equal to the kurtosis?

Franz Hopp

Join Date: Feb 2015
Posts: 42

08 Sep 2019, 13:12

Hello Statalist community,

I am struggling with the following: I would like to create a new variable, KURT5, equal to the kurtosis of the variable change_in_debt.
More specifically: I have a panel dataset with 20 years of data on about 1,000 firms. For each firm and year, KURT5 should be the kurtosis of change_in_debt over the current year and the 5 years prior. So KURT5 should be the kurtosis of 6 data points from one specific firm (six consecutive years).

The variable "change_in_debt" shows the percentage point difference from year to year in a company's debt level. For example, in the table below, firm Alpha increased its debt level by 5.0% in 2001.

An example:

Company	Year	change_in_debt
Alpha	2001	+5.0%
Alpha	2002	+5.1%
Alpha	2003	+5.0%
Alpha	2004	+5.2%
Alpha	2005	+5.1%
Alpha	2006	+5.2%
Beta	2001	+0.5%
Beta	2002	+27.9%
Beta	2003	+1.2%
Beta	2004	+76.3%
Beta	2005	+21.6%
Beta	2006	+1.6%

For each of the two firms above, I would like to calculate a kurtosis value:

For firm Alpha, I would like to compute the kurtosis of the 6 data points from 2001-2006.
Similarly, for firm Beta, I would like to compute the kurtosis of the 6 data points from 2001-2006.

By doing so, I hope to show that:

Firm Alpha changes its debt level at a constant (regular) pace: each year it changes by about 5.0%.
In contrast, firm Beta changes its debt level at an irregular pace: in some years there are large peaks (e.g. +76.3% in 2004), while in other years there is almost no change at all (e.g. +0.5% in 2001).

Thus:

The constant pace of change of firm Alpha means that it should have a relatively flat distribution. Thus, its kurtosis should be low.
In contrast, the large peaks and the periods of inactivity of firm Beta mean that it should have a relatively concentrated distribution. Thus, its kurtosis should be high.

Thank you so much in advance for any advice on how such a variable could be computed in Stata.

Franz

Last edited by Franz Hopp; 08 Sep 2019, 13:24.

Tags: kurtosis
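
For reference, a short note on what "kurtosis" means in this thread. To the best of my knowledge, the kurtosis reported by Stata's summarize, detail is the moment-based ratio below, where n is the number of observations in the window, x_i are the values of change_in_debt, and \bar{x} is their mean (the symbols are my notation, not taken from the posts):

```latex
m_r = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^{r},
\qquad
\text{kurtosis} = \frac{m_4}{m_2^{2}}
```

Under this definition a normal distribution has kurtosis 3, so the values are not "excess" kurtosis. With only 6 data points per window, one or two extreme years can dominate the result.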

Red Owl

Join Date: Nov 2016
Posts: 127

08 Sep 2019, 13:50

There is probably a more elegant way to do this, but this should work using Robert Picard's and Clyde Schechter's runby from SSC.

Code:

clear
input str5 company yr pctchg
  Alpha 2001 5.0
  Alpha 2002 5.1
  Alpha 2003 5.0
  Alpha 2004 5.2
  Alpha 2005 5.1
  Alpha 2006 5.2
  Beta 2001 0.5
  Beta 2002 27.9
  Beta 2003 1.2
  Beta 2004 76.3
  Beta 2005 21.6
  Beta 2006 1.6
end

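* Install/update Robert Picard's and Clyde Schechter's runby from SSC
ssc install runby, replace

* Placeholder variable that the program below fills in, one company at a time
cap drop kurt5
gen kurt5 = .

* capture avoids an error when kurtz has not been defined yet
cap prog drop kurtz
prog def kurtz
  summarize pctchg, detail
  replace kurt5 = r(kurtosis)
end

* Run kurtz separately on each company's observations
runby kurtz, by(company) verbose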

list

Red Owl
Stata/IC 16.0 (Windows 10, 64-bit)
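
For the whole-firm case (one kurtosis value per company, not a moving window), a shorter route may be the kurt() function of official Stata's egen, which needs no user-written commands. The sketch below reuses the example data and variable names from the post above; the new variable name kurt_all is just an illustration.

```stata
clear
input str5 company yr pctchg
  Alpha 2001 5.0
  Alpha 2002 5.1
  Alpha 2003 5.0
  Alpha 2004 5.2
  Alpha 2005 5.1
  Alpha 2006 5.2
  Beta 2001 0.5
  Beta 2002 27.9
  Beta 2003 1.2
  Beta 2004 76.3
  Beta 2005 21.6
  Beta 2006 1.6
end

* One kurtosis value per company, repeated on every row of that company
* (kurt_all is a hypothetical name chosen for this sketch)
egen kurt_all = kurt(pctchg), by(company)

list, sepby(company)
```

Like the runby version, this gives the same value for every year of a company, so it does not handle a 6-year window.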


Nick Cox

Join Date: Mar 2014

Posts: 35676
#3

08 Sep 2019, 14:54

rangestat (SSC) supports this directly.

Franz Hopp

Join Date: Feb 2015
Posts: 42

10 Sep 2019, 19:00

Red Owl,

Thank you so much for your help and for the code you provided. This has been *incredibly* helpful to me!

Yet I was wondering if one could add an additional element to your code: In the current version, the code computes one particular kurtosis value which is the same for all of the years of the company.

However, would it also be possible to compute the kurtosis from only the current year and the 5 years prior (i.e. from just those 6 data points)? For instance, as in the example in the table below:

For firm Alpha, the kurtosis value (KURT5) for the year 2006 would be computed from the 6 data points for the six years 2001-2006.
Similarly, for firm Alpha, the kurtosis value (KURT5) for the year 2007 would be computed from the 6 data points for the six years 2002-2007.
And finally, for firm Alpha, the kurtosis value (KURT5) for the year 2008 would be computed from the 6 data points for the six years 2003-2008.

For the years 2001-2005, for firm Alpha, there would be missing values for the kurtosis variable (KURT5).

So I was wondering whether the code could be adapted to this more refined approach.

For instance, for the example below (I added values for the years 2007 and 2008 for the two companies):

Company	Year	change_in_debt
Alpha	2001	+5.0%
Alpha	2002	+5.1%
Alpha	2003	+5.0%
Alpha	2004	+5.2%
Alpha	2005	+5.1%
Alpha	2006	+5.2%
Alpha	2007	+5.0%
Alpha	2008	+5.1%
Beta	2001	+0.5%
Beta	2002	+27.9%
Beta	2003	+1.2%
Beta	2004	+76.3%
Beta	2005	+21.6%
Beta	2006	+1.6%
Beta	2007	+89.8%
Beta	2008	+0.2%

Thank you so much in advance for any advice on how the Stata code could be updated!

Franz

Last edited by Franz Hopp; 10 Sep 2019, 19:12.


Nick Cox

Join Date: Mar 2014

Posts: 35676
#5

10 Sep 2019, 19:16

Once again: rangestat is designed for this kind of calculation.
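
The thread does not show the rangestat command that was eventually used, so the following is only a sketch of how the 6-year window could be computed, assuming the variable names from Red Owl's example (company, yr, pctchg) and the extended data from #4. The names kurt5 and n5 are illustrative. rangestat is a user-written command by Robert Picard, Nick Cox, and Roberto Ferrer and must be installed from SSC first.

```stata
* One-time install from SSC (uncomment if rangestat is not yet installed)
* ssc install rangestat

clear
input str5 company yr pctchg
  Alpha 2001 5.0
  Alpha 2002 5.1
  Alpha 2003 5.0
  Alpha 2004 5.2
  Alpha 2005 5.1
  Alpha 2006 5.2
  Alpha 2007 5.0
  Alpha 2008 5.1
  Beta 2001 0.5
  Beta 2002 27.9
  Beta 2003 1.2
  Beta 2004 76.3
  Beta 2005 21.6
  Beta 2006 1.6
  Beta 2007 89.8
  Beta 2008 0.2
end

* For each observation, use the same company's values from yr-5 through yr:
* kurt5 is the kurtosis, n5 the number of non-missing values in that window
rangestat (kurtosis) kurt5=pctchg (count) n5=pctchg, interval(yr -5 0) by(company)

* Keep the result only where all six yearly values are available
replace kurt5 = . if n5 < 6

list, sepby(company)
```

Because interval() works on the values of yr and not on row positions, gaps in the years shrink the window instead of reaching further back; the check on n5 then sets kurt5 to missing for such incomplete windows, as well as for 2001-2005.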
Franz Hopp

Join Date: Feb 2015

Posts: 42
#6

10 Sep 2019, 19:36

Nick -- thanks a lot, I think I got the command now -- thanks for pointing rangestat out again! Works well. Please excuse.