
  • How to create a variable equal to the kurtosis?

    Hello Statalist community,

    I am currently struggling to achieve the following: I would like to create a new variable (called KURT5) which is equal to the kurtosis.
    More specifically: I have a panel dataset with 20 years of data on ~1,000 firms. For each firm-year, KURT5 should equal the kurtosis of the variable change_in_debt over the current year and the 5 prior years of the same firm. So, KURT5 should be the kurtosis of 6 datapoints from one specific firm (six years from this specific company).

    The variable "change_in_debt" shows the year-to-year percentage point change in a company's debt level. For example, in the table below, firm Alpha increased its debt level by 5.0% in 2001.

    An example:
    Company Year change_in_debt
    Alpha 2001 +5.0%
    Alpha 2002 +5.1%
    Alpha 2003 +5.0%
    Alpha 2004 +5.2%
    Alpha 2005 +5.1%
    Alpha 2006 +5.2%
    Beta 2001 +0.5%
    Beta 2002 +27.9%
    Beta 2003 +1.2%
    Beta 2004 +76.3%
    Beta 2005 +21.6%
    Beta 2006 +1.6%
    For each of the two firms above, I would like to calculate a kurtosis value:
    • For firm Alpha, I would like to compute the kurtosis for the 6 datapoints from 2001-2006
    • Similarly, for firm Beta, I would like to compute the kurtosis for the 6 datapoints from 2001-2006
    By doing so, I hope to show that:
    • Firm Alpha changes its debt level at a constant (regular) pace: each year it changes it by ~5.0%
    • In contrast, firm Beta changes its debt level at an irregular pace: in some years there are high peaks (e.g. +76.3% in 2004), while in other years there is almost no change at all (e.g. +0.5% in 2001)
    Thus:
    • The constant pace of change of firm Alpha means that it should have a relatively flat distribution. Thus, its kurtosis should be low
    • In contrast, the large peaks and the periods of inactivity of firm Beta mean that it should have a relatively concentrated distribution. Thus, its kurtosis should be high
    Thank you so much in advance for any advice on how such a variable could be computed in Stata.

    Franz
    Last edited by Franz Hopp; 08 Sep 2019, 13:24.

  • #2
    There is probably a more elegant way to do this, but this should work using Robert Picard's and Clyde Schechter's runby.ado from SSC.
    Code:
    clear
    input str5 company yr pctchg
      Alpha 2001 5.0
      Alpha 2002 5.1
      Alpha 2003 5.0
      Alpha 2004 5.2
      Alpha 2005 5.1
      Alpha 2006 5.2
      Beta 2001 0.5
      Beta 2002 27.9
      Beta 2003 1.2
      Beta 2004 76.3
      Beta 2005 21.6
      Beta 2006 1.6
    end
    
    * Install/update Robert Picard's and Clyde Schechter's runby.ado from SSC
    ssc install runby, replace
    
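    * Placeholder for the result, filled in group by group
    cap drop kurt5
    gen kurt5 = .
    
    * capture avoids an error when kurtz has not been defined yet
    capture program drop kurtz
    program define kurtz
      * runby calls this once per company, with only that company's rows in memory
      summarize pctchg, detail
      replace kurt5 = r(kurtosis)
    end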
    
    runby kurtz, by(company) verbose
    
    list
    Red Owl
    Stata/IC 16.0 (Windows 10, 64-bit)



    • #3
      rangestat (SSC) supports this directly.
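
      A minimal sketch of what such a call could look like, reusing the variable names from #2 (company, yr, pctchg). The output name kurt5 and the window interval(yr -5 0), meaning the current year and the five years before it, are assumptions chosen to match the question, not something stated in this post.

      ```stata
      clear
      input str5 company yr pctchg
        Alpha 2001 5.0
        Alpha 2002 5.1
        Alpha 2003 5.0
        Alpha 2004 5.2
        Alpha 2005 5.1
        Alpha 2006 5.2
        Beta 2001 0.5
        Beta 2002 27.9
        Beta 2003 1.2
        Beta 2004 76.3
        Beta 2005 21.6
        Beta 2006 1.6
      end

      * Install/update rangestat from SSC
      ssc install rangestat, replace

      * For each observation, kurtosis of pctchg over the same company's
      * observations with yr in [yr - 5, yr]
      rangestat (kurtosis) kurt5 = pctchg, interval(yr -5 0) by(company)

      list, sepby(company)
      ```

      With this data, only the 2006 rows have a full six-year window; earlier rows are computed from whatever observations fall inside the interval.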



      • #4
        Red Owl,

        Thank you so much for your help and for the code you provided. This has been *incredibly* helpful to me!

        I was wondering, though, whether one more element could be added to your code: the current version computes a single kurtosis value per company, which is the same in every year for that company.

        Would it also be possible to have the code compute the kurtosis from only the current year and the 5 years prior (i.e. from those 6 datapoints only)? For instance, using the example in the table below:
        • For firm Alpha, the kurtosis value (KURT5) for the year 2006 would be computed for the 6 datapoints relating to the six years 2001-2006
        • Similarly, for firm Alpha, the kurtosis value (KURT5) for the year 2007 would be computed for the 6 datapoints relating to the six years 2002-2007
        • And finally, for firm Alpha, the kurtosis value (KURT5) for the year 2008 would be computed for the 6 datapoints relating to the six years 2003-2008
        For the years 2001-2005, for firm Alpha, there would be missing values for the kurtosis variable (KURT5).

        So I was wondering whether the code could be extended to handle this more refined approach?

        For instance, for the example below (I added values for the years 2007 and 2008 for the two companies):
        Company Year change_in_debt
        Alpha 2001 +5.0%
        Alpha 2002 +5.1%
        Alpha 2003 +5.0%
        Alpha 2004 +5.2%
        Alpha 2005 +5.1%
        Alpha 2006 +5.2%
        Alpha 2007 +5.0%
        Alpha 2008 +5.1%
        Beta 2001 +0.5%
        Beta 2002 +27.9%
        Beta 2003 +1.2%
        Beta 2004 +76.3%
        Beta 2005 +21.6%
        Beta 2006 +1.6%
        Beta 2007 +89.8%
        Beta 2008 +0.2%

        Thank you so much in advance for any advice on how the Stata code could be updated!

        Franz
        Last edited by Franz Hopp; 10 Sep 2019, 19:12.



        • #5
          Once again: rangestat is designed for this kind of calculation.
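
          As a hedged illustration of the request in #4, the sketch below uses the extended 2001-2008 example data and sets the result to missing whenever fewer than six years fall inside the window. The names kurt5 and n5 are hypothetical, and the input variable names follow #2.

          ```stata
          clear
          input str5 company yr pctchg
            Alpha 2001 5.0
            Alpha 2002 5.1
            Alpha 2003 5.0
            Alpha 2004 5.2
            Alpha 2005 5.1
            Alpha 2006 5.2
            Alpha 2007 5.0
            Alpha 2008 5.1
            Beta 2001 0.5
            Beta 2002 27.9
            Beta 2003 1.2
            Beta 2004 76.3
            Beta 2005 21.6
            Beta 2006 1.6
            Beta 2007 89.8
            Beta 2008 0.2
          end

          * Install/update rangestat from SSC
          ssc install rangestat, replace

          * Window = current year and the 5 years before it, within company;
          * n5 records how many non-missing values each window contains
          rangestat (count) n5 = pctchg (kurtosis) kurt5 = pctchg, ///
              interval(yr -5 0) by(company)

          * Keep the kurtosis only where the window holds all six years
          replace kurt5 = . if n5 < 6

          list, sepby(company)
          ```

          Here the rows for 2001-2005 end up missing for both firms, and 2006, 2007 and 2008 use the windows 2001-2006, 2002-2007 and 2003-2008 respectively. Because the interval is defined on the values of yr, a gap in a firm's years simply lowers n5 for the affected windows.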



          • #6
            Nick -- thanks a lot, I think I got the command now -- thanks for pointing rangestat out again! Works well. Please excuse.
