
  • Calculating extreme outliers (0.5 percentile and 99.5 percentile)

    I have time series data and am trying to identify the 0.5 percentile and 99.5 percentile of a specific variable for each year. In this case, I have ROA values for banks from 1992-2015. I tried centile, but its output differed from that of sum; the reason is explained here.

    For this case, the data is simple. Just ROA values for each bank (about 10,000 observations per year) and the corresponding year.

    I'm not sure whether pctile is appropriate for time series data, as I'd like to identify the 0.5 and 99.5 percentiles separately for each year.

    Any help would be greatly appreciated.



  • #2
    Extreme outliers and extreme percentiles are not one and the same.



    • #3
      I have not used it so far, but with regard to extreme values, I gather that extremes, a user-written program by Nick Cox, may be helpful to you. Since it accepts "by", you can probably get the results for each year, should you wish.
      Best regards,

      Marcos



      • #4
        With time series, if outliers exist, they are outliers in a time context, i.e. spikes that show little or no consistency with the values at neighbouring times.



        • #5
          The outstanding problem here is that the statement of the problem is unclear. Is there one ROA value per bank per year, for 10,000 banks, or are there 10,000 ROA values per bank per year? If the outliers are calculated across banks at a single point in time, the time-series nature of each individual bank's data is less important. (There is still much to be concerned about with the ultimate goal - the tails of the distribution do not necessarily constitute outliers, and winsorizing approaches to data analysis are not highly regarded.)
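
          As a rough illustration of the cross-sectional reading, that is, one ROA value per bank per year with cutoffs computed across banks within each year, here is a minimal sketch in Python rather than Stata. The function names (percentile, yearly_cutoffs, flag_tails) and the toy data are hypothetical, and the interpolation rule used is only one of several percentile definitions, so the cutoffs will not necessarily match what centile or summarize report. It only flags observations in the tails; whether those are outliers is a separate question.

```python
from collections import defaultdict


def percentile(sorted_vals, p):
    """Percentile p (0-100) by linear interpolation between closest ranks.

    This is one common definition among several; other software may use
    a different rule and return slightly different values.
    """
    if not sorted_vals:
        raise ValueError("no data")
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def yearly_cutoffs(rows, low=0.5, high=99.5):
    """Map each year to its (low, high) percentile cutoffs across banks."""
    by_year = defaultdict(list)
    for year, roa in rows:
        by_year[year].append(roa)
    cutoffs = {}
    for year, values in by_year.items():
        values.sort()
        cutoffs[year] = (percentile(values, low), percentile(values, high))
    return cutoffs


def flag_tails(rows, cutoffs):
    """Return (year, roa, in_tail) for every observation."""
    return [
        (year, roa, roa < cutoffs[year][0] or roa > cutoffs[year][1])
        for year, roa in rows
    ]


# Made-up data: 201 "banks" per year, not real ROA values.
rows = [(1992, float(i)) for i in range(201)]
rows += [(1993, float(2 * i)) for i in range(201)]

cutoffs = yearly_cutoffs(rows)
print(cutoffs[1992])  # (1.0, 199.0)
print(cutoffs[1993])  # (2.0, 398.0)
```

          Because the cutoffs are computed within each year, the time ordering of a given bank's observations plays no role here.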

          Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post, looking especially at sections 9-12 on how to best pose your question. It would be particularly helpful to post a small hand-made example, perhaps with just a few banks and a few years. In particular, please read FAQ #12 and use dataex and CODE delimiters when posting to Statalist.

          The more you help others understand your problem, the more likely others are to be able to help you solve your problem.
