  • Ranking/categorizing firms based on the value of a variable in a specific year

    Hi

    I am working with a panel data set of about 300 firms for the years 2008 to 2014, with several firm-level variables such as sales, cash flow, etc.

    I would like to rank firms by their sales in 2010 and then sort them into two categories (large and small) based on that value, i.e., if a firm's sales are in the top half/third/quartile in 2010, an indicator takes the value one (large), and zero otherwise (small).

    In short, I'm looking to identify large firms by looking at the top quartile/third/half of sales in 2010, and classify remaining firms as small.

    If you have any questions or something isn't clear, please ask. Thanks in advance,
    Paul

  • #2
    Here is a toy data set.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str32 company int year double sales
    "Cool"      2010 20490.637
    "Cool"      2011  20051.42
    "Cool"      2012 18936.417
    "Cool"      2013 17732.983
    "Cool"      2014 17644.755
    "Enigmatic" 2010  11052.81
    "Enigmatic" 2011 11928.681
    "Enigmatic" 2012 12487.099
    "Enigmatic" 2013  12774.29
    "Enigmatic" 2014 13170.071
    "Good"      2010 20327.829
    "Good"      2011 18464.601
    "Good"      2012  17173.71
    "Good"      2013  16742.04
    "Good"      2014 16984.502
    "Marvelous" 2010 15939.801
    "Marvelous" 2011 16113.851
    "Marvelous" 2012 16450.882
    "Marvelous" 2013 17005.342
    "Marvelous" 2014 18097.074
    end
    The following code will generate a rank variable, with rank 1 for the highest sales in each year.
    Code:
    bysort year: egen rank_sales = rank(- sales)
    Verify!
    Code:
    sort year rank_sales
    list
    Code:
         +-----------------------------------------+
         |   company   year       sales   rank_s~s |
         |-----------------------------------------|
      1. |      Cool   2010   20490.637          1 |
      2. |      Good   2010   20327.829          2 |
      3. | Marvelous   2010   15939.801          3 |
      4. | Enigmatic   2010    11052.81          4 |
      5. |      Cool   2011    20051.42          1 |
         |-----------------------------------------|
      6. |      Good   2011   18464.601          2 |
      7. | Marvelous   2011   16113.851          3 |
      8. | Enigmatic   2011   11928.681          4 |
      9. |      Cool   2012   18936.417          1 |
     10. |      Good   2012    17173.71          2 |
         |-----------------------------------------|
     11. | Marvelous   2012   16450.882          3 |
     12. | Enigmatic   2012   12487.099          4 |
     13. |      Cool   2013   17732.983          1 |
     14. | Marvelous   2013   17005.342          2 |
     15. |      Good   2013    16742.04          3 |
         |-----------------------------------------|
     16. | Enigmatic   2013    12774.29          4 |
     17. | Marvelous   2014   18097.074          1 |
     18. |      Cool   2014   17644.755          2 |
     19. |      Good   2014   16984.502          3 |
     20. | Enigmatic   2014   13170.071          4 |
         +-----------------------------------------+
    I hope this is what you wanted.
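
    As a possible variant, and only a sketch run on the toy data above: the field option of egen's rank() function assigns rank 1 to the highest value directly, so the minus sign is not needed, and tied values share the same rank instead of receiving averaged ranks. The variable name rank_field is made up for this illustration.

    ```stata
    * rank 1 = highest sales within each year; tied values share a rank
    bysort year: egen rank_field = rank(sales), field
    * show the result with a separator line between years
    sort year rank_field
    list company year sales rank_field, sepby(year)
    ```

    With no ties in the toy data, rank_field should match rank_sales exactly.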


    Please provide examples using dataex. Read the FAQs. https://www.statalist.org/forums/help
    12.2 What to say about your data



    We can understand your dataset only to the extent that you explain it clearly.

    The best way to explain it is to show an example. The community-contributed command dataex makes it easy to give simple example datasets in postings. It was written to support Statalist and its use is strongly recommended. Usually a copy of 20 or so observations from your dataset is enough to show your problem. See help dataex for details.
    Last edited by Budu Gulo; 11 Apr 2019, 12:46.



    • #3
      Thanks Budu.

      So that ranks the firms in order of sales.

      Then, to create a dummy for firms in, for example, the top quartile of sales (assuming the number of firms is 100), I would simply run:

      Code:
      gen topquartilesales = 0
      replace topquartilesales = 1 if rank_sales <= 25

      But is there a quick way to do this for just one specific year, e.g. 2010?

      That is, I am looking to categorise firms based on their sales rank in just one year (2010), as opposed to all years.

      Thanks again
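
      One way this could look, as a rough sketch that does not assume exactly 100 firms: rank on 2010 sales only and compare each rank with a quarter of the number of firms observed in 2010. The names rank2010, n2010 and top2010 are hypothetical, and the cutoff n2010/4 is just one possible reading of "top quartile".

      ```stata
      * rank firms on 2010 sales only; observations in other years get missing
      egen rank2010 = rank(sales) if year == 2010, field
      * number of firms with non-missing sales in 2010
      egen n2010 = count(sales) if year == 2010
      * 1 = top quarter of 2010 sales, 0 = otherwise, missing outside 2010
      gen byte top2010 = rank2010 <= n2010/4 if year == 2010 & !missing(sales)
      ```

      Replacing 4 with 3 or 2 would give the top third or top half instead.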



      • #4
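        I don't understand what you mean by "quick way", but the following code should give you what you want. It restricts the data to 2010, so any ranking or dummy created afterwards is based on that year only. Note that it discards the observations for all other years.
        Code:
        keep if year == 2010
        Alternatively, the equivalent:
        Code:
        drop if year != 2010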
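
        If the other years are still needed for later analysis, a sketch along the following lines might serve instead: classify firms on their 2010 sales and then copy that classification to every year of the same firm. It assumes the toy data from #2 (variables company, year, sales); the names rk, nfirms, large2010 and large are made up, and the top-half cutoff is only an example.

        ```stata
        * rank on 2010 sales only, keeping all years in memory
        egen rk = rank(sales) if year == 2010, field
        egen nfirms = count(sales) if year == 2010
        * 1 = top half of 2010 sales, 0 = otherwise, missing outside 2010
        gen byte large2010 = rk <= nfirms/2 if year == 2010 & !missing(sales)
        * spread each firm's 2010 value to all of its observations
        bysort company: egen byte large = max(large2010)
        tabulate company large
        ```

        On the toy data this should mark Cool and Good as large (1) and Marvelous and Enigmatic as small (0) in every year. A firm with no 2010 observation would end up with a missing value for large.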
