  • Classification of enterprises into size classes

    Dear experts,

    I have a panel data set with 77 variables and about 57,000 observations for the years 2014 - 2018. Therefore, I use dummy variables for the independent variable firm size (small medium large). I want to use this to measure the impact on the effective tax rate (ETR).

    I am not sure whether I have defined the size classification in a logically correct way. Could you please check it?

    The classification should correspond to the following:

    (1) Small corporations are those that do not exceed at least two of the following three characteristics:

    1. Balance sheet total of 6,000,000 euros.
    2. Turnover of 12,000,000 euros.
    3. 50 employees.

    (2) Medium-sized corporations are those which exceed at least two of the three characteristics referred to in subsection (1) and do not exceed at least two of the following three characteristics:

    1. Balance sheet total of 20,000,000 euros.
    2. Turnover of 40,000,000 euros.
    3. 250 employees.

    (3) Large corporations are those which exceed at least two of the three characteristics referred to in subsection (2).
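
    One way to express the three rules above is to count, for each firm, how many thresholds it does not exceed, instead of enumerating pairs of variables. The following is only a sketch under assumptions: the four firms are invented toy data, and n_small_ok, n_medium_ok, complete and size_alt are hypothetical helper names that do not appear in the original code. On real data the clear/input part would be skipped.

```stata
* Toy data: four hypothetical firms, not taken from the data set in this thread.
clear
input double(total_assets turnover employees)
 5000000  10000000   40
 7000000  50000000   40
10000000   5000000  300
30000000  50000000  300
end

* Number of thresholds each firm does NOT exceed, for rule (1) and rule (2).
generate byte n_small_ok  = (total_assets <= 6000000)  + (turnover <= 12000000) + (employees <= 50)
generate byte n_medium_ok = (total_assets <= 20000000) + (turnover <= 40000000) + (employees <= 250)

* Stata treats missing as larger than any number, so require complete inputs
* (assumption: firms with a missing input should stay unclassified).
generate byte complete = !missing(total_assets, turnover, employees)

generate str12 size_alt = ""
* Small: at least two of the small thresholds are not exceeded.
replace size_alt = "small firms"  if complete & n_small_ok >= 2
* Medium: at least two small thresholds exceeded, at least two medium ones not.
replace size_alt = "medium firms" if complete & n_small_ok < 2 & n_medium_ok >= 2
* Large: at least two of the medium thresholds are exceeded.
replace size_alt = "large firms"  if complete & n_medium_ok < 2

list, noobs
```

    With complete inputs, every firm falls into exactly one class under this counting approach, because the medium thresholds are higher than the small ones. The second and third toy firms end up as medium even though no single pair of their variables lies between the two sets of thresholds.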

    This is how I wrote the code in Stata:

    Code:
    capture drop size
    
    generate size   = "small firms"  if ((turnover <= 12000000) & (total_assets <= 6000000)) | ((turnover <= 12000000) & (employees <= 50)) | ((total_assets <= 6000000) & (employees <= 50))
    replace  size   = "medium firms"  if ((turnover > 12000000) & (turnover <= 12000000) & (total_assets > 6000000) & (total_assets <= 20000000) )| ((turnover > 12000000) & (employees > 50) & (turnover <= 40000000) & (employees <=250)) | ((total_assets > 6000000) & (total_assets <= 20000000) & (employees > 50) & (employees <= 250 ))
    replace  size   = "large firms"  if ((turnover > 40000000) & (total_assets > 20000000)) | ((turnover > 40000000) & (employees > 250))|((total_assets > 20000000) & (employees > 250))
    
    by size, sort: summarize ETR
    
    encode size, generate (size_new)
    
    label variable size_new  "firm size"
    
    tab size_new
    
    numlabel, add 
    
    tab size_new, gen(firm_size)
    
    describe firm_size*
    
    tab size_new firm_size1
    tab size_new firm_size2
    tab size_new firm_size3
    
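    * encode assigns codes alphabetically (large, medium, small), so firm_size1-firm_size3 follow that order
    rename (firm_size1 firm_size2 firm_size3) (large medium small)
    
    describe large medium small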
    
    tabstat ETR, statistics (count mean median sd max min range) by(size_new)
    If I run the classification as above, 8,945 observations cannot be assigned. Is that because none of the criteria apply to these companies?

    Code:
    -> firm_size = 
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             ETR |      8,945    27.15413    15.14579   4.25e-06   99.85857
    Many thanks.

  • #2
    Hi Can,

    I am not sure I follow this line:
    replace size = "medium firms" if ((turnover > 12000000) & (turnover <= 12000000) & (total_assets > 6000000) & (total_assets <= 20000000) )
    Turnover cannot be both > 12000000 and <= 12000000, so this condition can never be true and no firm will be classified as medium-sized through it (unless I am mistaken)?

    Best,
    Rhys



    • #3
      Hi Rhys, yes, you are absolutely right. The second turnover should have been 40000000. I have fixed it now, but there are still 6800 observations that cannot be allocated. What could be the reason for that?
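
      One possible reason, shown here as a sketch on invented data rather than a diagnosis of the actual data set: the pairwise conditions require the same two variables to lie above the small thresholds and below the medium thresholds, while the rules allow different variables to satisfy each part. The two firms below are hypothetical and chosen to fall into that gap. The code repeats the logic from post #1 with the second turnover bound set to 40000000, and is meant to be run from a do-file because it uses /// line continuation.

```stata
* Toy data: two hypothetical firms, not taken from the data set in this thread.
clear
input double(total_assets turnover employees)
 7000000  50000000   40
10000000   5000000  300
end

* Pairwise logic from post #1, with the corrected upper turnover bound.
generate size = "small firms" if (turnover <= 12000000 & total_assets <= 6000000) | ///
    (turnover <= 12000000 & employees <= 50) | (total_assets <= 6000000 & employees <= 50)
replace size = "medium firms" if ///
    (turnover > 12000000 & turnover <= 40000000 & total_assets > 6000000 & total_assets <= 20000000) | ///
    (turnover > 12000000 & turnover <= 40000000 & employees > 50 & employees <= 250) | ///
    (total_assets > 6000000 & total_assets <= 20000000 & employees > 50 & employees <= 250)
replace size = "large firms" if (turnover > 40000000 & total_assets > 20000000) | ///
    (turnover > 40000000 & employees > 250) | (total_assets > 20000000 & employees > 250)

* Neither firm matches any branch, so size stays empty for both.
list, noobs
count if size == ""
```

      Under the stated rules both firms would be medium-sized: each exceeds two of the three small thresholds and stays at or below two of the three medium thresholds, but not with the same pair of variables.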



      • #4
        Hi Can,

        I am afraid it is difficult for me to see without taking a look at the data. I can't spot any other obvious typos in the code.

        I suggest you browse firms which are unclassified, look at their observation values and see how they ought to be defined. This should then help you refine your code.

        Best,
        Rhys
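
        The inspection suggested above could look roughly like the following. This is a sketch under assumptions: the three rows are invented toy data standing in for the real panel, and size is filled in by hand to mimic what the pairwise code from post #1 would produce. On the real data set only the count and list commands would be needed.

```stata
* Toy data standing in for the real panel (hypothetical values).
* Row 3 has missing total_assets; the pairwise code would label it "large firms"
* because missing compares as larger than 20000000.
clear
input double(total_assets turnover employees) str12 size
 5000000 10000000  40 "small firms"
 7000000 50000000  40 ""
       .  5000000 300 "large firms"
end

* How many observations are unclassified?
count if size == ""

* Look at their values to see which rule they should have matched.
list total_assets turnover employees if size == "", noobs

* Observations that were classified although an input is missing.
count if size != "" & missing(total_assets, turnover, employees)
```

        Checking the second count is worthwhile because Stata treats a missing numeric value as larger than any number, so a comparison such as total_assets > 20000000 is true when total_assets is missing.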
