  • "entropyetc" to calculate HHI Index

    Dear all,

    I am writing my master's thesis on the effect of market competition on the risk-taking behavior of banks. My main explanatory variable will be a measure of market concentration, either the HHI or the Lerner index.

    I am working with the Call Report data for US banks, a panel data set covering all banks in the US from 2000 to 2020, and I have generated the HHI using assets with the following code:

    entropyetc BankID [w=Assets], by(qdate) gen(4=HHI_Assets)

    qdate is a quarterly time variable indicating the year and the respective quarter.

    After looking at similar literature on HHI values for deposits and assets, the calculated HHI values seem too low to be correct. For example, here is the histogram of HHI for the 588,888 observations I have in the dataset:
    HHI_Assets.gph

    Here is also a little part of the data:
    [Attached image: Data.png]

    I don't see what mistake I've made in the command above. I would appreciate any input on this matter.

    Thank you,
    Viktor


  • #2
    entropyetc is from SSC (FAQ Advice #12).

    Your sample data are given as screenshots and are incomplete.

    To check your understanding, create a toy example for which you can independently calculate what the HHI should be.

    Some people calculate this measure on a scale going up to 10000, but my command doesn't do that.



    • #3
      Hi Nick,

      What do you mean by creating a toy example for calculating the HHI? I know a separate approach would be to take each bank's assets in a quarter, divide by the total assets of the sample in that quarter to get the "market share", square that number, and sum the squares.

      Furthermore, even if your command doesn't scale up to 10000, would you still say that it is an accurate representation of the index?

      Thank you for your fast response,
      Viktor



      • #4
        A toy example is any simple example you use or find. For example, one of the examples in the help is
        Code:
          . webuse grunfeld, clear
          . entropyetc company [w=invest], list by(year)
        So you can check the results using your own code or anybody else's.
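
        As a minimal sketch of such a check (this is not part of entropyetc, and the variable names total_invest, share and HHI_manual are made up for illustration), you could compute the sum of squared shares by hand on the same dataset and compare it with the command's output for each year:

        ```stata
        webuse grunfeld, clear
        * total investment across companies within each year
        bysort year: egen double total_invest = total(invest)
        * each company's share of that year's total
        gen double share = invest / total_invest
        * HHI on the 0-1 scale: sum of squared shares within each year
        bysort year: egen double HHI_manual = total(share^2)
        * one row per year
        tabdisp year, cell(HHI_manual)
        ```

        With 10 companies per year, the hand-computed values cannot fall below 1/10, which is a quick sanity check in itself.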

        Your second paragraph is puzzling. My point is that I calculate HHI on a scale that goes up to 1 using proportions, but some people work with data scaled to percents, so their upper limit is 10000.



        • #5
          My apologies for the second paragraph; I should have been clearer. The question was about how to interpret the numbers, to help my understanding of the command, as I have to explain my approach to calculating the HHI in the methodology section.

          I do have a follow-up question on interpreting the calculated HHI values, for which I've used bank assets. I have a very large dataset that includes pretty much all institutions, from very large to rather small banks. The calculated HHI values range from around 0.02 to 0.06, and since the scale goes up to 1, that still seems rather low to me: I would imagine all of these values fall under the category of "low concentration", but I am probably interpreting this wrongly.



          • #6
            That looks like the same question to me. The results you get seem "rather low", so for your own peace of mind you should be doing some quantitative checks, such as writing your own code, if you distrust mine. What is the largest proportion of assets held by any single bank at any date? Suppose that individual big banks each hold on the order of 10% = 0.1 of the total. That is a contribution of 0.1^2 = 0.01 to HHI. A bank with, say, 1% = 0.01 will contribute 0.0001, and so on.
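
            To make that arithmetic concrete, here is a purely hypothetical market (the counts 5 and 50 are invented, not taken from the Call Report data): 5 banks with 10% each and 50 banks with 1% each. The shares sum to 1 and the HHI is 5 × 0.01 + 50 × 0.0001 = 0.055, which is the same order of magnitude as the values you report.

            ```stata
            clear
            set obs 55
            * hypothetical shares: 5 big banks at 10%, 50 small banks at 1%
            gen double share = cond(_n <= 5, 0.10, 0.01)
            * HHI on the 0-1 scale is the sum of squared shares
            egen double HHI_toy = total(share^2)
            display HHI_toy[1]
            ```

            So even a market with several banks at 10% each can easily produce an HHI well below 0.1.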



            • #7
              Originally posted by Viktor Christian
              I do have a follow-up question on interpreting the calculated HHI values, for which I've used bank assets. I have a very large dataset that includes pretty much all institutions, from very large to rather small banks. The calculated HHI values range from around 0.02 to 0.06, and since the scale goes up to 1, that still seems rather low to me: I would imagine all of these values fall under the category of "low concentration", but I am probably interpreting this wrongly.
              The index is sometimes defined as ranging from 0(ish) to 10,000 rather than 0(ish) to 1. If that is what you expect, just multiply your variable by 10,000. Neither is "correct" or "incorrect": as long as the computation corresponds with your definition, you are doing it correctly, and if they don't correspond, you are doing it incorrectly. The multiplication by 10,000 has no other purpose than to produce numbers that are easier for humans to work with. In the same way, we could say that the distance between Konstanz and Durham is 108600000 cm or that it is 1086 km. Both statements are correct, but km is in this case the unit that is easier to work with. So if you, and especially your intended audience, are used to working on the 10,000 scale, by all means multiply your variable by 10,000.

              As an aside, I am horrible at counting 0s, so instead of coding replace HHI_Assets = HHI_Assets * 10000 I would normally use replace HHI_Assets = HHI_Assets * 1E4.
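
              A small sketch of that rescaling, assuming you would rather keep the original 0-1 variable and add a rescaled copy next to it (the name HHI_Assets_1e4 and the two example values are made up for illustration):

              ```stata
              clear
              set obs 2
              * two illustrative values on the 0-1 scale
              gen double HHI_Assets = cond(_n == 1, 0.02, 0.06)
              * add a copy on the 0-10,000 scale instead of overwriting
              gen double HHI_Assets_1e4 = HHI_Assets * 1E4
              list
              ```

              Keeping both versions makes it easy to report whichever scale your audience expects.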
              Last edited by Maarten Buis; 13 May 2024, 06:07.
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------
