
  • #31
    Dear Nick,

    I meant that I was close to finding the HHI for my class variable. I did it with the tag() function. But tag() does not report how many times a distinct value is repeated; it only reports whether or not a value already appears in the group before the focal observation, which made my HHI results wrong. I thought that the logic of this approach was close to Daniel's, which is why I found your warning ("If all categories are represented in all groups, then that would work") so relevant to my approach.

    What tag() gives:
    Firm    year  class  tag
    Firm A  2000  105    1
    Firm A  2000  107    1
    Firm A  2000  107    0
    Firm A  2000  107    0

    What I need to find the HHI:
    Firm    year  class  Any function
    Firm A  2000  105    1
    Firm A  2000  107    3
    Firm A  2000  107    0
    Firm A  2000  107    0

    My question is: is there any function in Stata that counts how many times a distinct value is repeated in the group?

    Best regards, Farid



    • #32
      Code:
      help tabulate
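      If a variable holding the count is wanted rather than a table, one option is _N under bysort. This is only a minimal sketch, assuming the variables are named firm, year and class as in #31; the names n_class, n_group, first and HHI are made up for illustration.

      ```stata
      clear
      input str6 firm int year int class
      "Firm A" 2000 105
      "Firm A" 2000 107
      "Firm A" 2000 107
      "Firm A" 2000 107
      end

      * number of observations sharing each distinct class within firm-year
      bysort firm year class: gen n_class = _N

      * size of each firm-year group, to turn counts into shares
      bysort firm year: gen n_group = _N

      * flag one observation per distinct class so each share is squared once
      egen first = tag(firm year class)

      * sum of squared shares over distinct classes within firm-year
      egen HHI = total(first * (n_class / n_group)^2), by(firm year)

      list, sepby(firm year)
      ```

      For these four rows the shares are 1/4 and 3/4, so the sum of squares is 0.0625 + 0.5625 = 0.625.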



      • #33
        Dear all,

        I am using the HHI as a measure of the degree of specialization of individual employees.
        For each individual, I observe the number of years he or she previously worked in each industry category, based on 2-3 employments.

        Based on your earlier posts, I understand that the Herfindahl score should be calculated by summing the squared shares of all experiences over the time period for each individual. However, I am stuck on the execution.
        person  industry 1  tenure 1  industry 2  tenure 2  industry 3  tenure 3
        1       8881        3         8881        6
        2       7123        2         8881        5         9911        3
        3       6982        7         7493        6         3352        2

        Your help is highly appreciated!




        • #34
          Dear Nick
          I am seeking assistance similar to Mohina's, with an HHI based on total income and total assets per credit union in each country. I would like to compute the HHI yearly for each credit union per island country, and also the concentration ratio per island for each credit union for each year. I am showing the sample for St Vincent only below; the entire file for all seven islands is attached.
          Island ID CU YEAR TOTINC TOTAST
          St Vincent 1 GECCU 2009 10,354,327 142,998,310
          St Vincent 1 GECCU 2010 10,398,687 149,047,458
          St Vincent 1 GECCU 2011 12,138,542 152,551,061
          St Vincent 1 GECCU 2012 11,270,081 159,944,699
          St Vincent 1 GECCU 2013 11,174,077 166,839,901
          St Vincent 1 GECCU 2014 11,797,644 179,320,697
          St Vincent 1 GECCU 2015 12,428,461 190,855,037
          St Vincent 1 GECCU 2016 13,578,058 210,379,156
          St Vincent 1 GECCU 2017 15,236,572 231,914,947
          St Vincent 1 GECCU 2018 16,442,983 258,826,523
          St Vincent 2 Kccu 2009 4,515,787 55,226,160
          St Vincent 2 Kccu 2010 4,709,019 56,533,034
          St Vincent 2 Kccu 2011 5,081,558 60,829,313
          St Vincent 2 Kccu 2012 5,342,034 65,331,538
          St Vincent 2 Kccu 2013 5,323,653 69,052,753
          St Vincent 2 Kccu 2014 5,763,040 78,659,710
          St Vincent 2 Kccu 2015 6,412,253 84,194,080
          St Vincent 2 Kccu 2016 6,606,917 91,959,291
          St Vincent 2 Kccu 2017 7,013,308 99,081,220
          St Vincent 2 Kccu 2018 7,391,305 103,275,280




          • #35
            #34 Good that you have found a relevant thread, but spreadsheet attachments are a no-no here (FAQ Advice #12). Code for your problem follows from other answers in this thread. That was the implicit answer to #33 too.
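            For the layout in #33, a rough sketch of how the earlier answers might carry over, assuming the variables are named industry1, tenure1, and so on, with missing values where a person has only two employments (these names, and spell, p and HHI, are assumptions for illustration):

            ```stata
            clear
            input byte person int industry1 byte tenure1 int industry2 byte tenure2 int industry3 byte tenure3
            1 8881 3 8881 6    . .
            2 7123 2 8881 5 9911 3
            3 6982 7 7493 6 3352 2
            end

            * one row per person and employment spell
            reshape long industry tenure, i(person) j(spell)
            drop if missing(industry)

            * add up years per person and industry (person 1 has two spells in 8881)
            collapse (sum) tenure, by(person industry)

            * shares of total tenure within person, then the sum of squared shares
            egen p = pc(tenure), by(person) prop
            egen HHI = total(p^2), by(person)

            list, sepby(person)
            ```

            With these numbers person 1 is fully specialized (HHI of 1), while person 2 has shares 0.2, 0.5 and 0.3, giving 0.04 + 0.25 + 0.09 = 0.38.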



            • #36
              YEAR CUID COID hhi_TOTAST hhi_TOTINC
              2009 1 1 1 1
              2010 1 1 1 1
              2011 1 1 1 1
              2012 1 1 1 1
              2013 1 1 1 1
              2014 1 1 1 1
              2015 1 1 1 1
              2016 1 1 1 1
              2017 1 1 1 1
              2018 1 1 1 1
              2009 2 1 1 1
              2010 2 1 1 1
              2011 2 1 1 1
              2012 2 1 1 1
              2013 2 1 1 1
              2014 2 1 1 1
              2015 2 1 1 1
              2016 2 1 1 1
              2017 2 1 1 1
              2018 2 1 1 1
              hhi TOTAST TOTINC, by(YEAR CUID COID) outfile replace

              The above command is what I used, but I got all ones. What am I doing wrong?




              • #37
                carlton durrant You're struggling here but none of this is rocket surgery or brain science.

                The key advice here is simple. Please do read https://www.statalist.org/forums/help#stata The idea is just: Show us a data example we can use (easily), or else we're entitled to shrug our shoulders and get back to the day job.

                Going against my personal rule I tried to look at your spreadsheet but my copy of Excel refuses to read it. That is the sort of experience that puts many people here off trying to look at someone else's spreadsheet files.

                Your #36 just shows results but the problem there in using hhi (from SSC) was already explained in #13 of this thread. hhi will necessarily return 1 for single observations as a single value is 100% of its own total and a proportion of 1, squared, is nothing but 1 again.
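                To make that concrete, here is a small sketch using only the two 2009 income values from #34. The variable names p_single, hhi_single, p_year and hhi_year are invented for illustration, and the grouping by year and cu stands in for the by(YEAR CUID COID) call in #36.

                ```stata
                clear
                input int year str5 cu long totinc
                2009 "GECCU" 10354327
                2009 "Kccu"   4515787
                end

                * one observation per year-cu group: each share is 1, so the result is 1
                egen p_single = pc(totinc), by(year cu) prop
                egen hhi_single = total(p_single^2), by(year cu)

                * grouping by year alone gives shares across credit unions
                egen p_year = pc(totinc), by(year) prop
                egen hhi_year = total(p_year^2), by(year)

                list, noobs
                ```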

                I don't know much about hhi, which I didn't write. I do know more about entropyetc (SSC), which I did write. But at most they are convenience commands for a simple calculation, given that this measure and most like it are defined by one line of algebra.

                I took your earlier data display -- which looks to me like copy-and-paste from Excel rather than a Stata listing -- and with some editing turned it into a listing of the kind we ask for here. The calculation is then divisible into (1) calculate the proportions you want (2) square them and add up the squares. That's all it is.

                If your variable names are different, then your code needs to be different accordingly.



                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input str10 island str5 cu int year long(totinc totast)
                "St Vincent" "GECCU" 2009 10354327 142998310
                "St Vincent" "GECCU" 2010 10398687 149047458
                "St Vincent" "GECCU" 2011 12138542 152551061
                "St Vincent" "GECCU" 2012 11270081 159944699
                "St Vincent" "GECCU" 2013 11174077 166839901
                "St Vincent" "GECCU" 2014 11797644 179320697
                "St Vincent" "GECCU" 2015 12428461 190855037
                "St Vincent" "GECCU" 2016 13578058 210379156
                "St Vincent" "GECCU" 2017 15236572 231914947
                "St Vincent" "GECCU" 2018 16442983 258826523
                "St Vincent" "Kccu"  2009  4515787  55226160
                "St Vincent" "Kccu"  2010  4709019  56533034
                "St Vincent" "Kccu"  2011  5081558  60829313
                "St Vincent" "Kccu"  2012  5342034  65331538
                "St Vincent" "Kccu"  2013  5323653  69052753
                "St Vincent" "Kccu"  2014  5763040  78659710
                "St Vincent" "Kccu"  2015  6412253  84194080
                "St Vincent" "Kccu"  2016  6606917  91959291
                "St Vincent" "Kccu"  2017  7013308  99081220
                "St Vincent" "Kccu"  2018  7391305 103275280
                end
                
                egen p = pc(totinc), by(island year) prop
                egen HHI = total(p^2), by(island year)
                
                tabdisp year island, c(HHI) format(%4.3f)
                
                ----------------------
                          |   Island  
                     YEAR | St Vincent
                ----------+-----------
                     2009 |      0.577
                     2010 |      0.571
                     2011 |      0.584
                     2012 |      0.564
                     2013 |      0.563
                     2014 |      0.559
                     2015 |      0.551
                     2016 |      0.560
                     2017 |      0.568
                     2018 |      0.572
                ----------------------
                Last edited by Nick Cox; 10 Jun 2020, 06:19.



                • #38
                  Thanks, Nick. I tried the command for all seven islands and it worked well. The only hurdle was converting the string data for income and assets to long format.

                  . egen p = pc(totast), by(island year) prop

                  . egen HHI = total(p^2), by(island year)

                  . tabdisp year island, c(HHI) format(%4.3f)

                  ------------------------------------------------------------------------------------------
                            |                                    ISLAND
                       YEAR | Antigua  Dominica  Grenada  Montserratt  St Kitts Nevis  St Vincent  St. Lucia
                  ----------+-------------------------------------------------------------------------------
                       2009 |   0.466     0.516    0.309        1.000           0.415       0.389      0.225
                       2010 |   0.462     0.551    0.313        1.000           0.415       0.387      0.221
                       2011 |   0.471     0.557    0.314        1.000           0.424       0.380      0.215
                       2012 |   0.481     0.567    0.316        1.000           0.426       0.376      0.210
                       2013 |   0.493     0.568    0.314        1.000           0.429       0.371      0.209
                       2014 |   0.495     0.556    0.314        1.000           0.425       0.364      0.207
                       2015 |   0.489     0.556    0.296        1.000           0.422       0.362      0.201
                       2016 |   0.500     0.552    0.295        1.000           0.415       0.358      0.192
                       2017 |   0.508     0.556    0.295        1.000           0.411       0.365      0.186
                       2018 |   0.527     0.540    0.297        1.000           0.399       0.366      0.181
                  ------------------------------------------------------------------------------------------



                  • #39
                    Good, and thanks for the report.

                    Had you shown a Stata example of the kind requested we would have certainly explained about a need to destring.

                    For the record, in Stata long is a variable or storage type, not a (display) format. What terms are in use elsewhere is a different and small question.
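                    For anyone hitting the same hurdle, a minimal sketch of that conversion, assuming the two variables were read in as strings with comma separators as displayed in #34:

                    ```stata
                    clear
                    input str12 totinc str12 totast
                    "10,354,327" "142,998,310"
                    "4,515,787"  "55,226,160"
                    end

                    * strip the commas and convert both variables to numeric in place
                    destring totinc totast, replace ignore(",")

                    * check the resulting storage types
                    describe totinc totast
                    ```

                    destring picks a numeric storage type that holds the values, so no separate step is needed to choose one.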



                    • #40
                      Thank you all for the information in this valuable thread
                      Last edited by Huthayfa Nabeel; 12 Jul 2020, 07:15.



                      • #41
                        Dear Nick,

                        I have the same problem as #15. I followed the answers in this thread and in other threads. Unfortunately, the problem has not been solved yet.

                        I have panel data (n=1260 , t=9). I've tried to run
                        Code:
                        entropyetc ta , by( year country_code)
                        The variable country_code categorizes ta into 26 categories. Stata output still gives me
                        HTML Code:
                        too many values
                        r(134);
                        Any suggested solution will be highly appreciated.



                        • #42
                          Hi all,

                          This is my first post, and I am working on my master's thesis. To analyse my data, I would like to calculate the Herfindahl-Hirschman Index for the following scenario: I would like to know the market concentration of certain healthcare companies based on their number of clients. An example of the data is shown below:
                          Firm_name                 Region     Numberofclients
                          Aafje Burghsluissingel    Rotterdam  58
                          Aafje De Nieuwe Plantage  Rotterdam  75
                          't Verlaet                Zeeland    10
                          't Vonder                 Drenthe    29

                          Currently I use this code, but the only result I get is the same number (1) for all the rows, and (0) if data are missing.

                          Code:
                          ssc install hhi
                          hhi Numberofclients, by(Region Firm_name)

                          I hope someone can help me to find the right command so I will have the market concentration of every firm per region, based on the number of clients

                          Best regards,

                          Sanne Jansen




                          • #43
                            The data appear to be pairs of Region Firm_name so the sum of squared probabilities for each combination is identically 1 for non-missing values.

                            You can look at concentration by region or by firm, but not both.



                            • #44
                              Thank you, Nick, this explains a lot. But what would you recommend I do to calculate the market concentration of the firms per region, based on the number of clients? Should I make a new Stata file per region, or are there other (smarter and faster) ways?



                              • #45
                                I didn't write hhi but my understanding is that your choices are limited to

                                Code:
                                hhi number, by(region)
                                
                                hhi number, by(firm_name)
                                and I don't see why a different dataset is thought to be needed.
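                                As a sketch of the same calculation without hhi, following the egen approach in #37 and assuming the variable names and rows shown in #42 (share and HHI are names made up here), everything can stay in one dataset:

                                ```stata
                                clear
                                input str30 Firm_name str10 Region int Numberofclients
                                "Aafje Burghsluissingel"   "Rotterdam" 58
                                "Aafje De Nieuwe Plantage" "Rotterdam" 75
                                "'t Verlaet"               "Zeeland"   10
                                "'t Vonder"                "Drenthe"   29
                                end

                                * each firm's share of clients within its region
                                egen share = pc(Numberofclients), by(Region) prop

                                * sum of squared shares within region
                                egen HHI = total(share^2), by(Region)

                                tabdisp Region, c(HHI) format(%4.3f)
                                ```

                                In this small extract Zeeland and Drenthe each have a single firm, so their value is necessarily 1; Rotterdam has two firms and comes out near 0.508.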
