Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Herfindahl Index over rolling window

    Dear STATA Users,

    I am working on a big firm level data. The firm (ID) year (Year) panel is unbalanced. I have information on firm sales by years and regions. That is, firms sell consignments to different regions over years. Each consignment has a unique batch code.

    I need to create, for each firm-year, Herfindahl Index (regional sale concentration) based on the number of consignments sold across different regions during the last three years.

    I have never worked with data where variables are constructed dynamically. Hence, the question. Any guidance will be much appreciated.

    Thank you,
    Mg

    A small data excerpt:
    ID Year Batch Code Region
    1 1981 100001 101
    1 1981 100002 101
    1 1981 100003 102
    1 1981 100004 102
    1 1981 100005 102
    1 1981 100006 103
    1 1981 100007 103
    1 1981 100008 103
    1 1981 100009 104
    1 1981 100010 107
    1 1981 100011 110
    1 1981 100012 111
    1 1982 100013 115
    1 1982 100014 116
    1 1982 100015 101
    1 1982 100016 101
    1 1982 100017 102
    1 1982 100018 102
    1 1982 100019 102
    1 1982 100020 103
    1 1982 100021 103
    1 1982 100022 103
    1 1982 100023 104
    1 1982 100024 107
    1 1982 100025 110
    1 1982 100026 111
    1 1982 100027 115
    1 1982 100028 116
    1 1983 100029 116
    1 1983 100030 116
    1 1983 100031 116
    1 1983 100032 111
    1 1983 100033 111
    1 1983 100034 111
    1 1983 100035 101
    1 1983 100036 101
    1 1984 100037 101
    1 1984 100038 101
    1 1984 100039 102
    1 1984 100040 102
    1 1984 100041 102
    1 1984 100042 103
    1 1984 100043 103
    1 1984 100044 103
    1 1984 100045 104
    1 1984 100046 107
    1 1984 100047 110
    1 1984 100048 111
    1 1984 100049 115
    1 1984 100050 116
    1 1984 100051 101
    1 1984 100052 101
    1 1984 100053 116
    1 1984 100054 116
    1 1984 100055 116
    1 1984 100056 111
    1 1984 100057 111
    1 1984 100058 111
    1 1984 100059 101
    1 1984 100060 101
    1 1984 100061 104
    1 1984 100062 107
    1 1984 100063 110
    1 1984 100064 111
    1 1984 100065 115
    1 1984 100066 116
    1 1985 100067 101
    1 1985 100068 101
    1 1985 100069 116
    1 1985 100070 116
    1 1985 100071 116
    1 1985 100072 111
    1 1985 100073 102
    1 1985 100074 102
    1 1985 100075 102
    1 1985 100076 103
    1 1985 100077 103
    1 1985 100078 103
    1 1985 100079 104
    1 1985 100080 107
    1 1985 100081 110
    1 1985 100082 111
    1 1985 100083 115
    1 1985 100084 116
    1 1985 100085 116
    1 1985 100086 116
    1 1985 100087 116
    1 1985 100088 111
    1 1985 100089 111
    1 1985 100090 111
    1 1985 100091 101
    1 1985 100092 101
    1 1985 100093 101
    1 1985 100094 101
    1 1985 100095 102
    2 1981 100096 111
    2 1981 100097 101
    2 1981 100098 101
    2 1981 100099 104
    2 1981 100100 107
    2 1981 100101 110
    2 1981 100102 111
    2 1981 100103 115
    2 1981 100104 116
    2 1981 100105 101
    2 1981 100106 101
    2 1981 100107 116
    2 1982 100108 116
    2 1982 100109 116
    2 1982 100110 111
    2 1982 100111 102
    2 1982 100112 102
    2 1982 100113 102
    2 1982 100114 103
    2 1982 100115 103
    2 1982 100116 103
    2 1982 100117 104
    2 1982 100118 107
    2 1982 100119 110
    2 1982 100120 111
    2 1982 100121 115
    2 1982 100122 116
    2 1982 100123 116
    2 1983 100124 116
    2 1983 100125 116
    2 1983 100126 111
    2 1983 100127 111
    2 1983 100128 111
    2 1983 100129 101
    2 1983 100130 101
    2 1983 100131 101
    2 1984 100132 111
    2 1984 100133 101
    2 1984 100134 101
    2 1984 100135 104
    2 1984 100136 107
    2 1984 100137 110
    2 1984 100138 111
    2 1984 100139 115
    2 1984 100140 116
    2 1984 100141 101
    2 1984 100142 101
    2 1984 100143 116
    2 1984 100144 116
    2 1984 100145 116
    2 1984 100146 111
    2 1984 100147 102
    2 1984 100148 102
    2 1984 100149 102
    2 1984 100150 103
    2 1984 100151 103
    2 1984 100152 103
    2 1984 100153 104
    2 1984 100154 107
    2 1984 100155 110
    2 1984 100156 111
    2 1984 100157 115
    2 1984 100158 116
    2 1984 100159 116
    2 1984 100160 116
    2 1984 100161 116
    2 1985 100162 111
    2 1985 100163 111
    2 1985 100164 111
    2 1985 100165 101
    2 1985 100166 101
    2 1985 100167 101
    2 1985 100168 111
    2 1985 100169 101
    2 1985 100170 101
    2 1985 100171 104
    2 1985 100172 107
    2 1985 100173 110
    2 1985 100174 111
    2 1985 100175 115
    2 1985 100176 116
    2 1985 100177 101
    2 1985 100178 101
    2 1985 100179 116
    2 1985 100180 116
    2 1985 100181 116
    2 1985 100182 111
    2 1985 100183 102
    2 1985 100184 102
    2 1985 100185 102
    2 1985 100186 103
    2 1985 100187 103
    2 1985 100188 103
    2 1985 100189 104
    2 1985 100190 107
    3 1985 100191 110
    3 1985 100192 111
    3 1985 100193 115
    3 1985 100194 116
    3 1985 100195 116
    3 1985 100196 116
    3 1985 100197 116
    3 1985 100198 111
    3 1985 100199 111
    3 1985 100200 111
    3 1985 100201 101
    3 1985 100202 101
    3 1985 100203 101
    3 1985 100204 111
    3 1985 100205 101
    3 1985 100206 101
    3 1985 100207 104
    3 1985 100208 107
    3 1985 100209 110
    3 1985 100210 111
    3 1985 100211 115
    3 1985 100212 116
    3 1985 100213 101
    3 1985 100214 101
    3 1985 100215 116
    3 1985 100216 116
    3 1985 100217 116
    3 1985 100218 111
    3 1985 100219 102
    3 1985 100220 102
    3 1985 100221 102
    3 1985 100222 103
    3 1985 100223 103
    3 1985 100224 103
    3 1985 100225 104
    3 1985 100226 107
    3 1985 100227 110
    3 1985 100228 111
    3 1985 100229 115
    3 1985 100230 116
    3 1985 100231 116
    3 1985 100232 116
    3 1985 100233 116
    3 1985 100234 111
    3 1985 100235 111
    3 1985 100236 111
    3 1985 100237 101
    3 1985 100238 101
    3 1985 100239 101
    3 1985 100240 111
    3 1985 100241 101
    3 1985 100242 101
    3 1985 100243 104
    3 1985 100244 107
    3 1985 100245 110
    3 1985 100246 111
    3 1985 100247 115
    3 1985 100248 116
    3 1985 100249 101
    3 1985 100250 111
    3 1985 100251 115
    3 1985 100252 116
    3 1985 100253 101
    3 1985 100254 101
    3 1985 100255 116
    3 1985 100256 116
    3 1985 100257 116
    3 1985 100258 111
    3 1985 100259 102
    3 1985 100260 102
    3 1985 100261 102
    3 1985 100262 103
    3 1985 100263 103
    3 1985 100264 103
    3 1985 100265 104
    3 1985 100266 107
    3 1985 100267 110
    3 1985 100268 111
    3 1985 100269 115
    3 1985 100270 116
    3 1985 100271 116
    3 1985 100272 116
    3 1985 100273 116
    3 1985 100274 111
    3 1985 100275 111
    3 1985 100276 111
    3 1985 100277 116
    3 1985 100278 115


  • #2
    Please study the FAQ including #12 and #18 carefully.

    (Batch Code isn't a legal variable name: please do use dataex as requested.)

    The Herfindahl index has many other names, including names for people who discovered or used it long before Herfindahl. I am taking it to be the sum of squared proportions.

    Here I would use entropyetc (SSC) to calculate the measure you want, based solely on number of consignments, if the problem didn't involve a moving window.

    I can still use that to check a one-off program written for rangerun (SSC). I don't show the checks I made, but I am happy.

    Code:
    clear
    
    input ID        Year    Batch_Code      Region
    
    <your data>
    
    
    capture program drop myherf
     
     program myherf
     local nobs = _N
     bysort Region: gen result = (_N/`nobs')^2 if _n == 1  
     su result, meanonly
     replace result = r(sum)
     end
    
    rangerun myherf, int(Year 0 0) by(ID) use(Region)
     
    tabdisp ID Year, c(result)
    
    ------------------------------------------------------------
              |                       Year                      
           ID |     1981      1982      1983      1984      1985
    ----------+-------------------------------------------------
            1 | .1805556  .1328125    .34375  .1333333  .1652794
            2 | .1944444  .1640625    .34375  .1444445  .1533888
            3 |                                          .161157
    ------------------------------------------------------------
    
    drop result
    
    rangerun myherf, int(Year -2 0) by(ID) use(Region)
     
    tabdisp ID Year, c(result)
    
    ------------------------------------------------------------
              |                       Year                      
           ID |     1981      1982      1983      1984      1985
    ----------+-------------------------------------------------
            1 | .1805556  .1428571  .1342593  .1358025  .1583872
            2 | .1944444   .130102  .1512346   .148834  .1521497
            3 |                                          .161157
    ------------------------------------------------------------
    The first table of results is for each ID and Year; the second table is for three year windows ending in the year specified. rangerun by default uses all the data available. (You didn't spell out very precisely how you wanted your windows defined, or indeed whether you wanted something different from that shown here, such as a three-year moving average.)

    That said, missing values would need more care.
    Last edited by Nick Cox; 03 Apr 2018, 06:00.

    Comment


    • #3
      Dear Nick,

      Thank you very much for the help!

      Also thanks for pointing out my mistake in not correctly labelling variables.

      Kind regards,
      Mg

      Comment


      • #4
        Dear Nick,

        Thank you again for the help! It did take some time, but it gave the desired output.

        I have one more related query: If I were to just obtain the sum of consignments dispatched region -wise for every firm over last two years, how one would continue. I tried the following:

        rangerun (count) batchcode, int(Year -2 0) by(ID) use(Region)

        It didn't work. It keeps saying -- invalid name. Then, I tried the function rangestat. It works only for

        rangestat (count) batchcode, int(Year -2 0) by(ID)

        but not for

        rangestat (count) batchcode, int(Year -2 0) by(ID) use(Region)

        I am just curious, how one could use this function where there are more than one grouping variables in addition to time var? I will highly appreciate a guidance.

        Kind regards,
        Mg
        Last edited by Gupta Manish; 06 Apr 2018, 06:53.

        Comment


        • #5
          rangerun is a command, not a function. Your syntax mixes together elements of the syntax for rangestat and for rangerun without being legal for either. The first thing rangerun expects to see is always the name of a program, which is just about the first thing that its help tells you.

          That said, a direct application of rangestat appears to be what you want:

          Code:
          clear 
          input ID    Year    Batch_Code    Region
          1    1981    100001    101
          1    1981    100002    101
          1    1981    100003    102
          1    1981    100004    102
          1    1981    100005    102
          1    1981    100006    103
          1    1981    100007    103
          1    1981    100008    103
          1    1981    100009    104
          1    1981    100010    107
          1    1981    100011    110
          1    1981    100012    111
          1    1982    100013    115
          1    1982    100014    116
          1    1982    100015    101
          1    1982    100016    101
          1    1982    100017    102
          1    1982    100018    102
          1    1982    100019    102
          1    1982    100020    103
          1    1982    100021    103
          1    1982    100022    103
          1    1982    100023    104
          1    1982    100024    107
          1    1982    100025    110
          1    1982    100026    111
          1    1982    100027    115
          1    1982    100028    116
          1    1983    100029    116
          1    1983    100030    116
          1    1983    100031    116
          1    1983    100032    111
          1    1983    100033    111
          1    1983    100034    111
          1    1983    100035    101
          1    1983    100036    101
          1    1984    100037    101
          1    1984    100038    101
          1    1984    100039    102
          1    1984    100040    102
          1    1984    100041    102
          1    1984    100042    103
          1    1984    100043    103
          1    1984    100044    103
          1    1984    100045    104
          1    1984    100046    107
          1    1984    100047    110
          1    1984    100048    111
          1    1984    100049    115
          1    1984    100050    116
          1    1984    100051    101
          1    1984    100052    101
          1    1984    100053    116
          1    1984    100054    116
          1    1984    100055    116
          1    1984    100056    111
          1    1984    100057    111
          1    1984    100058    111
          1    1984    100059    101
          1    1984    100060    101
          1    1984    100061    104
          1    1984    100062    107
          1    1984    100063    110
          1    1984    100064    111
          1    1984    100065    115
          1    1984    100066    116
          1    1985    100067    101
          1    1985    100068    101
          1    1985    100069    116
          1    1985    100070    116
          1    1985    100071    116
          1    1985    100072    111
          1    1985    100073    102
          1    1985    100074    102
          1    1985    100075    102
          1    1985    100076    103
          1    1985    100077    103
          1    1985    100078    103
          1    1985    100079    104
          1    1985    100080    107
          1    1985    100081    110
          1    1985    100082    111
          1    1985    100083    115
          1    1985    100084    116
          1    1985    100085    116
          1    1985    100086    116
          1    1985    100087    116
          1    1985    100088    111
          1    1985    100089    111
          1    1985    100090    111
          1    1985    100091    101
          1    1985    100092    101
          1    1985    100093    101
          1    1985    100094    101
          1    1985    100095    102
          2    1981    100096    111
          2    1981    100097    101
          2    1981    100098    101
          2    1981    100099    104
          2    1981    100100    107
          2    1981    100101    110
          2    1981    100102    111
          2    1981    100103    115
          2    1981    100104    116
          2    1981    100105    101
          2    1981    100106    101
          2    1981    100107    116
          2    1982    100108    116
          2    1982    100109    116
          2    1982    100110    111
          2    1982    100111    102
          2    1982    100112    102
          2    1982    100113    102
          2    1982    100114    103
          2    1982    100115    103
          2    1982    100116    103
          2    1982    100117    104
          2    1982    100118    107
          2    1982    100119    110
          2    1982    100120    111
          2    1982    100121    115
          2    1982    100122    116
          2    1982    100123    116
          2    1983    100124    116
          2    1983    100125    116
          2    1983    100126    111
          2    1983    100127    111
          2    1983    100128    111
          2    1983    100129    101
          2    1983    100130    101
          2    1983    100131    101
          2    1984    100132    111
          2    1984    100133    101
          2    1984    100134    101
          2    1984    100135    104
          2    1984    100136    107
          2    1984    100137    110
          2    1984    100138    111
          2    1984    100139    115
          2    1984    100140    116
          2    1984    100141    101
          2    1984    100142    101
          2    1984    100143    116
          2    1984    100144    116
          2    1984    100145    116
          2    1984    100146    111
          2    1984    100147    102
          2    1984    100148    102
          2    1984    100149    102
          2    1984    100150    103
          2    1984    100151    103
          2    1984    100152    103
          2    1984    100153    104
          2    1984    100154    107
          2    1984    100155    110
          2    1984    100156    111
          2    1984    100157    115
          2    1984    100158    116
          2    1984    100159    116
          2    1984    100160    116
          2    1984    100161    116
          2    1985    100162    111
          2    1985    100163    111
          2    1985    100164    111
          2    1985    100165    101
          2    1985    100166    101
          2    1985    100167    101
          2    1985    100168    111
          2    1985    100169    101
          2    1985    100170    101
          2    1985    100171    104
          2    1985    100172    107
          2    1985    100173    110
          2    1985    100174    111
          2    1985    100175    115
          2    1985    100176    116
          2    1985    100177    101
          2    1985    100178    101
          2    1985    100179    116
          2    1985    100180    116
          2    1985    100181    116
          2    1985    100182    111
          2    1985    100183    102
          2    1985    100184    102
          2    1985    100185    102
          2    1985    100186    103
          2    1985    100187    103
          2    1985    100188    103
          2    1985    100189    104
          2    1985    100190    107
          3    1985    100191    110
          3    1985    100192    111
          3    1985    100193    115
          3    1985    100194    116
          3    1985    100195    116
          3    1985    100196    116
          3    1985    100197    116
          3    1985    100198    111
          3    1985    100199    111
          3    1985    100200    111
          3    1985    100201    101
          3    1985    100202    101
          3    1985    100203    101
          3    1985    100204    111
          3    1985    100205    101
          3    1985    100206    101
          3    1985    100207    104
          3    1985    100208    107
          3    1985    100209    110
          3    1985    100210    111
          3    1985    100211    115
          3    1985    100212    116
          3    1985    100213    101
          3    1985    100214    101
          3    1985    100215    116
          3    1985    100216    116
          3    1985    100217    116
          3    1985    100218    111
          3    1985    100219    102
          3    1985    100220    102
          3    1985    100221    102
          3    1985    100222    103
          3    1985    100223    103
          3    1985    100224    103
          3    1985    100225    104
          3    1985    100226    107
          3    1985    100227    110
          3    1985    100228    111
          3    1985    100229    115
          3    1985    100230    116
          3    1985    100231    116
          3    1985    100232    116
          3    1985    100233    116
          3    1985    100234    111
          3    1985    100235    111
          3    1985    100236    111
          3    1985    100237    101
          3    1985    100238    101
          3    1985    100239    101
          3    1985    100240    111
          3    1985    100241    101
          3    1985    100242    101
          3    1985    100243    104
          3    1985    100244    107
          3    1985    100245    110
          3    1985    100246    111
          3    1985    100247    115
          3    1985    100248    116
          3    1985    100249    101
          3    1985    100250    111
          3    1985    100251    115
          3    1985    100252    116
          3    1985    100253    101
          3    1985    100254    101
          3    1985    100255    116
          3    1985    100256    116
          3    1985    100257    116
          3    1985    100258    111
          3    1985    100259    102
          3    1985    100260    102
          3    1985    100261    102
          3    1985    100262    103
          3    1985    100263    103
          3    1985    100264    103
          3    1985    100265    104
          3    1985    100266    107
          3    1985    100267    110
          3    1985    100268    111
          3    1985    100269    115
          3    1985    100270    116
          3    1985    100271    116
          3    1985    100272    116
          3    1985    100273    116
          3    1985    100274    111
          3    1985    100275    111
          3    1985    100276    111
          3    1985    100277    116
          3    1985    100278    115
          end 
          
          rangestat (count) Batch_Code, by(ID) int(Year -2 0) 
          
          tabdisp ID Year, c(Batch_Code_count)
          
          ----------------------------------------
                    |             Year            
                 ID | 1981  1982  1983  1984  1985
          ----------+-----------------------------
                  1 |   12    28    36    54    67
                  2 |   12    28    36    54    67
                  3 |                           88
          ----------------------------------------
          It seems a little odd that the results for ID 1 and 2 are identical, but it's a fact about the example data.

          Comment


          • #6
            Thank you very much, Nick.

            You are simply divine!

            I just tried:

            rangestat (count) Batch_Code, by(ID Region) int(Year -2 0)

            That is, putting two grouping variables in by() option. Is it logical?

            It did provide an output which I am to check now.

            Kind regards,
            Mg

            Comment


            • #7
              It's a different question, but could be what you want.

              Comment


              • #8
                Yes, it is. But on a quick check, it works. Thank you!

                Comment


                • #9
                  Originally posted by Nick Cox View Post

                  That said, missing values would need more care.
                  Hello,

                  I found the code in #2 very useful.
                  Code:
                  program myherf0
                   local nobs = _N
                   bysort IPC1v: gen result0 = (_N/`nobs')^2 if _n == 1  
                   sum result0, meanonly
                   replace result0 = r(sum)
                   end
                  rangerun myherf, int(Year 0 0) by(ID) use(Region)
                  Is there a way to use it if values are missing? (in this case in var "Region").
                  I understand that _n counts all obs, is there a way to modify the code such that missing values in Region will not be counted?

                  Thank you.
                  Last edited by Noa Fogel; 04 Dec 2020, 13:19.

                  Comment


                  • #10
                    Possibly

                    Code:
                    rangerun myherf0 if !missing(Region), int(Year 0 0) by(ID) use(Region)

                    Comment


                    • #11
                      Thank you very much, this worked.
                      (I just realized I pasted the rangerun from the original code and a slightly modified program-code above it, my apologies. Thank you for the solution).

                      Comment

                      Working...
                      X