Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Moving average by groups

    Dear all,

    I'm using Stata 13.1 in Windows 7
    Is there any way to calculate the average of sales (sale) growth for the five years prior to the focal year (fyear) for each sector (sic), which includes all firms (gvkey) in this sector?

    I'm using the natural log of sales rather than raw sales
    I used the following codes:
    Code:
    sort gvkey fyear
    bys gvkey: gen logsale= ln(sale)
    bys gvkey: gen salegrow = (logsale-L.logsale)/L.logsale
    bysort sic fyear: egen sum = sum(salegrow)
    bys sic fyear: egen avrg = filter(sum), coef(1 1 1 1 1) lags(-1/-5) normalise
    However, the last code returns wrong message that "by" is not allowed
    I also tried:
    Code:
    bys sic fyear: gen avrg= mean(L.sum + L2.sum + L3.sum + L4.sum + L5.sum)
     and bys sic fyear: gen avrg= sum(L.sum + L2.sum + L3.sum + L4.sum + L5.sum)/5
    But this returns message that the data is not sorted

    Many thanks in advance

  • #2
    -rangestat-,written by Robert Picard, Roberto Ferrer, and Nick Cox, available from SSC (-ssc install rangestat-) can do this. Although newly written, it will run under version 13.1.

    Comment


    • #3
      Note that

      Code:
      bys sic fyear: gen avrg= mean(L.sum + L2.sum + L3.sum + L4.sum + L5.sum)
      isn't legal; it's just fantasy syntax as there is no function mean() that will work with generate. (In any case, the mean of a sum would just be that sum.)

      In your calls to sum() you're getting, confused as the function sum() gives a cumulative sum, not a sum observation-wise across its arguments.

      Note that calls to time series operators such as L. depend on your tsset or xtset settings, which are presumably in terms of gvkey fyear. They aren't operators that give "previous" values ad hoc given a by: framework.

      We don't have a data example to work on here.

      The egen function filter() is from egenmore (SSC), as you are asked to explain.

      So far, so problematic. But I copied an example from a previous thread of yours and invented some of the variables.

      As Clyde rightly suggests, you need to cut free of time series operators as you are pooling panels. rangestat (SSC) can do that. I think this is closer to what you want.


      Code:
      * http://www.statalist.org/forums/forum/general-stata-discussion/general/1339046-counting-number-of-occurrence
      
      clear
      input long gvkey double fyear float salegrow
      1004 2000 1.53331
      1004 2004 1.4996146
      1004 2005 .7088411
      1004 2010 .3146414
      1034 2001 .6643823
      1034 2004 1.438849
      1177 2006 1.4590694
      1177 2008 10.963183
      1230 2005 .010626324
      1230 2006 .003674926
      1240 2003 3.358209
      1327 2000 .9239014
      1327 2008 .8595901
      1356 2010 .443017
      1380 2007 .03508391
      1408 2004 .10378818
      1410 2000 .45337495
      1410 2001 .6643804
      1410 2005 3.181818
      1410 2006 45.97728
      1447 2000 .437056
      1447 2003 1.8477948
      1447 2005 1.442142
      1598 2001 .22388804
      1598 2005 .20615706
      1598 2006 .32949865
      1598 2010 .5923743
      1602 2003 3.2002046
      1602 2004 .4901359
      1602 2009 3.426707
      1602 2010 .6736618
      1632 2000 .5291719
      1632 2001 1.45055
      1632 2006 5.993407
      1632 2007 4.4055
      1655 2001 0
      1655 2006 .031516187
      1659 2003 .7550189
      1661 2003 .2155679
      1661 2004 .1606082
      1661 2009 4.1167808
      1663 2001 1.0567801
      1663 2003 .8047867
      1677 2002 2.3654432
      1678 2001 .2394754
      1678 2007 .07381995
      1678 2008 .16232003
      1686 2006 1.6086416
      1686 2010 17.664667
      end
      
      gen sic = gvkey < 1400
      
      rangestat mean = salegrow, interval(fyear -5 -1) by(sic)
      sort sic fyear gvkey
      list, sepby(sic fyear)
      
           +--------------------------------------------+
           | gvkey   fyear   salegrow   sic        mean |
           |--------------------------------------------|
        1. |  1410    2000    .453375     0           . |
        2. |  1447    2000    .437056     0           . |
        3. |  1632    2000   .5291719     0           . |
           |--------------------------------------------|
        4. |  1410    2001   .6643804     0   .47320095 |
        5. |  1598    2001    .223888     0   .47320095 |
        6. |  1632    2001    1.45055     0   .47320095 |
        7. |  1655    2001          0     0   .47320095 |
        8. |  1663    2001    1.05678     0   .47320095 |
        9. |  1678    2001   .2394754     0   .47320095 |
           |--------------------------------------------|
       10. |  1677    2002   2.365443     0   .56163075 |
           |--------------------------------------------|
       11. |  1447    2003   1.847795     0   .74201199 |
       12. |  1602    2003   3.200205     0   .74201199 |
       13. |  1659    2003   .7550189     0   .74201199 |
       14. |  1661    2003   .2155679     0   .74201199 |
       15. |  1663    2003   .8047867     0   .74201199 |
           |--------------------------------------------|
       16. |  1408    2004   .1037882     0   .94956619 |
       17. |  1602    2004   .4901359     0   .94956619 |
       18. |  1661    2004   .1606082     0   .94956619 |
           |--------------------------------------------|
       19. |  1410    2005   3.181818     0   .83322362 |
       20. |  1447    2005   1.442142     0   .83322362 |
       21. |  1598    2005   .2061571     0   .83322362 |
           |--------------------------------------------|
       22. |  1410    2006   45.97728     0   1.0226966 |
       23. |  1598    2006   .3294986     0   1.0226966 |
       24. |  1632    2006   5.993407     0   1.0226966 |
       25. |  1655    2006   .0315162     0   1.0226966 |
       26. |  1686    2006   1.608642     0   1.0226966 |
           |--------------------------------------------|
       27. |  1632    2007     4.4055     0   4.0419887 |
       28. |  1678    2007     .07382     0   4.0419887 |
           |--------------------------------------------|
       29. |  1678    2008     .16232     0   3.9348714 |
           |--------------------------------------------|
       30. |  1602    2009   3.426707     0   4.5833309 |
       31. |  1661    2009   4.116781     0   4.5833309 |
           |--------------------------------------------|
       32. |  1598    2010   .5923743     0   5.4581221 |
       33. |  1602    2010   .6736618     0   5.4581221 |
       34. |  1686    2010   17.66467     0   5.4581221 |
           |--------------------------------------------|
       35. |  1004    2000    1.53331     1           . |
       36. |  1327    2000   .9239014     1           . |
           |--------------------------------------------|
       37. |  1034    2001   .6643823     1   1.2286057 |
           |--------------------------------------------|
       38. |  1240    2003   3.358209     1   1.0405312 |
           |--------------------------------------------|
       39. |  1004    2004   1.499615     1   1.6199507 |
       40. |  1034    2004   1.438849     1   1.6199507 |
           |--------------------------------------------|
       41. |  1004    2005   .7088411     1    1.569711 |
       42. |  1230    2005   .0106263     1    1.569711 |
           |--------------------------------------------|
       43. |  1177    2006   1.459069     1    1.280087 |
       44. |  1230    2006   .0036749     1    1.280087 |
           |--------------------------------------------|
       45. |  1380    2007   .0350839     1   1.2112692 |
           |--------------------------------------------|
       46. |  1177    2008   10.96318     1    1.064246 |
       47. |  1327    2008   .8595901     1    1.064246 |
           |--------------------------------------------|
       48. |  1004    2010   .3146414     1   2.0057242 |
       49. |  1356    2010    .443017     1   2.0057242 |
           +--------------------------------------------+
      Last edited by Nick Cox; 20 May 2016, 02:52.

      Comment


      • #4
        Many thanks Clyde and Nick
        Nick, you are right, I’m using xtset gvkey fyear
        I’ve tried to use the code as advised. However, it shows the mean of years 2001-2005 although there is a requirement that the average of sales growth in each sector should be calculated using sales growth in this sector in the last five years before the focal year. So, the average of sales growth in 2005 should be the average sales growth of the sector in prior 5 years (i.e., 200, 2001, 2002, 2003, and 2004), which is not the case.
        What I want is:
        1. Calculate average sales growth in each sector and in each year by averaging sales growth of all firms belong to this sector.
        2. Calculate the five-year average in sales growth in each sector in the focal year by using the five years sales growth in this sector before the focal year.
        Many thanks

        Comment


        • #5
          Mohamed: You make a strong statement but don't back it up. Here is an independently produced average for a sector:

          Code:
           
          . su saleg if sic == 0 & inrange(fyear, 2000,2004) 
          
              Variable |        Obs        Mean    Std. Dev.       Min        Max
          -------------+---------------------------------------------------------
              salegrow |         18    .8332236    .8657407          0   3.200205
          0.833.... is the value shown for 2005 for sic 0 in my previous.



          Comment


          • #6
            Many thanks Nick and sorry for my wrong conclusion.
            I really appreciate the help from all people in this forum and the valuable time they spend to help others. Much appreciated!

            Comment

            Working...
            X