Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating quarterly data and geting the average of quarterly data

    Dear all,
    I am trying to work on quarterly data. I am facing issue with
    !- how to create quarterly date. Although I have created quarterly date as you see in the sample of may dataset below. but an not sure whether its correct or not.
    So, I need the exacte code for this task.

    HTML Code:
    input long code str10 endingdateofstatistics double sales float(sl wanted) double interestexpenses
    1 "2002-03-31"   595998313  2118441344 5143   494251206
    1 "2002-06-30"  1315435885   595998336 5173  1057705003
    1 "2002-09-30"  2120786642  1315435904 5204  1763105625
    1 "2002-12-31"  3077098847  2120786688 5235  2346094441
    1 "2003-03-31"   649968516  3077098752 5265   671811790
    1 "2003-06-30"  1428220006   649968512 5295  1342495241
    1 "2003-09-30"  2734822115  1428220032 5326  1320285705
    1 "2003-12-31"  3128836264  2734822144 5356  2730921859
    1 "2004-03-31"  1043003427  3128836352 5387   836718769
    1 "2004-06-30"  2181888038  1043003456 5417  1829149395
    1 "2004-09-30"  3243275024  2181888000 5448  2731005816
    1 "2004-12-31"  4480810798  3243275008 5478  3683960935
    1 "2005-03-31"  1087591314  4480811008 5508   848176037
    1 "2005-06-30"  2258465172  1087591296 5539  1856542363
    1 "2005-09-30"  3431091962  2258465280 5569  2776622010
    1 "2005-12-31"  4557763312  3431091968 5600  3746666810
    1 "2006-03-31"  1262157098  4557763072 5630  1121451048
    1 "2006-06-30"  3020095556  1262157056 5660  2376141819
    1 "2006-09-30"  4878583091  3020095488 5691  3732502499
    1 "2006-12-31"  6788985745  4878583296 5722  5068772930
    1 "2007-03-31"  2244966243  6788985856 5752  1438988613
    1 "2007-06-30"  4711940000  2244966144 5782  3187131000
    1 "2007-09-30"  7363680000  4711940096 5813  5491449000
    1 "2007-12-31" 10291531000  7363680256 5843  8438051000
    1 "2008-03-31"  3366156000 10291530752 5874  3288918000
    1 "2008-06-30"  6797707000  3366156032 5904  6892908000
    1 "2008-09-30" 10240882000  6797706752 5935 10440326000
    1 "2008-12-31" 13563220000 10240881664 5965 13867376000
    1 "2009-03-31"  3523844000 13563219968 5995  2573409000
    1 "2009-06-30"  6902083000  3523844096 6026  4807758000
    1 "2009-09-30" 10434314000  6902083072 6056  6859029000
    1 "2009-12-31" 14293863000 10434314240 6087  9001138000
    1 "2010-03-31"  3956690000 14293863424 6117  2324413000
    1 "2010-06-30"  8232540000  3956689920 6147  4770383000
    1 "2010-09-30" 12714440000  8232540160 6178  7352523000
    1 "2010-12-31" 17561832000 12714439680 6209 10422598000
    1 "2011-03-31"  5617613000 17561831424 6239  4115507000
    1 "2011-06-30" 11644324000  5617612800 6269  9148239000
    1 "2011-09-30" 20204827000 11644323840 6300 17318474000
    1 "2011-12-31" 29087326000 20204826624 6330 27040923000
    1 "2012-03-31"  9434413000 29087326208 6361 10411417000
    1 "2012-06-30" 19001393000  9434413056 6391 21070247000
    1 "2012-09-30" 28794838000 19001393152 6422 31614058000
    1 "2012-12-31" 38911398000 28794836992 6452 41578323000
    1 "2013-03-31"  1.0526e+10 38911397888 6482  1.1539e+10
    1 "2013-06-30"  2.2969e+10  1.0526e+10 6513  2.4698e+10
    1 "2013-09-30"  3.6564e+10  2.2969e+10 6543  3.8431e+10
    1 "2013-12-31"  5.1294e+10  3.6564e+10 6574  5.2414e+10
    1 "2014-03-31"   1.480e+10  5.1294e+10 6604  1.6354e+10
    1 "2014-06-30"  3.2412e+10    1.48e+10 6634  3.2396e+10
    1 "2014-09-30"  5.1372e+10  3.2412e+10 6665  4.9152e+10
    1 "2014-12-31"  7.0637e+10  5.1372e+10 6696  6.6156e+10
    1 "2015-03-31"  2.0563e+10  7.0637e+10 6726   1.732e+10
    1 "2015-06-30"  4.4901e+10  2.0563e+10 6756  3.4746e+10
    1 "2015-09-30"  6.8545e+10  4.4901e+10 6787  5.1019e+10
    1 "2015-12-31"  9.2705e+10  6.8545e+10 6817   6.555e+10
    1 "2016-03-31"   2.657e+10  9.2705e+10 6848  1.3961e+10
    1 "2016-06-30"  5.2718e+10   2.657e+10 6878  2.7372e+10
    1 "2016-09-30"  7.8824e+10  5.2718e+10 6909  4.1012e+10
    1 "2016-12-31" 1.04416e+11  7.8824e+10 6939  5.4708e+10
    1 "2017-03-31"  2.7081e+10 1.04416e+11 6969  1.6176e+10
    1 "2017-06-30"  5.3204e+10  2.7081e+10 7000  3.4287e+10
    1 "2017-09-30"  7.8773e+10  5.3204e+10 7030  5.3393e+10
    1 "2017-12-31" 1.04869e+11  7.8773e+10 7061  7.4059e+10
    1 "2018-03-31"  2.7357e+10 1.04869e+11 7091  2.2257e+10
    1 "2018-06-30"  5.5484e+10  2.7357e+10 7121  4.4572e+10
    1 "2018-09-30"  7.8378e+10  5.5484e+10 7152  6.7132e+10
    1 "2018-12-31" 1.06212e+11  7.8378e+10 7183  8.8143e+10
    1 "2019-03-31"  3.0351e+10 1.06212e+11 7213  2.1887e+10
    1 "2019-06-30"  6.2069e+10  3.0351e+10 7243  4.3472e+10
    1 "2019-09-30"  9.4166e+10  6.2069e+10 7274  6.5544e+10
    1 "2019-12-31" 1.26814e+11  9.4166e+10 7304  8.7588e+10
    1 "2020-03-31"  3.4483e+10 1.26814e+11 7335  2.3107e+10
    1 "2020-06-30"  7.0071e+10  3.4483e+10 7365  4.4681e+10
    1 "2020-09-30" 1.07455e+11  7.0071e+10 7396  6.5627e+10
    1 "2020-12-31" 1.43242e+11 1.07455e+11 7426  8.7537e+10
    1 "2021-03-31"  3.8271e+10 1.43242e+11 7456  2.2308e+10
    1 "2021-06-30"  7.6835e+10  3.8271e+10 7487  4.5471e+10
    1 "2021-09-30" 1.15108e+11  7.6835e+10 7517  6.9126e+10
    1 "2021-12-31" 1.53503e+11 1.15108e+11 7548   9.320e+10
    1 "2022-03-31"  4.0852e+10 1.53503e+11 7578  2.4389e+10
    1 "2022-06-30"   8.019e+10  4.0852e+10 7608   4.873e+10
    1 "2022-09-30" 1.19851e+11   8.019e+10 7639  7.3142e+10
    1 "2022-12-31" 1.60469e+11 1.19851e+11 7670  9.8748e+10
    2- when I want to get the mean of sales based on firm id and quarters I couldn't
    Can any one give the code for this task. for example I need to find the mean of sales in this data based on quarters.


    3- after i get the average sales I need to keep only one observation for per firm. please I need the code . but each firm should has the average sales I calculate in the step 2 above

  • #2
    With your data example (thanks, except please use # not <> in the forum software to show CODE) I did this:

    Code:
    . gen qdate = qofd(daily(endingdateofstatistics, "YMD"))
    
    . format qdate %tq
    
    .
    . format wanted %td
    
    .
    . list qdate wanted
    
         +--------------------+
         |  qdate      wanted |
         |--------------------|
      1. | 2002q1   30jan1974 |
      2. | 2002q2   01mar1974 |
      3. | 2002q3   01apr1974 |
      4. | 2002q4   02may1974 |
      5. | 2003q1   01jun1974 |
         |--------------------|
      6. | 2003q2   01jul1974 |
      7. | 2003q3   01aug1974 |
      8. | 2003q4   31aug1974 |
      9. | 2004q1   01oct1974 |
     10. | 2004q2   31oct1974 |
         |--------------------|
     11. | 2004q3   01dec1974 |
     12. | 2004q4   31dec1974 |
     13. | 2005q1   30jan1975 |
     14. | 2005q2   02mar1975 |
     15. | 2005q3   01apr1975 |
         |--------------------|
     16. | 2005q4   02may1975 |
     17. | 2006q1   01jun1975 |
     18. | 2006q2   01jul1975 |
     19. | 2006q3   01aug1975 |
     20. | 2006q4   01sep1975 |
         |--------------------|
     21. | 2007q1   01oct1975 |
     22. | 2007q2   31oct1975 |
     23. | 2007q3   01dec1975 |
     24. | 2007q4   31dec1975 |
     25. | 2008q1   31jan1976 |
         |--------------------|
     26. | 2008q2   01mar1976 |
     27. | 2008q3   01apr1976 |
     28. | 2008q4   01may1976 |
     29. | 2009q1   31may1976 |
     30. | 2009q2   01jul1976 |
         |--------------------|
     31. | 2009q3   31jul1976 |
     32. | 2009q4   31aug1976 |
     33. | 2010q1   30sep1976 |
     34. | 2010q2   30oct1976 |
     35. | 2010q3   30nov1976 |
         |--------------------|
     36. | 2010q4   31dec1976 |
     37. | 2011q1   30jan1977 |
     38. | 2011q2   01mar1977 |
     39. | 2011q3   01apr1977 |
     40. | 2011q4   01may1977 |
         |--------------------|
     41. | 2012q1   01jun1977 |
     42. | 2012q2   01jul1977 |
     43. | 2012q3   01aug1977 |
     44. | 2012q4   31aug1977 |
     45. | 2013q1   30sep1977 |
         |--------------------|
     46. | 2013q2   31oct1977 |
     47. | 2013q3   30nov1977 |
     48. | 2013q4   31dec1977 |
     49. | 2014q1   30jan1978 |
     50. | 2014q2   01mar1978 |
         |--------------------|
     51. | 2014q3   01apr1978 |
     52. | 2014q4   02may1978 |
     53. | 2015q1   01jun1978 |
     54. | 2015q2   01jul1978 |
     55. | 2015q3   01aug1978 |
         |--------------------|
     56. | 2015q4   31aug1978 |
     57. | 2016q1   01oct1978 |
     58. | 2016q2   31oct1978 |
     59. | 2016q3   01dec1978 |
     60. | 2016q4   31dec1978 |
         |--------------------|
     61. | 2017q1   30jan1979 |
     62. | 2017q2   02mar1979 |
     63. | 2017q3   01apr1979 |
     64. | 2017q4   02may1979 |
     65. | 2018q1   01jun1979 |
         |--------------------|
     66. | 2018q2   01jul1979 |
     67. | 2018q3   01aug1979 |
     68. | 2018q4   01sep1979 |
     69. | 2019q1   01oct1979 |
     70. | 2019q2   31oct1979 |
         |--------------------|
     71. | 2019q3   01dec1979 |
     72. | 2019q4   31dec1979 |
     73. | 2020q1   31jan1980 |
     74. | 2020q2   01mar1980 |
     75. | 2020q3   01apr1980 |
         |--------------------|
     76. | 2020q4   01may1980 |
     77. | 2021q1   31may1980 |
     78. | 2021q2   01jul1980 |
     79. | 2021q3   31jul1980 |
     80. | 2021q4   31aug1980 |
         |--------------------|
     81. | 2022q1   30sep1980 |
     82. | 2022q2   30oct1980 |
     83. | 2022q3   30nov1980 |
     84. | 2022q4   31dec1980 |
         +--------------------+
    So, to get a quarterly date, you need to pull a quarterly date out of a daily date that is presumably the last day in each quarter.

    You don't say how you produced wanted, but as you fear it is wrong. The numbers seem to jump by 30 or 31 each time.

    Your other questions seem to need a call to collapse.

    Comment


    • #3
      Originally posted by Nick Cox View Post
      With your data example (thanks, except please use # not <> in the forum software to show CODE) I did this:

      Code:
      . gen qdate = qofd(daily(endingdateofstatistics, "YMD"))
      
      . format qdate %tq
      
      .
      . format wanted %td
      
      .
      . list qdate wanted
      
      +--------------------+
      | qdate wanted |
      |--------------------|
      1. | 2002q1 30jan1974 |
      2. | 2002q2 01mar1974 |
      3. | 2002q3 01apr1974 |
      4. | 2002q4 02may1974 |
      5. | 2003q1 01jun1974 |
      |--------------------|
      6. | 2003q2 01jul1974 |
      7. | 2003q3 01aug1974 |
      8. | 2003q4 31aug1974 |
      9. | 2004q1 01oct1974 |
      10. | 2004q2 31oct1974 |
      |--------------------|
      11. | 2004q3 01dec1974 |
      12. | 2004q4 31dec1974 |
      13. | 2005q1 30jan1975 |
      14. | 2005q2 02mar1975 |
      15. | 2005q3 01apr1975 |
      |--------------------|
      16. | 2005q4 02may1975 |
      17. | 2006q1 01jun1975 |
      18. | 2006q2 01jul1975 |
      19. | 2006q3 01aug1975 |
      20. | 2006q4 01sep1975 |
      |--------------------|
      21. | 2007q1 01oct1975 |
      22. | 2007q2 31oct1975 |
      23. | 2007q3 01dec1975 |
      24. | 2007q4 31dec1975 |
      25. | 2008q1 31jan1976 |
      |--------------------|
      26. | 2008q2 01mar1976 |
      27. | 2008q3 01apr1976 |
      28. | 2008q4 01may1976 |
      29. | 2009q1 31may1976 |
      30. | 2009q2 01jul1976 |
      |--------------------|
      31. | 2009q3 31jul1976 |
      32. | 2009q4 31aug1976 |
      33. | 2010q1 30sep1976 |
      34. | 2010q2 30oct1976 |
      35. | 2010q3 30nov1976 |
      |--------------------|
      36. | 2010q4 31dec1976 |
      37. | 2011q1 30jan1977 |
      38. | 2011q2 01mar1977 |
      39. | 2011q3 01apr1977 |
      40. | 2011q4 01may1977 |
      |--------------------|
      41. | 2012q1 01jun1977 |
      42. | 2012q2 01jul1977 |
      43. | 2012q3 01aug1977 |
      44. | 2012q4 31aug1977 |
      45. | 2013q1 30sep1977 |
      |--------------------|
      46. | 2013q2 31oct1977 |
      47. | 2013q3 30nov1977 |
      48. | 2013q4 31dec1977 |
      49. | 2014q1 30jan1978 |
      50. | 2014q2 01mar1978 |
      |--------------------|
      51. | 2014q3 01apr1978 |
      52. | 2014q4 02may1978 |
      53. | 2015q1 01jun1978 |
      54. | 2015q2 01jul1978 |
      55. | 2015q3 01aug1978 |
      |--------------------|
      56. | 2015q4 31aug1978 |
      57. | 2016q1 01oct1978 |
      58. | 2016q2 31oct1978 |
      59. | 2016q3 01dec1978 |
      60. | 2016q4 31dec1978 |
      |--------------------|
      61. | 2017q1 30jan1979 |
      62. | 2017q2 02mar1979 |
      63. | 2017q3 01apr1979 |
      64. | 2017q4 02may1979 |
      65. | 2018q1 01jun1979 |
      |--------------------|
      66. | 2018q2 01jul1979 |
      67. | 2018q3 01aug1979 |
      68. | 2018q4 01sep1979 |
      69. | 2019q1 01oct1979 |
      70. | 2019q2 31oct1979 |
      |--------------------|
      71. | 2019q3 01dec1979 |
      72. | 2019q4 31dec1979 |
      73. | 2020q1 31jan1980 |
      74. | 2020q2 01mar1980 |
      75. | 2020q3 01apr1980 |
      |--------------------|
      76. | 2020q4 01may1980 |
      77. | 2021q1 31may1980 |
      78. | 2021q2 01jul1980 |
      79. | 2021q3 31jul1980 |
      80. | 2021q4 31aug1980 |
      |--------------------|
      81. | 2022q1 30sep1980 |
      82. | 2022q2 30oct1980 |
      83. | 2022q3 30nov1980 |
      84. | 2022q4 31dec1980 |
      +--------------------+
      So, to get a quarterly date, you need to pull a quarterly date out of a daily date that is presumably the last day in each quarter.


      You don't say how you produced wanted, but as you fear it is wrong. The numbers seem to jump by 30 or 31 each time.



      Your other questions seem to need a call to collapse.

      Thank very much for your reply

      Here is how I created quarterly data
      HTML Code:
      gen wanted = qofd(dofm(daily( endingdateofstatistics, "YMD")))
      format wanted %tq
      One more think
      Do you think I should make the date of the last day constant like either 31 or 30?
      I actually used the real date of the database for example March ends on 31st of the month while June ends on 30th
      So, please should I change ?
      if yes, Is it possible to do it through a code?

      Lastly, for the other question, I did not understand what do you mean by call to Collapse please ?
      do you mean ask separate question ?

      thank you very much

      Comment


      • #4
        On your first question: I showed how to get a quarterly date. I can't see that you need anything else in terms of dates, except to work out why your code was wrong.

        Here is a old and simple debugging trick. Try code out calculator-style and make heavy use of display or di using specific examples you can check.

        This dialog(ue) shows how to get where you want to be:

        Code:
        . * we have a daily date, but it's string 
        
        . di "2002-03-31"
        2002-03-31
        
        . * so we push it through daily() (or (date))
        
        . di daily("2002-03-31", "YMD") 
        15430
        
        . 
        . * did that work? check with daily date display format 
        
        . di %td daily("2002-03-31", "YMD") 
        31mar2002
        
        . * we need a quarterly date, so push it through qofd()
        
        . di qofd(daily("2002-03-31", "YMD"))
        168
        
        . * did that work? check with quarter date display format
        
        . di %tq qofd(daily("2002-03-31", "YMD")) 
        2002q1
        You had qofd(dofm(daily())) but the dofm() call messes up the calculation. daily() yields a daily date, but dofm() extracts a daily date from a monthly date. It's unneeded, and indeed it's wrong.

        On the other questions, the suggestion was to look at the
        collapse command. See my use of fonts there, as in the Stata documentation.

        Code:
        help collapse

        Comment


        • #5
          Originally posted by Nick Cox View Post
          On your first question: I showed how to get a quarterly date. I can't see that you need anything else in terms of dates, except to work out why your code was wrong.

          Here is a old and simple debugging trick. Try code out calculator-style and make heavy use of display or di using specific examples you can check.

          This dialog(ue) shows how to get where you want to be:

          Code:
          . * we have a daily date, but it's string
          
          . di "2002-03-31"
          2002-03-31
          
          . * so we push it through daily() (or (date))
          
          . di daily("2002-03-31", "YMD")
          15430
          
          .
          . * did that work? check with daily date display format
          
          . di %td daily("2002-03-31", "YMD")
          31mar2002
          
          . * we need a quarterly date, so push it through qofd()
          
          . di qofd(daily("2002-03-31", "YMD"))
          168
          
          . * did that work? check with quarter date display format
          
          . di %tq qofd(daily("2002-03-31", "YMD"))
          2002q1
          You had qofd(dofm(daily())) but the dofm() call messes up the calculation. daily() yields a daily date, but dofm() extracts a daily date from a monthly date. It's unneeded, and indeed it's wrong.

          On the other questions, the suggestion was to look at the
          collapse command. See my use of fonts there, as in the Stata documentation.

          Code:
          help collapse
          Yes yes your command worked very well.
          Thank you very much

          I will read the help's details of collapse

          Comment


          • #6
            Dear Nick Cox
            I know I amy disturb you but I need to understand it well
            what I need is
            as you know each company has four qaurters per year. I need to calculate the mean of these four qaurters per year and for each company in my dataset.
            after that I well keep only one observation for each company in each year.
            For example if I have 100 companies for 5 years. using quarterly data I will have 2000 observations. when I finish calculating the mean for a specific variable.
            I need to have only 500 observation ( 100*5).
            So, could you please help me

            I have tried to calculate the mean using the following
            HTML Code:
            ge = ( var[_n-4]+var[_n-3]+ var [_n-2]+ var [_n-2])/4
            I got inccorect means.
            it is like the stata does not recognize or sort each company at one single year.
            Then i try to use the command of mean sorted by ID and qdate.
            HTML Code:
            bysort code qdate : egen varmean=mean( var)
            I faced a problem that each quarter be considered as separated time.

            please help
            Best

            Comment


            • #7
              Code:
               
               ( var[_n-4]+var[_n-3]+ var [_n-2]+ var [_n-2])/4
              corrected to
              Code:
                 
               (var[_n-4]+var[_n-3]+ var[_n-2]+ var[_n-1])/4
              gives you a running mean and won't pay any attention to companies. If you have quarterly values for each company and quarter, then getting means over companies and years is another collapse using a tear
              Code:
              gen year = yofd(dofq(qdate)) 
              and company identifier.

              Comment


              • #8
                Originally posted by Nick Cox View Post
                Code:
                ( var[_n-4]+var[_n-3]+ var [_n-2]+ var [_n-2])/4
                corrected to
                Code:
                (var[_n-4]+var[_n-3]+ var[_n-2]+ var[_n-1])/4
                gives you a running mean and won't pay any attention to companies. If you have quarterly values for each company and quarter, then getting means over companies and years is another collapse using a tear
                Code:
                gen year = yofd(dofq(qdate)) 
                and company identifier.
                Thank you very much now it works well

                Comment

                Working...
                X