  • How can I calculate the variance of every six days (or twelve days) by groups?

    As the title says, my variables are TradingDate, stock_code, stock_name and Close_Price.

    How can I calculate the variance of Close_Price over every 6 days, separately for each stock_code?

    Thanks for everyone's time.

  • #2
    Please read https://www.statalist.org/forums/help#stata and give us a good data example.



    • #3
      Originally posted by Nick Cox
      Please read https://www.statalist.org/forums/help#stata and give us a good data example.
      Hi Sir, I am very sorry about my inappropriate post. Here are my data. You can ignore the Chinese characters, which are just stock names, and the variable 'Filling' can be ignored too.


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str12 TradingDate str6 stkcd str13 ShortName byte Filling double ClosePrice
      "2009-01-01" "000001" "深发展A" 2 2.983
      "2009-01-02" "000001" "深发展A" 2 2.983
      "2009-01-05" "000001" "深发展A" 0 3.062
      "2009-01-06" "000001" "深发展A" 0 3.248
      "2009-01-07" "000001" "深发展A" 0  3.15
      "2009-01-08" "000001" "深发展A" 0 3.027
      "2009-01-09" "000001" "深发展A" 0 3.106
      "2009-01-12" "000001" "深发展A" 0 3.109
      "2009-01-13" "000001" "深发展A" 0 2.986
      "2009-01-14" "000001" "深发展A" 0 3.216
      "2009-01-15" "000001" "深发展A" 0 3.248
      "2009-01-16" "000001" "深发展A" 0 3.348
      "2009-01-19" "000001" "深发展A" 0 3.503
      "2009-01-20" "000001" "深发展A" 0 3.582
      "2009-01-21" "000001" "深发展A" 0 3.717
      "2009-01-22" "000001" "深发展A" 0 3.717
      "2009-01-23" "000001" "深发展A" 0  3.67
      "2009-01-26" "000001" "深发展A" 2  3.67
      "2009-01-27" "000001" "深发展A" 2  3.67
      "2009-01-28" "000001" "深发展A" 2  3.67
      end
      Last edited by Jason Liang; 15 Feb 2023, 03:30.



      • #4
        #1 was not inappropriate, just incomplete in terms of information to provide a good answer.

        Your dates are not yet fit for your purposes. Creating a Stata numeric date variable is easy.

        If you use windows of 7 or 14 days, each window should usually include 5 or 10 working days. Otherwise look into setting up a business calendar.
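
        A minimal sketch of the business-calendar route, assuming the numeric daily date variable date has already been created from TradingDate. The calendar name trading and the new variable names bdate, n6 and var6 are hypothetical. Every date present in the data is treated as a business day, so filled-in rows for non-trading days would need to be excluded first if they should not count.

        ```stata
        * Build a business calendar from the dates present in the data and
        * create a business-date version of the daily date
        bcal create trading, from(date) generate(bdate) replace

        * Window of the current and the 5 previous business days (6 in all)
        rangestat (count) n6=ClosePrice (variance) var6=ClosePrice, interval(bdate -5 0) by(stkcd)
        ```

        With a business-date variable, a window of 6 business days holds 6 observations whenever the stock has a price on each of those days, whatever the gaps for weekends and holidays.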

        Here I illustrate using rangestat from SSC but there are other ways to do it. See the help for rangestat on setting the window using the interval() option.

        Code:
        . gen date = daily(TradingDate, "YMD")
        
        . format date %td 
        
        . 
        . * you need only install once 
        . * ssc install rangestat 
        . 
        . rangestat (count) ClosePrice (variance) ClosePrice, int(date -6 0) by(ShortName)
        
        .
        Results


        Code:
             +----------------------------------------------------------------------------------------+
             | TradingD~e    stkcd   ShortN~e   Filling   Clos~ice        date   CloseP~t   Close~nce |
             |----------------------------------------------------------------------------------------|
          1. | 2009-01-01   000001    深发展A         2      2.983   01jan2009          1           . |
          2. | 2009-01-02   000001    深发展A         2      2.983   02jan2009          2           0 |
          3. | 2009-01-05   000001    深发展A         0      3.062   05jan2009          3   .00208033 |
          4. | 2009-01-06   000001    深发展A         0      3.248   06jan2009          4   .01562733 |
          5. | 2009-01-07   000001    深发展A         0       3.15   07jan2009          5    .0130327 |
             |----------------------------------------------------------------------------------------|
          6. | 2009-01-08   000001    深发展A         0      3.027   08jan2009          5    .0111715 |
          7. | 2009-01-09   000001    深发展A         0      3.106   09jan2009          5    .0073708 |
          8. | 2009-01-12   000001    深发展A         0      3.109   12jan2009          5    .0064825 |
          9. | 2009-01-13   000001    深发展A         0      2.986   13jan2009          5    .0044913 |
         10. | 2009-01-14   000001    深发展A         0      3.216   14jan2009          5    .0078177 |
             |----------------------------------------------------------------------------------------|
         11. | 2009-01-15   000001    深发展A         0      3.248   15jan2009          5     .010757 |
         12. | 2009-01-16   000001    深发展A         0      3.348   16jan2009          5    .0192028 |
         13. | 2009-01-19   000001    深发展A         0      3.503   19jan2009          5    .0359872 |
         14. | 2009-01-20   000001    深发展A         0      3.582   20jan2009          5    .0253188 |
         15. | 2009-01-21   000001    深发展A         0      3.717   21jan2009          5    .0345873 |
             |----------------------------------------------------------------------------------------|
         16. | 2009-01-22   000001    深发展A         0      3.717   22jan2009          5    .0242693 |
         17. | 2009-01-23   000001    深发展A         0       3.67   23jan2009          5    .0087167 |
         18. | 2009-01-26   000001    深发展A         2       3.67   26jan2009          5    .0030387 |
         19. | 2009-01-27   000001    深发展A         2       3.67   27jan2009          5    .0006627 |
         20. | 2009-01-28   000001    深发展A         2       3.67   28jan2009          5    .0004418 |
             +----------------------------------------------------------------------------------------+
        
        .
        Last edited by Nick Cox; 15 Feb 2023, 05:01.



        • #5
          Thanks for your code, Sir. I ran it in my do-file and it worked, but I still have difficulty understanding the result:

          Given that you coded 'interval(date -6 0)', why is the '~variance' column filled in every row instead of only in every 6th row?

          Excuse my poor understanding of Stata. Thank you again for your patience.



          • #6
            And one more question: why is the '~count' column '1, 2, 3, 4, 5, 5, 5, ...'?
            The length of the interval is 6, so shouldn't it be '1, 2, 3, 4, 5, 6, 1, 2, ..., 6'?



            • #7
              The issue is only how rangestat works. The syntax


              Code:
              int(date -6 0) 
              means look for windows that started up to 6 days ago and end with the present day, so the window is of length up to 7, maximally in this case days -6 -5 -4 -3 -2 -1 0 from the current observation.

              Nothing in the syntax says how many observations will match that prescription. The number of observations could be zero or more than zero or even more than 7, so far as rangestat is concerned.
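
              As an illustration of the same syntax with a different window, here is a sketch for a 14-day window (the current day and the 13 days before it). The result names n14 and var14 are hypothetical, chosen so that they do not clash with the variables created by the earlier call.

              ```stata
              * 14-day window: days -13 to 0 relative to the current observation
              rangestat (count) n14=ClosePrice (variance) var14=ClosePrice, interval(date -13 0) by(stkcd)
              ```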

              For day 1, only one observation matches the window prescription, but rangestat (like summarize) divides by sample size MINUS 1 in calculating the variance, which is thus indeterminate and reported as missing (.).

              For day 2, two observations match but the values of close price are identical, so that the reported variance is zero.
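
              These first few results can be checked by hand with summarize. This is a sketch that assumes the example data are in memory in the order listed, with only the one stock, so that observations 1 to 3 are exactly the observations falling in the window for day 3.

              ```stata
              * One observation: the variance is missing
              quietly summarize ClosePrice in 1/1
              display r(N) "  " r(Var)

              * Two identical prices: the variance is zero
              quietly summarize ClosePrice in 1/2
              display r(N) "  " r(Var)

              * Three observations: compare with .00208033 in the listing
              quietly summarize ClosePrice in 1/3
              display r(N) "  " r(Var)
              ```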

              Much of the point of asking to see the count is that you can ignore incomplete windows if you wish by selecting only results based on 5 observations, or whatever number you wish for.
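
              A minimal sketch of that selection, assuming rangestat has been run as in #4 so that ClosePrice_count and ClosePrice_variance exist. The new variable name var7 is hypothetical.

              ```stata
              * Keep the variance only for windows with a full 5 observations
              gen double var7 = ClosePrice_variance if ClosePrice_count == 5
              ```

              Windows that include a holiday would be set to missing by this rule, so the threshold may need to be lower for real data.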



              • #8
                Thank you, Sir. I think I follow you. So I can reach my goal by keeping the 6th, 12th, 18th, ... observations, right?



                • #9
                  If your windows are disjoint, not overlapping, you would be best served by a different approach, such as setting up weekly windows.
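
                  One way to set up disjoint windows, as a sketch rather than the only approach. It assumes the numeric daily date variable date from #4. All new variable names (weekstart, n_week, sd_week, var_week, block6, sd_block6, var_block6) are hypothetical. The second part is an assumption about the goal in #1: it groups each stock's prices into blocks of 6 consecutive observations instead of calendar weeks.

                  ```stata
                  * Week identifier: the Sunday on or before each date
                  * (dow() returns 0 for Sunday, ..., 6 for Saturday)
                  gen weekstart = date - dow(date)
                  format weekstart %td

                  * Number of prices and their variance within each stock and week
                  bysort stkcd weekstart: egen n_week = count(ClosePrice)
                  bysort stkcd weekstart: egen double sd_week = sd(ClosePrice)
                  gen double var_week = sd_week^2

                  * Alternative: blocks of 6 consecutive observations per stock,
                  * regardless of gaps in the calendar
                  bysort stkcd (date): gen block6 = ceil(_n / 6)
                  bysort stkcd block6: egen double sd_block6 = sd(ClosePrice)
                  gen double var_block6 = sd_block6^2
                  ```

                  Each observation then carries the variance of its own week or block, and n_week shows how many prices went into each weekly result.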



                  • #10
                    I totally get it. Thank you Sir. Your reply means a lot to an undergraduate student.
