Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • semi-deviation rolling window

    Dear all,

    Can you please help..
    I am trying to find the semi-standard deviation, looking at the volatility below the mean, using rolling window. I tried but I have missing values, I only need semi-deviation with non-missing values.

    previously I have tried to use "!missing", but couldn't be able to fit it, and I got error.

    my data looks like this:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float id int year double(score n ave_score) float less_score double ESGSTD
    +----------------------------------------------------------+
    | id year score n ave_score less_s~e scoreSTD |
    |----------------------------------------------------------|
    1. | 1 2004 22.68 1 22.68 . . |
    2. | 1 2005 26.76 2 24.72 . . |
    3. | 1 2006 21.34 3 23.593333 21.34 . |
    4. | 1 2007 28.92 4 24.925 . . |
    5. | 1 2008 25.09 5 24.958 . . |
    6. | 1 2009 28.67 5 26.156 . . |
    7. | 1 2010 33.73 5 27.55 . . |
    8. | 1 2011 30.76 5 29.434 . . |
    9. | 1 2012 30.66 5 29.782 . . |
    10. | 1 2013 28.06 5 30.376 28.06 . |
    11. | 1 2014 26.53 5 29.948 26.53 1.0818725 |
    12. | 1 2015 32.83 5 29.768 . 1.0818725 |
    13. | 1 2016 37.63 5 31.142 . 1.0818725 |
    14. | 1 2017 36.78 5 32.366 . 1.0818725 |
    15. | 1 2018 41.75 5 35.104 . . |
    16. | 1 2019 41.27 5 38.052 . . |
    17. | 1 2020 41.02 5 39.69 . . |
    18. | 1 2021 44.03 5 40.97 . . |
    19. | 1 2022 43.19 5 42.252 . . |
    |----------------------------------------------------------|
    20. | 2 2004 . 0 . . . |
    21. | 2 2005 . 0 . . . |
    22. | 2 2006 . 0 . . . |
    23. | 2 2007 . 0 . . . |
    24. | 2 2008 2.61 1 2.61 . . |
    25. | 2 2009 2.54 2 2.575 2.54 . |
    26. | 2 2010 2.84 3 2.6633333 . . |
    27. | 2 2011 2.29 4 2.57 2.29 .1767767 |
    28. | 2 2012 1.81 5 2.418 1.81 .37098968 |
    29. | 2 2013 2.03 5 2.302 2.03 .31647802 |
    30. | 2 2014 1.6 5 2.114 1.6 .29601518 |
    31. | 2 2015 7.62 5 3.07 . .29601518 |
    32. | 2 2016 61.67 5 14.946 . .21501935 |
    33. | 2 2017 61.32 5 26.848 . .30405588 |
    34. | 2 2018 60.75 5 38.592 . . |
    35. | 2 2019 60.97 5 50.466 . . |
    36. | 2 2020 59.72 5 60.886 59.72 . |
    37. | 2 2021 62.71 5 61.094 . . |
    38. | 2 2022 1 5 49.03 1 41.521311 |
    |----------------------------------------------------------|
    39. | 3 2004 . 0 . . . |
    40. | 3 2005 . 0 . . . |
    41. | 3 2006 . 0 . . . |
    42. | 3 2007 . 0 . . . |
    43. | 3 2008 45.57 1 45.57 . . |
    44. | 3 2009 43.41 2 44.49 43.41 . |
    45. | 3 2010 55.29 3 48.09 . . |
    46. | 3 2011 53.49 4 49.44 . . |
    47. | 3 2012 52.35 5 50.022 . . |
    48. | 3 2013 54.57 5 51.822 . . |
    49. | 3 2014 55.45 5 54.23 . . |
    50. | 3 2015 55.46 5 54.264 . . |
    51. | 3 2016 72.39 5 58.044 . . |
    52. | 3 2017 72.47 5 62.068 . . |
    53. | 3 2018 74.56 5 66.066 . . |
    54. | 3 2019 78.44 5 70.664 . . |
    55. | 3 2020 74.53 5 74.478 . . |
    56. | 3 2021 78.81 5 75.762 . . |
    57. | 3 2022 78.66 5 77 . . |
    |----------------------------------------------------------|
    58. | 4 2004 . 0 . . . |
    59. | 4 2005 . 0 . . . |
    60. | 4 2006 . 0 . . . |
    61. | 4 2007 . 0 . . . |
    62. | 4 2008 . 0 . . . |
    63. | 4 2009 . 0 . . . |
    64. | 4 2010 . 0 . . . |
    65. | 4 2011 . 0 . . . |
    66. | 4 2012 . 0 . . . |
    67. | 4 2013 . 0 . . . |
    68. | 4 2014 . 0 . . . |
    69. | 4 2015 . 0 . . . |
    70. | 4 2016 21.8 1 21.8 . . |
    71. | 4 2017 25.9 2 23.85 . . |
    72. | 4 2018 24.44 3 24.046667 . . |
    73. | 4 2019 25.58 4 24.43 . . |
    74. | 4 2020 21.48 5 23.84 21.48 . |
    75. | 4 2021 28.13 5 25.106 . . |
    76. | 4 2022 1 5 20.126 1 14.481547 |
    |----------------------------------------------------------|
    77. | 5 2004 27.4 1 27.4 . . |
    78. | 5 2005 32.63 2 30.015 . . |
    79. | 5 2006 30.64 3 30.223333 . . |
    80. | 5 2007 30.53 4 30.3 . . |
    81. | 5 2008 47.37 5 33.714 . . |
    82. | 5 2009 48.55 5 37.944 . . |
    83. | 5 2010 62.13 5 43.844 . . |
    84. | 5 2011 59.02 5 49.52 . . |
    85. | 5 2012 71.24 5 57.662 . . |
    86. | 5 2013 67.33 5 61.654 . . |
    87. | 5 2014 71.81 5 66.306 . . |
    88. | 5 2015 68 5 67.48 . . |
    89. | 5 2016 58.38 5 67.352 58.38 . |
    90. | 5 2017 69.36 5 66.976 . . |
    91. | 5 2018 69.66 5 67.442 . . |
    92. | 5 2019 65.86 5 66.252 65.86 5.2891584 |
    93. | 5 2020 71.94 5 67.04 . 5.2891584 |
    94. | 5 2021 69.7 5 69.304 . . |
    95. | 5 2022 71.71 5 69.774 . . |
    |----------------------------------------------------------|
    96. | 6 2004 . 0 . . . |
    97. | 6 2005 . 0 . . . |
    98. | 6 2006 . 0 . . . |
    99. | 6 2007 . 0 . . . |
    100. | 6 2008 . 0 . . . |
    +----------------------------------------------------------+

    .
    I have used the following steps:

    rangestat (count) n= score (mean) ave_score = score , interval( year -4 0) by(id)
    gen less_score = .
    replace less_score = score if (score< ave_score )
    rangestat (sd) scoreSTD= less_score , by(id) interval(year -4 0)


  • #2
    Please try again with

    Code:
    dataex  id year score
    What you're showing won't work as code we can use without editing.

    I have to say that the SD of scores below the mean in intervals of length 5 could imply using 4 values or 1 value if there is an outlier, but it's more likely to imply using 2 or 3 values. Is this enough for interpretable results or indeed is that the point?
    Last edited by Nick Cox; 29 Nov 2023, 05:46.

    Comment


    • #3
      Thank you Nick for responding and the point you raised.

      this is really a concern. but I can use different way. using semi-variance :

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float id int year double score float(mean less_mean mean2 SquareOfDifference)
      1 2004 22.68 32.721054 22.68    26.947   18.20729
      1 2005 26.76 32.721054 26.76    26.947 .034969103
      1 2006 21.34 32.721054 21.34    26.947  31.438454
      1 2007 28.92 32.721054 28.92    26.947  3.8927274
      1 2008 25.09 32.721054 25.09    26.947    3.44845
      1 2009 28.67 32.721054 28.67    26.947  2.9687276
      1 2010 33.73 32.721054     .    26.947          .
      1 2011 30.76 32.721054 30.76    26.947  14.538967
      1 2012 30.66 32.721054 30.66    26.947  13.786365
      1 2013 28.06 32.721054 28.06    26.947  1.2387667
      1 2014 26.53 32.721054 26.53    26.947  .17388885
      1 2015 32.83 32.721054     .    26.947          .
      1 2016 37.63 32.721054     .    26.947          .
      1 2017 36.78 32.721054     .    26.947          .
      1 2018 41.75 32.721054     .    26.947          .
      1 2019 41.27 32.721054     .    26.947          .
      1 2020 41.02 32.721054     .    26.947          .
      1 2021 44.03 32.721054     .    26.947          .
      1 2022 43.19 32.721054     .    26.947          .
      2 2004     .  26.09867     . 2.7044444          .
      2 2005     .  26.09867     . 2.7044444          .
      2 2006     .  26.09867     . 2.7044444          .
      2 2007     .  26.09867     . 2.7044444          .
      2 2008  2.61  26.09867  2.61 2.7044444 .008919766
      2 2009  2.54  26.09867  2.54 2.7044444 .027041975
      2 2010  2.84  26.09867  2.84 2.7044444 .018375294
      2 2011  2.29  26.09867  2.29 2.7044444   .1717642
      2 2012  1.81  26.09867  1.81 2.7044444   .8000309
      2 2013  2.03  26.09867  2.03 2.7044444   .4548753
      2 2014   1.6  26.09867   1.6 2.7044444  1.2197974
      2 2015  7.62  26.09867  7.62 2.7044444  24.162685
      2 2016 61.67  26.09867     . 2.7044444          .
      2 2017 61.32  26.09867     . 2.7044444          .
      2 2018 60.75  26.09867     . 2.7044444          .
      2 2019 60.97  26.09867     . 2.7044444          .
      2 2020 59.72  26.09867     . 2.7044444          .
      2 2021 62.71  26.09867     . 2.7044444          .
      2 2022     1  26.09867     1 2.7044444  2.9051306
      3 2004     .     63.03     .  51.94875          .
      3 2005     .     63.03     .  51.94875          .
      3 2006     .     63.03     .  51.94875          .
      3 2007     .     63.03     .  51.94875          .
      3 2008 45.57     63.03 45.57  51.94875   40.68845
      3 2009 43.41     63.03 43.41  51.94875   72.91025
      3 2010 55.29     63.03 55.29  51.94875   11.16396
      3 2011 53.49     63.03 53.49  51.94875  2.3754582
      3 2012 52.35     63.03 52.35  51.94875   .1610007
      3 2013 54.57     63.03 54.57  51.94875   6.870952
      3 2014 55.45     63.03 55.45  51.94875   12.25876
      3 2015 55.46     63.03 55.46  51.94875  12.328874
      3 2016 72.39     63.03     .  51.94875          .
      3 2017 72.47     63.03     .  51.94875          .
      3 2018 74.56     63.03     .  51.94875          .
      3 2019 78.44     63.03     .  51.94875          .
      3 2020 74.53     63.03     .  51.94875          .
      3 2021 78.81     63.03     .  51.94875          .
      3 2022 78.66     63.03     .  51.94875          .
      4 2004     .     21.19     .         1          .
      4 2005     .     21.19     .         1          .
      4 2006     .     21.19     .         1          .
      4 2007     .     21.19     .         1          .
      4 2008     .     21.19     .         1          .
      4 2009     .     21.19     .         1          .
      4 2010     .     21.19     .         1          .
      4 2011     .     21.19     .         1          .
      4 2012     .     21.19     .         1          .
      4 2013     .     21.19     .         1          .
      4 2014     .     21.19     .         1          .
      4 2015     .     21.19     .         1          .
      4 2016  21.8     21.19     .         1          .
      4 2017  25.9     21.19     .         1          .
      4 2018 24.44     21.19     .         1          .
      4 2019 25.58     21.19     .         1          .
      4 2020 21.48     21.19     .         1          .
      4 2021 28.13     21.19     .         1          .
      4 2022     1     21.19     1         1          0
      5 2004  27.4     57.54  27.4  36.18667   77.20555
      5 2005 32.63     57.54 32.63  36.18667  12.649882
      5 2006 30.64     57.54 30.64  36.18667   30.76554
      5 2007 30.53     57.54 30.53  36.18667   31.99789
      5 2008 47.37     57.54 47.37  36.18667  125.06688
      5 2009 48.55     57.54 48.55  36.18667  152.85194
      5 2010 62.13     57.54     .  36.18667          .
      5 2011 59.02     57.54     .  36.18667          .
      5 2012 71.24     57.54     .  36.18667          .
      5 2013 67.33     57.54     .  36.18667          .
      5 2014 71.81     57.54     .  36.18667          .
      5 2015    68     57.54     .  36.18667          .
      5 2016 58.38     57.54     .  36.18667          .
      5 2017 69.36     57.54     .  36.18667          .
      5 2018 69.66     57.54     .  36.18667          .
      5 2019 65.86     57.54     .  36.18667          .
      5 2020 71.94     57.54     .  36.18667          .
      5 2021  69.7     57.54     .  36.18667          .
      5 2022 71.71     57.54     .  36.18667          .
      6 2004     .     35.34     .         1          .
      6 2005     .     35.34     .         1          .
      6 2006     .     35.34     .         1          .
      6 2007     .     35.34     .         1          .
      6 2008     .     35.34     .         1          .
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 100 out of 49628 observations
      Use the count() option to list more



      I have used the following steps:

      by id: egen mean = mean(score)
      gen less_mean =.
      replace less_mean = score if ( score < mean )

      **** generate new mean
      bys id: egen mean2 = mean( less_mean )
      by id: gen SquareOfDifference = ( less_mean - mean2 )^2

      so the semi-variance would be the result of (total of SquareOfDifference / number of nonmissing observations of SquareOfDifference)

      Can you please guide me how to use rangestat to make it five periods where 2008 is the first one.

      Thank you

      Comment


      • #4
        Try this for size:

        Code:
        webuse grunfeld, clear 
        
        mata:  
        mata clear
        real rowvector semidev(real colvector X) {
            X = select(X, X :< mean(X))
            return((length(X) , (length(X) > 1 ? sqrt(variance(X)) : .)))  
        } 
        end 
        
        rangestat (semidev) invest, int(year -4 0) by(company)
        
        
        
        . l invest year semidev? in 1/20
        
             +--------------------------------------+
             | invest   year   semidev1    semidev2 |
             |--------------------------------------|
          1. |  317.6   1935          0           . |
          2. |  391.8   1936          1           . |
          3. |  410.6   1937          1           . |
          4. |  257.7   1938          2   42.355692 |
          5. |  330.8   1939          3   38.956932 |
             |--------------------------------------|
          6. |  461.2   1940          2   51.689488 |
          7. |    512   1941          2   51.689488 |
          8. |    448   1942          2   51.689488 |
          9. |  499.6   1943          2   82.872923 |
         10. |  547.5   1944          2   9.3338181 |
             |--------------------------------------|
         11. |  561.2   1945          3   33.941912 |
         12. |  688.1   1946          3   49.761464 |
         13. |  568.9   1947          4   31.086444 |
         14. |  529.2   1948          4   17.416281 |
         15. |  555.1   1949          4   17.218791 |
             |--------------------------------------|
         16. |  642.9   1950          3   20.154985 |
         17. |  755.9   1951          3   20.154985 |
         18. |  891.2   1952          3   59.592155 |
         19. | 1304.4   1953          3   100.66322 |
         20. | 1486.7   1954          3   124.31678 |
             +--------------------------------------+
        
        . su invest in 1/4
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
              invest |          4     344.425    70.39016      257.7      410.6
        
        . su invest in 1/4 if invest < 344.425
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
              invest |          2      287.65    42.35569      257.7      317.6
        
        .

        Comment


        • #5
          Thank you so much Dr. Nick

          Comment

          Working...
          X