Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Rolling average of a variable in panel data

    I have currency spot rate panel data month/country. For each country, at time t, I am trying to generate a variable that would give me the average spot rate average between t-4.5 year and t-5.5 years. In other words, the spot rate average between month t-54 and month 66.

    My data look as follow:
    Code:
     
    ym Country Spot_T_Con
    1973m8 Euro 1.804785
    1973m9 Euro 1.820527
    1973m10 Euro 1.804761
    1973m11 Euro 1.715059
    1973m12 Euro 1.702131
    1974m1 Euro 1.603125
    1974m2 Euro 1.623765
    1974m3 Euro 1.718324
    1974m4 Euro 1.735353
    1978m1 Euro 1.497064
    1978m2 Euro 1.521075
    1978m3 Euro 1.528731
    1978m4 Euro 1.509551
    1978m5 Euro 1.500049
    1978m6 Euro 1.520005
    1978m7 Euro 1.550397
    1978m8 Euro 1.575279
    1978m9 Euro 1.600133
    1978m10 Euro 1.714607
    1978m11 Euro 1.580072
    1978m12 Euro 1.648871
    1979m1 Euro 1.613518
    1979m2 Euro 1.625079
    1979m3 Euro 1.621817
    1979m4 Euro 1.605765
    1979m5 Euro 1.596492
    1979m6 Euro 1.641727
    1979m7 Euro 1.64986
    1973m8 Australia 1.418551
    1973m9 Australia 1.489816
    1973m10 Australia 1.488314
    1973m11 Australia 1.486247
    1973m12 Australia 1.48958
    1994m1 Canada 0.753943
    1994m2 Canada 0.739519
    1994m3 Canada 0.723282
    1994m4 Canada 0.723875
    1994m5 Canada 0.721248
    1994m6 Canada 0.723802
    1994m7 Canada 0.723536
    I could not find a similar dataset (monthly panel data), so I'd be happy to adapt the solution from daily time series data. Instance:
    Code:
    sysuse bbispot(ask).dta, clear
    
    ** 4.5-.5.5 years close rolling average?
    Thank you in advance

  • #2
    So, the first thing is to convert 4.5 years and 5.5 years into numbers of months. Then -rangestat- does the rest.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float monthly_date str16 Country float Spot_T_Con
    163 "Euro"      1.804785
    164 "Euro"      1.820527
    165 "Euro"      1.804761
    166 "Euro"      1.715059
    167 "Euro"      1.702131
    168 "Euro"      1.603125
    169 "Euro"      1.623765
    170 "Euro"      1.718324
    171 "Euro"      1.735353
    216 "Euro"      1.497064
    217 "Euro"      1.521075
    218 "Euro"      1.528731
    219 "Euro"      1.509551
    220 "Euro"      1.500049
    221 "Euro"      1.520005
    222 "Euro"      1.550397
    223 "Euro"      1.575279
    224 "Euro"      1.600133
    225 "Euro"      1.714607
    226 "Euro"      1.580072
    227 "Euro"      1.648871
    228 "Euro"      1.613518
    229 "Euro"      1.625079
    230 "Euro"      1.621817
    231 "Euro"      1.605765
    232 "Euro"      1.596492
    233 "Euro"      1.641727
    163 "Australia" 1.418551
    164 "Australia" 1.489816
    165 "Australia" 1.488314
    166 "Australia" 1.486247
    167 "Australia"  1.48958
    408 "Canada"     .753943
    409 "Canada"     .739519
    410 "Canada"     .723282
    411 "Canada"     .723875
    412 "Canada"     .721248
    413 "Canada"     .723802
    414 "Canada"     .723536
    end
    format %tm monthly_date
    
    //    RECORD # OF MONTHS IN 4.5 AND 5.5 YEARS
    local yrs_4_5 = 4.5*12
    local yrs_5_5 = 5.5*12
    
    rangestat (mean) wanted = Spot_T_Con, interval(monthly_date -`yrs_5_5' -`yrs_4_5')
    Note: You need to install Robert Picard, Nick Cox & Roberto Ferrer's -rangestat- command from SSC if you don't already have it.

    Your code, -sysuse bbispot(ask).dta, clear-, produces only error messages: bbispot (ask).dta, having an embedded blank, would have to be enclosed in quotes for Stata to treat it as a filename. But even so, on my installation there is no such file. I'm not sure what you have in mind there, or where it came from. In any case, you need not resort to finding proxy files like this. You can post an example of your actual data using the -dataex- command. It is available from SSC and is the preferred way to show example data. You can see how it is used in the code I have shown in this response, where I imported your data in a more clumsy way into Stata and then used -dataex- to create the example. When you use -dataex-, you enable those who want to help you to create a completely faithful replica of your Stata example, including things like storage types, labeling, and formatting, with a simple copy/paste operation. In the future, be sure to use -dataex- to post example Stata data, every time.

    Comment


    • #3
      Francois: See also the thread https://www.statalist.org/forums/for...window-average where Clyde gave you essentially the same advice on essentially the same question.

      Comment


      • #4
        Hi Clyde and Nick,

        I have a similar problem. I am trying to generate average monthly deaths for different states between Jan 2015 and Dec 2019. What I want to get is monthly average deaths for all the states for all the months. I have 15 states and I want average deaths for each month for the period Jan 2015 and Dec 2019.



        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input double deaths float(year date) long state2 float(date2 period) double deaths_avg
        21379 2015 660  9 660 1584 .
        36982 2015 660  8 660 1584 .
        19624 2015 660 12 660 1584 .
        17316 2015 661 12 661 1584 .
        31974 2015 661  8 661 1584 .
        17970 2015 661  9 661 1584 .
        16406 2015 662 12 662 1584 .
        32928 2015 662  8 662 1584 .
        19077 2015 662  9 662 1584 .
        17808 2015 663  9 663 1584 .
        29152 2015 663  8 663 1584 .
        15576 2015 663 12 663 1584 .
        30338 2015 664  8 664 1584 .
        15454 2015 664 12 664 1584 .
        19047 2015 664  9 664 1584 .
        33092 2015 665  8 665 1596 .
        18360 2015 665  9 665 1596 .
        15614 2015 665 12 665 1596 .
        15095 2015 666 12 666 1596 .
        20958 2015 666  9 666 1596 .
        33045 2015 666  8 666 1596 .
        15619 2015 667 12 667 1596 .
        33002 2015 667  8 667 1596 .
        20884 2015 667  9 667 1596 .
        18898 2015 668  9 668 1596 .
        15986 2015 668 12 668 1596 .
        31873 2015 668  8 668 1596 .
        33008 2015 669  8 669 1596 .
        15975 2015 669 12 669 1596 .
        19552 2015 669  9 669 1596 .
        20813 2015 670  9 670 1608 .
        33200 2015 670  8 670 1608 .
        17089 2015 670 12 670 1608 .
        35137 2015 671  8 671 1608 .
        19707 2015 671 12 671 1608 .
        21236 2015 671  9 671 1608 .
        18193 2016 672 12 672 1608 .
        34598 2016 672  8 672 1608 .
        21145 2016 672  9 672 1608 .
        18641 2016 673  9 673 1608 .
        32206 2016 673  8 673 1608 .
        19079 2016 673 12 673 1608 .
        16119 2016 674 12 674 1608 .
        34855 2016 674  8 674 1608 .
        19142 2016 674  9 674 1608 .
        32274 2016 675  8 675 1620 .
        15880 2016 675 12 675 1620 .
        20169 2016 675  9 675 1620 .
        34292 2016 676  8 676 1620 .
        19098 2016 676  9 676 1620 .
        16331 2016 676 12 676 1620 .
        20337 2016 677  9 677 1620 .
        15902 2016 677 12 677 1620 .
        35307 2016 677  8 677 1620 .
        35526 2016 678  8 678 1620 .
        16566 2016 678 12 678 1620 .
        23432 2016 678  9 678 1620 .
        22577 2016 679  9 679 1620 .
        17813 2016 679 12 679 1620 .
        39293 2016 679  8 679 1620 .
        35780 2016 680  8 680 1632 .
        21314 2016 680  9 680 1632 .
        17451 2016 680 12 680 1632 .
        16624 2016 681 12 681 1632 .
        19886 2016 681  9 681 1632 .
        33560 2016 681  8 681 1632 .
        36769 2016 682  8 682 1632 .
        20123 2016 682 12 682 1632 .
        18856 2016 682  9 682 1632 .
        36314 2016 683  8 683 1632 .
        20303 2016 683  9 683 1632 .
        20386 2016 683 12 683 1632 .
        41135 2017 684  8 684 1632 .
        20326 2017 684  9 684 1632 .
        19081 2017 684 12 684 1632 .
        35940 2017 685  8 685 1644 .
        17634 2017 685 12 685 1644 .
        18128 2017 685  9 685 1644 .
        39998 2017 686  8 686 1644 .
        16784 2017 686 12 686 1644 .
        19013 2017 686  9 686 1644 .
        19014 2017 687  9 687 1644 .
        15544 2017 687 12 687 1644 .
        35486 2017 687  8 687 1644 .
        20053 2017 688  9 688 1644 .
        38918 2017 688  8 688 1644 .
        16204 2017 688 12 688 1644 .
        16441 2017 689 12 689 1644 .
        37890 2017 689  8 689 1644 .
        22597 2017 689  9 689 1644 .
        39238 2017 690  8 690 1656 .
        16089 2017 690 12 690 1656 .
        26302 2017 690  9 690 1656 .
        23101 2017 691  9 691 1656 .
        41390 2017 691  8 691 1656 .
        17086 2017 691 12 691 1656 .
        40959 2017 692  8 692 1656 .
        21513 2017 692  9 692 1656 .
        16407 2017 692 12 692 1656 .
        20681 2017 693  9 693 1656 .
        end
        format %tmm/y date
        label values state2 state2
        label def state2 8 "Karnataka", modify
        label def state2 9 "Kerala", modify
        label def state2 12 "Punjab", modify

        I tried the rangestat code but I guess my understanding is wrong somewhere. As you can see that the deaths_avg has only missing values.

        rangestat (mean) deaths_avg =deaths, interval ( date2 660 719 ) by(state2)

        Comment


        • #5
          660 and 719 aren't good choices as offsets which is what rangestat is expecting here. It seems that you want

          Code:
          rangestat (mean) deaths_avg =deaths, interval ( date2 . .  ) by(state2)
          except that if there are any observations outside [660, 719] you need to exclude them too.

          interval(keyvar low high) is required. keyvar is a numeric variable. The lower and
          upper bound of the closed interval to use for each observation can be specified using
          a numeric variable, a #, or a system missing value. If a # is used, the bound for
          each observation is computed by adding # to keyvar
          .

          Otherwise put, rangestat is being told by you to look roughly 55 to 60 years into the future. You want, rather, 660 and 719 to be treated as literal dates. You could also do that with


          Code:
          gen low = 660 
          gen high = 719
          Code:
          rangestat (mean) deaths_avg =deaths, interval ( date2 low high) by(state2)
          The syntax defaults to calculations with moving windows, not over a single window.

          Comment


          • #6
            Originally posted by Nick Cox View Post
            660 and 719 aren't good choices as offsets which is what rangestat is expecting here. It seems that you want

            Code:
            rangestat (mean) deaths_avg =deaths, interval ( date2 . . ) by(state2)
            except that if there are any observations outside [660, 719] you need to exclude them too.



            Otherwise put, rangestat is being told by you to look roughly 55 to 60 years into the future. You want, rather, 660 and 719 to be treated as literal dates. You could also do that with


            Code:
            gen low = 660
            gen high = 719
            Code:
            rangestat (mean) deaths_avg =deaths, interval ( date2 low high) by(state2)
            The syntax defaults to calculations with moving windows, not over a single window.
            Thanks Nick! I used the following code

            gen date= ym( year, month)
            format date %tmm/y
            encode state, gen(state2)
            egen month_state = group(state2 month)
            *drop if year>=2020
            destring lgd_state_id, replace
            format %02.0f lgd_state_id

            gen date2=dofm(date)

            gen low = 20089
            gen high = 21884

            rangestat (mean) deaths_avg =deaths, interval ( date2 low high) by(state2)
            I am getting deaths_avg but it is at the yearly level. I wanted monthly average for a 5 year period. For me the following worked but I was wondering if there is a neater/cleaner way of doing things.

            gen date= ym( year, month)
            format date %tmm/y
            encode state, gen(state2)
            egen month_state = group(state2 month)
            drop if year>=2020
            destring lgd_state_id, replace
            format %02.0f lgd_state_id
            collapse (mean) deaths month state2 lgd_state_id , by(month_state)
            label value state2 state2
            tostring lgd_state_id, replace usedisplayformat

            rename deaths deaths_avg
            I would like to thank for the excellent example you showcased here-
            https://www.statalist.org/forums/for...98#post1608198


            I managed to replicate the following. Do let me know if you have any suggestions how to improve this!
            Click image for larger version

Name:	excess_deaths_fabplot.png
Views:	1
Size:	630.6 KB
ID:	1626476

            Comment


            • #7
              You appear to have good data in terms of state names and Stata monthly dates. Why is it interesting or helpful to recast these data in terms of daily dates?

              Comment


              • #8
                Originally posted by Nick Cox View Post
                You appear to have good data in terms of state names and Stata monthly dates. Why is it interesting or helpful to recast these data in terms of daily dates?
                Hi Nick,

                Thank you for your response. I am not sure if I follow your question. What I want to do with the data is calculate the monthly mean deaths for a five year period and use that to calculate the Excess death P-score below -

                Excess death P-score = ( (Deaths period 2020-2021 - Average deaths Period 2015-2019) / (Average deaths Period 2015-2019))*100

                Comment


                • #9
                  You had monthly dates and then re-phrased your analysis in terms of daily dates. That appears to have been quite unnecessary.

                  Comment

                  Working...
                  X