Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generate Moving Average

    Dear Statalist,

    I have a data set with around 300,000 observations that contains YEAR, TITEL and COLOUR. The variable YEAR ranges from 1800 to 2000 and there are 18 manifestations for COLOUR.
    I want to create a graph with a Moving Average, that summarizes the amount of titles (y-axis) over 10/25/50 years (x-axis) by colour. There are repeated values of TITLES per YEAR and COLOUR.

    I already tried to follow the manual of “Stata 7: How can I calculate moving averages for panel data?”
    (https://www.stata.com/support/faqs/s...nd-panel-data/)

    The problem is, that I can not declare my data to be a time series. Stata gives me the error message:
    “repeated time values within panel”

    How can I create a Moving Average and a table that contains the respective results.

    Kind regards and thank you very much,
    Rike Lich

  • #2
    Clear as you may be in your own mind about what your data look like and what you want to do with them, your description leaves the reader with very little information about it. Please read the forum FAQ for excellent advice on how to post clear questions that include the information necessary to get an answer. In particular, you tell us about an error message you got, but you don't show the actual command that provoked it You give a lengthy narrative description of your data, but it is vague, and, in any case, even the best prose description of a data set is no substitute for an example.

    I'm confident that if you repost, using the -dataex- command to show an example of your data, showing the code you have attempted so far (between code delimiters) and the output that Stata has given you (and, if it isn't obvious, why the results are not what you want), that you will get a timely and helpful response.

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Comment


    • #3
      Try this:

      foreach n of numlist 1800/2000{
      bysort COLOUR: egen title_count_`n'=count(TITEL) if YEAR==`n'
      bysort COLOUR: egen title_count_`n'_max=max(title_count_`n')
      replace title_count_`n'=title_count_`n'_max if missing(title_count_`n')
      replace title_count_`n'=0 if missing(title_count_`n')
      drop title_count_`n'_max
      }
      bysort COLOUR: gen unique_color=_n
      foreach v in blue yellow green{
      foreach m of numlist 1810/2000{
      local i=`m'-10
      egen `v'_moveavg_10_`m'=rowmean(title_count_`i'-title_count_`m') if unique_color==1 & color=="`v'"
      }
      }

      You'll have to type in each value of COLOUR in the "foreach v in" line

      This example is for a 10 year moving average. If you want to change how many years the moving average takes into account, change the number 10 to any number of years you want in the last 3 lines before the end brackets.

      Hope this helped.

      Comment


      • #4
        Dear Mr. Schechter,

        thank you very much for your helpful advice.

        The Stata command I used was:
        tsset colours year1

        The reply I got was:
        repeated time values within panel
        r(451);

        Here is an example of my data:
        (I am using Stata 14)

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int year1 str397 Titel long colours
        1894 "Erdbeerkuchen"      1
        1896 "Rabarbarkuchen"     3
        1890 "Blaubertorte"       2
        1892 "Himbeersorbet"      6
        1895 "Tiramisu"           5
        1892 "Avokadocreme"       1
        1892 "Apfelkuchen"        1
        1892 "Streuselkuchen"     3
        1891 "Stachelbeereis"     4
        1866 "Schokoladenkuchen"  7
        1880 "Banaenbrot"         7
        1884 "Pfannkuchen"        3
        1896 "Ananaskuchen"       8
        1889 "Eistorte"           2
           . "Baise"              9
           . "Linzer Torte"      10
        
        end
        label values colours colours
        label def colours 1 "blau", modify
        label def colours 2 "gelb", modify
        label def colours 3 "grün", modify
        label def colours 4 "himmelblau", modify
        label def colours 5 "lila", modify
        label def colours 6 "orange", modify
        label def colours 7 "rosa", modify
        label def colours 8 "rot", modify
        label def colours 9 "schwarz", modify
        label def colours 10 "weiß", modify
        Your support is already appreciated.

        Best,
        Rike Lich

        Comment


        • #5
          Dear Mr. Westreich,

          thank you very much for your code! Unfortunately I got an error reply from Stata:

          type mismatch
          r(109);

          Do you have any idea, why this happened?

          Best,
          Rike Lich

          Comment


          • #6
            #4 ... because it's true:

            Code:
            1892 "Avokadocreme"       1
            1892 "Apfelkuchen"          1
            Same year; same colour, more than one observation. Fatal to tsset.

            Not fatal to your project, however. You could just contract your data, but something like

            Code:
            ssc inst rangestat
            gen byte foo = 1
            rangestat (sum) foo, by(colours) int(year -9 0)
            may help indicate some better technique. There should not be any need for loops or the creation of hundreds of new variables.

            Comment


            • #7
              It's possible that the egen count command will only work with numbers and not strings. Try running

              gen TITEL_1=1 if !missing(TITEL)

              and then replace TITEL with TITEL_1 everywhere in the code I wrote earlier.

              Did any of the code run before you got that error? If so, after which line did it stop working? I know this is a long shot, but sometimes when there is an error with a loop, running set trace on before the loop can shed light on the problem (and sometimes it does nothing at all to help).

              Comment


              • #8
                Jay Westreich : It is best if you put code between CODE delimiters see: https://www.statalist.org/forums/help#stata
                I've had multiple instances where copy & pasting code that is not in a "CODE" box created strange behavior in Stata even though it looked fine.
                Stata/MP 14.1 (64-bit x86-64)
                Revision 19 May 2016
                Win 8.1

                Comment


                • #9
                  Code:
                  foreach n of numlist 1800/2000{
                  bysort COLOUR: egen title_count_`n'=count(TITEL) if YEAR==`n'
                  bysort COLOUR: egen title_count_`n'_max=max(title_count_`n')
                  replace title_count_`n'=title_count_`n'_max if missing(title_count_`n')
                  replace title_count_`n'=0 if missing(title_count_`n')
                  drop title_count_`n'_max
                  }
                  bysort COLOUR: gen unique_color=_n
                  foreach v in blue yellow green{
                  foreach m of numlist 1810/2000{
                  local i=`m'-10
                  egen `v'_moveavg_10_`m'=rowmean(title_count_`i'-title_count_`m') if unique_color==1 & color=="`v'"
                  }
                  }

                  Comment


                  • #10

                    Jay Westreich

                    Thank you very much!

                    I just ran the CODE and now it worked it produced results like this:

                    (14 missing values generated)
                    (347,349 real changes made)
                    (14 real changes made)
                    (347576 missing values generated)
                    (14 missing values generated)
                    (347,562 real changes made)
                    (14 real changes made)
                    (346946 missing values generated)
                    (407 missing values generated)
                    (346,539 real changes made)
                    (407 real changes made)


                    And then an error occurred:

                    bysort COLOUR: gen unique_color=_n

                    . foreach v in blue {
                    2. foreach m of numlist 1810/2000{
                    3. local i=`m'-10
                    4. egen `v'_moveavg_10_`m'=rowmean(title_count_`i'-title_count_`
                    > m') if unique_color==1 & color=="`v'"
                    5. }
                    6. }
                    type mismatch
                    r(109);

                    I am sorry, but to be honest I am completely lost with this, I am still a beginner to Stata and not very experienced. Therefore, I don´t really know what exactly I am doing with this command and I would prefer the easier approach by Nick Cox. But I highly appreciate your help!!!!

                    Therefore, I would like to ask Nick Cox what exactly is

                    ssc inst rangestat
                    gen byte foo = 1
                    rangestat (sum) foo, by(colours) int(year -9 0)

                    I don´t understand what is happening there, what is foo_sum and how do I generate a table and a graph out of this? I am sorry, if my question is a bit basic...
                    Best regards and already thank you very much, Rike Lich

                    Comment


                    • #11
                      What's 1 + 1 + 1? 3, clearly. Equally if you sum 1s. you're counting how many observations you have.

                      But if what you want is really a table, then surely

                      Code:
                      contract colours years 
                      tabplot colours years [w=_freq], xasis showval
                      is simpler yet as a start

                      Comment


                      • #12
                        #11 referred to tabplot, which is from the Stata Journal. You don't absolutely need it here.


                        Code:
                        contract colours years  
                        bysort colours : tab year [w=_freq]
                        might be an easier way in.

                        Comment


                        • #13
                          Dear Mr. Cox, Dear Mr. Westreich, Dear Mr. Wilson, Dear Mr. Schechter,

                          I rearranged your command and now it worked out for me:
                          Here is my final solution:

                          ssc inst rangestat
                          gen byte colour10 = 1
                          rangestat (sum) colour10, by(colour) int(year -9 0)
                          xtline colour10_sum , t(year) i(colour) overlay

                          Thank you very much for your help!!! I got a nice graph!

                          Comment


                          • #14
                            Rike: Thanks for the closure. Glad you solved your problem.

                            By the way, I have never met Carole but I understand Carole to be female. Here it's simplest and best to use the names people use themselves, not use titles, because it's quite easy to get them wrong.

                            Comment


                            • #15
                              Oh, you are right, I am sorry @Carole J. Wilson.

                              Comment

                              Working...
                              X