
  • Quarterly date variable

    Hi, I am struggling a bit with dates in Stata. I have a variable of the form (199103, 199106, 199109, 199112, 199203, 199206, ...). The variable period represents the date for my quarterly data. The first four digits are the year and the last two digits are the last month of the quarter. I want to convert the variable to the format 1991q1, 1991q2, 1991q3, 1991q4, and so on, and then tell Stata that it represents a date variable.

    Initially, the code
     format periods %tq
    results in periods being converted to a variable that looks exactly the same for all entries (e.g. 2.0e+05).

    Can anyone help with that, please?

    Thanks and Happy New Year to you all.

  • #2
    I also tried
     gen  fqdate = quarterly(periods, "YQ")
    but I got an error message
     type mismatch
    I use dataex to show a sample of my data:

    * Example generated by -dataex-. To install: ssc install dataex
    input long period float gdp
    198312 11.7282
    198403 12.8756
    198406 10.8643
    198409  7.4104
    198412  6.0227
    198503  8.9046
    198506  6.3087
    198509  8.8616
    198512   5.432
    198603  5.7989
    198606  3.4901
    198609  5.7761
    198612  4.3823
    198703  5.8463
    198706  7.4011
    198709  6.7168
    198712 10.3538
    198803  5.5181
    198806  9.5237
    198809  7.2397
    198812  8.8219
    198903  8.7496
    198906  7.5118
    198909  6.0453
    198912  3.6774
    199003  9.1395
    199006  5.8199
    199009  3.7196
    199012  -.4107
    199103  2.1151
    199106  5.9897
    199109  4.9598
    199112  3.9753
    199203  6.6242
    199206  7.1751
    199209  5.9313
    199212  6.9198
    199303  3.0564
    199306   4.913
    199309  4.4413
    199312  7.6613
    199403  6.0179
    199406  7.6955
    199409  4.6172
    199412  6.9417
    199503  3.7209
    199506  3.1972
    199509   5.452
    199512  4.9152
    199603  4.8874
    199606  8.8111
    199609  4.9263
    199612  6.4296
    199703  5.6674
    199706  7.3245
    199709  6.7059
    199712  4.5155
    199803  4.6957
    199806  4.8089
    199809  6.9235
    199812  8.0702
    199903  5.3097
    199906  4.7351
    199909  6.6601
    199912  9.1004
    200003  4.2947
    200006 10.2321
    200009   3.114
    200012  4.5118
    200103  1.3744
    200106   5.053
    200109   .0414
    200112  2.3437
    200203  5.0687
    200206  3.7586
    200209   3.795
    200212  2.4393
    200303   4.628
    200306  5.1028
    200309  9.2542
    200312   6.761
    200403  5.9364
    200406  6.5967
    200409  6.2593
    200412  6.4405
    200503  8.2519
    200506  5.1019
    200509  7.3241
    200512   5.445
    200603  8.2327
    200606  4.4962
    200609  3.1882
    200612   4.619
    200703  4.8283
    200706  5.4212
    200709  4.1512
    200712  3.2117
    200803  -.4595
    200806  4.0019
    200809   .8126
    end


    • #3
      The %tq format (and other Stata date display formats) will only work properly when the underlying variable is a correct Stata internal format numeric date --which yours are not. You have a numeric coding that is easy for humans to read, but not appropriate for Stata to work with.

      So first we have to convert these numbers to Stata internal format dates.

      * Example generated by -dataex-. To install: ssc install dataex
      input float lisa_date
      199103
      199106
      199109
      199112
      199203
      199206
      end
      gen year = int(lisa_date/100)
      gen month = mod(lisa_date, 100)
      gen qdate = qofd(mdy(month, 1, year))
      format qdate %tq
      list, noobs clean
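
      The "type mismatch" in #2 most likely arises because quarterly() parses strings, while the variable there is numeric. A minimal sketch of what that function and the %tq format expect, using literal values chosen purely for illustration:

          * quarterly() takes a string such as "1991q1" and returns the quarterly count
          display quarterly("1991q1", "YQ")
          * shows 124, that is, 124 quarters after 1960q1

          * %tq turns that count back into a readable quarter
          display %tq 124
          * shows 1991q1

      A value such as 199103 is far outside the range of counts that %tq can sensibly display, which is consistent with the unhelpful 2.0e+05 reported in #1.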


      • #4
        Thanks a lot Clyde. It worked very well. I always struggle with dates in Stata. Thank you


        • #5
          We all struggle, to some degree, in working with dates and times in Stata. If you have not already done so, you should thoroughly review the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF. After that, the help datetime documentation will usually be enough to point the way. All Stata manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.

          Even with a few years' experience, I never write date and time code without checking with help datetime, if not when I'm writing the code, then when the code I've written fails, as it inevitably does when I haven't checked first.

          Added: Crossed with #6, where the much more experienced Clyde Schechter makes me feel better about always needing to check the documentation, by explaining how complicated the matter really is.
          Last edited by William Lisowski; 02 Jan 2017, 14:24.


          • #6
            I always struggle with dates in Stata.
            To some extent, we all do.

            At a very fundamental level, dates are inherently problematic because there are so many different ways of representing them in general use. We get our data sets from a variety of sources, and different sources often use different ways of representing dates. Even the same source often is inconsistent: I've seen plenty of data sets that mix-and-match dates in formats like 2jan2017, 1/2/17, and 20170102 all in the same dataset (and sometimes even in the same string variable)!

            When there are so many different ways of writing dates, and when computations with dates require a regularized, uniform approach, then necessarily the apparatus needed to navigate among the various representations is complicated. It has to be in order to have sufficient flexibility for the task. That's why it seems like Stata has a million different functions for going between different types of date representations. But that makes it hard to remember which function does what, and exactly what the syntax for each one is, even though Stata has taken a pretty systematic approach to the names and syntax it gives these functions.

            I think the fundamental thing that has to be remembered is that any calculation with dates in Stata requires the use of Stata internal format (SIF) dates, and that these SIF dates are counts of the number of time units from 1 Jan 1960 to the given date. (That is, a daily SIF date is the number of days from 1 Jan 1960; a quarterly SIF date is the number of quarters from the first quarter of 1960. A clock SIF datetime is the number of milliseconds from 1 Jan 1960 00:00:00.) So if you have a numeric variable that looks like a date to the human eye, you know right away it can't be right. It has to be a number that generates no immediate brain recognition as a date, and it must be of the right magnitude given the dates being represented and the unit of time involved.
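
            To make the magnitudes concrete, here is a small sketch that evaluates the same point in time, the last quarter of 1983 (the first observation in #2), in three different units. The particular functions and the date are chosen here only for illustration:

                * daily: days since 1 Jan 1960
                display mdy(12, 1, 1983)
                * shows 8735

                * monthly: months since 1960m1
                display ym(1983, 12)
                * shows 287

                * quarterly: quarters since 1960q1
                display yq(1983, 4)
                * shows 95

                * a display format turns the count back into something readable
                display %tq yq(1983, 4)
                * shows 1983q4

            None of 8735, 287, or 95 looks anything like 198312, and each has a different magnitude because the unit of time differs.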

            The chapter on datetimes in [D] is well written and has lots of examples. But it is, of course, impossible to remember all the details for long. Everyone who uses Stata with any regularity needs to read this chapter, and probably re-read it periodically as well. Fortunately -help datetime- is also very well organized and has lots of internal links to help you quickly track down the right function. So if you are familiar with the general concept and have read the manual chapter a few times, most of the time you can find what you need in the help file without too much difficulty. But I don't think even the most experienced among us can consistently handle dates without going back to the help files, and sometimes to the manual: we may get really good at handling a few specific types of date representations that come up most often in our work. But when we encounter something infrequently, memory just isn't adequate.

            All of that said, Nick Cox recently authored a program -numdate- which can be obtained from SSC. It's pretty good at "looking" at both string and human-readable-numeric dates and then figuring out the appropriate transformations for you. It's not full-blown artificial intelligence, but it certainly handles a wide variety of cases with relatively little effort.

            Added: Crossed with #5, where William Lisowski makes most of the same points far more succinctly!


            • #7
              William Lisowski messaged me privately to ask if I meant the chapter on datetimes in [U] (rather than in [D], which is almost the same as what's in -help datetime-). He is indeed right. Learning the basics from Chapter 24 Working with dates and times in [U] is the best starting point. Sorry for the misdirection.


              • #8
                All the advice here is excellent. But because there are so many date functions in Stata, there is usually more than one way to do it. Here's another, a variant on one of Clyde's that just requires noticing that 1 2 3 4 are 3 6 9 12 divided by 3.

                input float lisa_date
                199103
                199106
                199109
                199112
                199203
                199206
                end
                gen qdate = yq(floor(lisa_date/100), mod(lisa_date, 100)/3)
                format qdate %tq
                list , sep(0)

                     +-------------------+
                     | lisa_d~e    qdate |
                     |-------------------|
                  1. |   199103   1991q1 |
                  2. |   199106   1991q2 |
                  3. |   199109   1991q3 |
                  4. |   199112   1991q4 |
                  5. |   199203   1992q1 |
                  6. |   199206   1992q2 |
                     +-------------------+


                • #9
                  While Nick's code wins for brevity and elegance, Clyde's code wins for pedagogy.

                  Clyde's code instructs us in a general approach: initially convert whatever you are given into a SIF date of some form (a daily date in this example), then use Stata SIF-to-SIF conversion functions as required to convert that SIF date into the SIF date with the periodicity you need (a quarterly date in this example), and then apply the appropriate format to that result. He could have created a SIF monthly date as a more obvious starting point, but then would have had to convert it first to a SIF daily date before converting that to a quarterly date, since there is no function that goes directly from monthly to quarterly.
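
                  For comparison, a minimal sketch of the monthly-first route just described. It assumes the numeric yyyymm variable is named period, as in #2; mdate and qdate_m are hypothetical names introduced only for this illustration:

                      * build a SIF monthly date from the year and month parts
                      gen mdate = ym(floor(period/100), mod(period, 100))
                      format mdate %tm

                      * no direct monthly-to-quarterly function, so go monthly -> daily -> quarterly
                      gen qdate_m = qofd(dofm(mdate))
                      format qdate_m %tq

                  Here dofm() returns the first day of each month as a SIF daily date, and qofd() maps that day to its quarter, so 199103 should end up displayed as 1991q1.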

                  Clyde's approach is consistent with the principles discussed in his post at #6, especially, for me, dealing with the seemingly infinite varieties of dates and times with my most definitely (small) finite capacity for remembering fiddly details.
                  Last edited by William Lisowski; 04 Jan 2017, 13:03.