Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • show only first day of quarter

    please help me with this table

    Code:
                01jan2008
    23.17    02jan2008
    22.49    03jan2008
    23.94    04jan2008
    
    25.71    28mar2008
    25.61    31mar2008
    22.68    01apr2008
    23.43    02apr2008
    23.21    03apr2008
    22.45    04apr2008
    22.42    07apr2008
    22.36    08apr2008
    22.81    09apr2008
    21.98    10apr2008
    23.46    11apr2008
    23.82    14apr2008
    22.78    15apr2008
    
    21.58    19jun2008
    22.87    20jun2008
    22.64    23jun2008
    22.42    24jun2008
    21.14    25jun2008
    23.93    26jun2008
    23.44    27jun2008
    23.95    30jun2008
    23.65    01jul2008
    25.92    02jul2008
    24.78    03jul2008
                04jul2008
    25.78    07jul2008
    23.15    08jul2008
    25.23    09jul2008
    25.59    10jul2008
    27.49    11jul2008
    28.48    14jul2008
    28.54    15jul2008
    
    33.85    22sep2008
    35.72    23sep2008
    35.19    24sep2008
    32.82    25sep2008
    34.74    26sep2008
    46.72    29sep2008
    39.39    30sep2008
    39.81    01oct2008
    45.26    02oct2008
    45.14    03oct2008
    52.05    06oct2008
    53.68    07oct2008
    57.53    08oct2008
    63.92    09oct2008
    69.95    10oct2008
    54.99    13oct2008
    55.13    14oct2008
    
    ... 
    and other years
    in this table i should drop all, except only first day of quarter, and if first day is missing it should be replaced by second date
    23.17 01jan2008
    22.68 01apr2008
    23.65 01jul2008
    39.81 01oct2008
    etc 01jan2009


  • #2
    So, let's call the first variable v1 and the other variable date. I'm going to assume that this is a genuine numeric Stata internal date variable, not a string variable that humans read as dates, and not some other kind of number with value labels attached. You want to keep only those dates that represent the first date in each quarter

    Code:
    gen quarter = qofd(date)
    format quarter %tq
    
    by quarter (date), sort: keep if _n == 1
    This will leave you with only the earliest date in the quarter your data set contains.

    The way you posed your question leads me to wonder, though, whether you want to go farther and retain that date only if it is the first or second calendar date in the quarter (so that a quarter would go completely unrepresented if, say, the first date from the quarter in the data set is the third date in the quarter or later.) If that's what you want, then follow-up the above with the following:

    Code:
    drop if date-dofq(quarter) >= 2
    See online help for date and time functions, and the manual section for (a lot) more detail. The key here is the use of the qofd() function (and perhaps the dofq() function).

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      ...
      01jan2008 23.17 02jan2008
      small problem is in first jan, it is not working day and at all years obs missing. How replace obs value from 2nd jan to first

      and better if code can find all missing obs and replace by second obs
      Last edited by Eva Smith; 10 Mar 2016, 04:12.

      Comment


      • #4
        Clyde's post answered that already.

        Comment


        • #5
          Code:
          v1   date
                         01jan2008
          22.68    01apr2008
          23.65    01jul2008
          39.81    01oct2008
                      01jan2009
          42.28    01apr2009
          26.22    01jul2009
          28.27    01oct2009
                       01jan2010
          17.47    01apr2010
          32.86    01jul2010
          22.5    01oct2010
          17.4    01apr2011
          15.87    01jul2011
                       02jan2012
          15.64    02apr2012
          16.8    02jul2012
          16.32    01oct2012
                        01jan2013
          13.58    01apr2013
          16.37    01jul2013
          15.54    01oct2013
                      01jan2014
          13.1    01apr2014
          11.15    01jul2014
          16.71    01oct2014
                      01jan2015
          15.11    01apr2015
          16.09    01jul2015
          22.55    01oct2015
                      01jan2016
          i dont understand. i tried

          Code:
           
           drop if date-dofq(quarter) >= 2
          it deleted all obs after 2016q1, missing values did not replaced

          Code:
          v1          date    quarter
          16.6    02oct2013    2013q4
                     01jan2014    2014q1
          14.23    02jan2014    2014q1
          13.1    01apr2014    2014q2
          13.09    02apr2014    2014q2
          11.15    01jul2014    2014q3
          10.82    02jul2014    2014q3
          16.71    01oct2014    2014q4
          16.16    02oct2014    2014q4
                       01jan2015    2015q1
          17.79    02jan2015    2015q1
          15.11    01apr2015    2015q2
          14.67    02apr2015    2015q2
          16.09    01jul2015    2015q3
          16.79    02jul2015    2015q3
          22.55    01oct2015    2015q4
          20.94    02oct2015    2015q4
                       01jan2016    2016q1

          Comment


          • #6
            I think there has been a misunderstanding. When Eva referred to "missing," I interpreted that to mean that there was no observation with that date. It now appears she means that v1 has a missing value for that observation. And when that happens, she wants to use the first non-missing value for the quarter. So I think what she wants is this:

            Code:
            drop if missing(v1)
            gen quarter = qofd(date)
            format quarter %tq
            
            by quarter (date), sort: keep if _n == 1
            This will leave her with a reduced data set containing one observation per quarter. That observation will be the earliest date in the quarter that has a non-missing value for v1.

            It is not clear to me whether she also wants to change the value of date to be the first day of that quarter, even though the value of v1 actually comes from a later date. If she does want to do that, she can follow the above code with:

            Code:
            gen new_date = dofq(quarter)
            format new_date %td

            Comment

            Working...
            X