  • gen dates greater than ...

    Hi all,

    I have data from 04jan1999 to 11apr2016

    I would like to have 4 periods: First: 01/04/1999 to 10/12/2004

    Second: 10/12/2004 to 29/12/2008

    Third: 29/12/2008 to 08/12/2014

    Fourth: 08/12/2014 to 11/04/2016

    At first, I simply created 4 different files using the drop command:
    Code:
     
     drop if date1 > date("20041210","YMD")
    etc.

    But the problem is that it's not very practical to keep loading different files.

    I would like to group all the periods in one .dta file.
    I would like to have date0, which covers the data from 04jan1999 to 11apr2016, date1, which covers the first period, and so on up to date4.

    So I tried to generate my periods by doing this:


    Code:
    gen date1 = date<td(10dec2004)
    
    format date1 %td
    But it doesn't work.

    So I guess I am doing something wrong, and I can't find a topic on the internet that explains how to create "periods".

    Maybe this can help:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 date double(crsoil crsru) byte _merge float date1
    "04/01/1999" 10.94 25.2875 3 14248
    "05/01/1999"  10.3 26.5876 3 14249
    "06/01/1999" 10.67 27.4315 3 14250
    "07/01/1999" 11.08 26.9876 3 14251
    "08/01/1999"  11.7 27.2075 3 14252
    "11/01/1999" 12.07 27.0148 3 14255
    "12/01/1999" 11.78 26.6366 3 14256
    "13/01/1999"  10.9 26.4489 3 14257
    "14/01/1999"  11.1 26.5723 3 14258
    "15/01/1999" 10.97 26.5662 3 14259
    "18/01/1999" 10.81 27.2075 3 14262
    "19/01/1999" 11.19 26.7599 3 14263
    "20/01/1999" 10.85 26.8922 3 14264
    "21/01/1999" 11.14 26.6122 3 14265
    "22/01/1999" 11.23 27.0034 3 14266
    "25/01/1999" 11.21 26.8714 3 14269
    "26/01/1999" 10.84 26.6191 3 14270
    "27/01/1999" 11.03 26.5778 3 14271
    "28/01/1999" 11.14 26.1643 3 14272
    "29/01/1999" 11.34 26.3482 3 14273
    "01/02/1999" 10.81 26.2429 3 14276
    "02/02/1999" 10.39 26.4356 3 14277
    "03/02/1999" 10.78 26.5368 3 14278
    "04/02/1999" 10.42 26.1337 3 14279
    "05/02/1999" 10.18 26.2495 3 14280
    end
    format %tdDD/NN/CCYY date1
    label values _merge _merge
    label def _merge 3 "matched (3)", modify

    Thank you for your time,


  • #2
    I'm not sure I understand what you want. This is what I think you are saying: you have a date variable in your data, and you would like to create a new variable, let's call it era, which is set to 1 for dates between 01jan1999 and 10dec2004, 2 for dates between 10dec2004 and 29dec2008, 3 for dates between 29dec2008 and 08dec2014, and 4 for dates between 08dec2014 and 11apr2016.

    Now, there is a problem here: you have specified two different values to use for any observations whose dates are 10dec2004, 29dec2008, or 08dec2014. So you need to decide for each of these boundary points whether they belong to the earlier group or the later group that they bound. For illustration here, I assume they go in the earlier group.

    Code:
    gen byte era = 1 if inrange(date, td(01jan1999), td(10dec2004))
    replace era = 2 if inrange(date, td(11dec2004), td(29dec2008))
    replace era = 3 if inrange(date, td(30dec2008), td(08dec2014))
    replace era = 4 if inrange(date, td(09dec2014), td(11apr2016))
    I hope this helps.



    • #3
      Note that date in the data example in #1 is a string variable. Clyde's approach looks good so long as it is applied to a numeric daily date variable, which naturally is entirely his intent.

      More generally, the names in the code and the names in the data example don't match up in every case. For example, date1 in the data is a numeric daily date variable, but the code implies that it is a (0, 1) indicator, or non-existent.

      Morad should tell us a consistent story.
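
To illustrate the (0, 1) indicator point: a relational expression such as date < td(10dec2004) evaluates to 1 (true) or 0 (false), so generate stores an indicator, and a %td format would then display it as 02jan1960 or 01jan1960. A minimal sketch with made-up dates follows; the names ndate and early are hypothetical and not part of the original data.

```stata
* Minimal sketch with made-up dates; ndate and early are hypothetical names
clear
set obs 3
gen ndate = td(09dec2004) + _n - 1   // 09dec2004, 10dec2004, 11dec2004
format ndate %td

* A relational expression returns 1 (true) or 0 (false), not a date
gen byte early = ndate < td(10dec2004)
list ndate early
```

Only the first observation is flagged, because the comparison is strict.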



      • #4
        Wow ... okay, hmm ...

        Sorry for the slow reply. I am not a native English speaker, so it's hard for me to understand everything on first reading.

        So, I will try to explain more clearly what I want.

        I have the evolution of oil prices and of the Russian rouble from 04jan1999 to 11apr2016.
        So for each date, I have the value of oil and of the rouble.

        I want to study some periods more carefully. So I need either to drop all the dates that aren't in these periods or to create a variable "date_i" for each period i.

        I would prefer the second option, because I can keep everything in one .dta file.



        • #5
          In other words, how can I "isolate" some dates? Let's keep it very simple. For example:

          I have:
          date        crsoil  crsru
          04/01/1999  10.94   25.2875
          05/01/1999  10.3    26.5876
          06/01/1999  10.67   27.4315
          07/01/1999  11.08   26.9876
          08/01/1999  11.7    27.2075
          11/01/1999  12.07   27.0148

          But I would like to have:
          date        crsoil  crsru    date1       date2
          04/01/1999  10.94   25.2875  04/01/1999  07/01/1999
          05/01/1999  10.3    26.5876  05/01/1999  08/01/1999
          06/01/1999  10.67   27.4315  06/01/1999  11/01/1999
          07/01/1999  11.08   26.9876
          08/01/1999  11.7    27.2075
          11/01/1999  12.07   27.0148

          Here I have split my first period, from 04/01/1999 to 11/01/1999, into two periods, "date1" and "date2".



          • #6
            Okay, that looks ugly, let's try this:
            date0       crsoil  crsru    date1       date2       date3
            04/01/1999  10.94   25.2875  04/01/1999  11/01/1999  15/01/1999
            05/01/1999  10.3    26.5876  05/01/1999  12/01/1999  18/01/1999
            06/01/1999  10.67   27.4315  06/01/1999  13/01/1999  19/01/1999
            07/01/1999  11.08   26.9876  07/01/1999  14/01/1999
            08/01/1999  11.7    27.2075  08/01/1999
            11/01/1999  12.07   27.0148
            12/01/1999  11.78   26.6366
            13/01/1999  10.9    26.4489
            14/01/1999  11.1    26.5723
            15/01/1999  10.97   26.5662
            18/01/1999  10.81   27.2075
            19/01/1999  11.19   26.7599
            Last edited by Morad Bali; 29 Apr 2016, 12:27.



            • #7
              That's better.

              Okay, I think you get the point. I want to split the overall period into sub-periods.

              That way I can draw graphs, etc., using the variables "date1" and so on, without having to open a new file each time.

              THANK YOU !!!



              • #8
                Now, I am even more confused about what you want. The dates in the date2 and date3 variables are not the dates you mentioned as the boundaries of the desired time periods earlier. Also, I don't see any systematic relationship between date2 and date1, nor between date3 and date1, nor between date2 and date3 in #6. Similarly, I do not understand how date2 is related to date1 in #5. On top of that, the values of date2 in #5 are different from those in #6, even when date1 is the same in both.

                I still think that my approach in #2 does what you want when you say:
                "I want to study some periods more carefully. So I need either to drop all the dates that aren't in these periods or to create a variable "date_i" for each period i."
                Perhaps to make it easier to see, you could do it this way:

                Code:
                gen byte era = 1 if inrange(date, td(01jan1999), td(10dec2004))
                replace era = 2 if inrange(date, td(11dec2004), td(29dec2008))
                replace era = 3 if inrange(date, td(30dec2008), td(08dec2014))
                replace era = 4 if inrange(date, td(09dec2014), td(11apr2016))
                label define era 1 "1jan1999-10dec2004" ///
                    2 "11dec2004-29dec2008" ///
                    3 "30dec2008-08dec2014" ///
                    4 "09dec2014-11apr2016"
                label values era era



                • #9
                  I am sorry, I tried to create an example with other values. But yes, it doesn't make things any easier.

                  I tried your code. When I type:
                  Code:
                  . gen byte era = 1 if inrange(date, td(01jan1999), td(10dec2004))
                  Stata says:
                  type mismatch



                  • #10
                    Right. As Nick pointed out in #3, the variable date needs to be a Stata internal numeric variable in order for my code to work. You have it as a string. So first do this:

                    Code:
                    gen _date = daily(date, "DMY")
                    drop date
                    rename _date date
                    format date %td
                    Then run my code from #8.
                    Last edited by Clyde Schechter; 29 Apr 2016, 12:59.
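
The conversion step can be checked on a few rows before touching the real data. The following is only a sketch: the three strings are taken from the data example in #1, and ndate is a hypothetical name used so that the original string variable stays available for comparison.

```stata
* Minimal sketch: three string dates taken from the data example in #1
clear
input str10 date
"04/01/1999"
"05/01/1999"
"11/01/1999"
end

* daily() with the "DMY" mask reads the strings as day/month/year
gen ndate = daily(date, "DMY")   // ndate is a hypothetical name
format ndate %td
list date ndate

* Any string that could not be parsed would show up as missing
count if missing(ndate)
```

If the count is zero, every string was read as a date, and the drop/rename steps can be applied safely.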



                    • #11
                      When I get to:
                      Code:
                      . label values era era

                      . label define era 1 "1jan1999-10dec2004"

                      . label define era 2 "11dec2004-29dec2008"
                      after the:
                      Code:
                      . label define era 2 "11dec2004-29dec2008"
                      Stata says:
                      label era already defined
                      r(110);



                      • #12
                        That's because you didn't type the /// at the end of those lines that have it. When working with code, every detail is important. You have to use it exactly as it was written.

                        Start over and run:

                        Code:
                        gen _date = daily(date, "DMY")
                        drop date
                        rename _date date
                        format date %td
                        gen byte era = 1 if inrange(date, td(01jan1999), td(10dec2004))
                        replace era = 2 if inrange(date, td(11dec2004), td(29dec2008))
                        replace era = 3 if inrange(date, td(30dec2008), td(08dec2014))
                        replace era = 4 if inrange(date, td(09dec2014), td(11apr2016))
                        label define era 1 "1jan1999-10dec2004" ///
                            2 "11dec2004-29dec2008" ///
                            3 "30dec2008-08dec2014" ///
                            4 "09dec2014-11apr2016"
                        label values era era
                        (This is the same code provided earlier, all in one place.)

                        To make sure you don't make more mistakes, copy this code from here and paste it into your do-file editor and run it from there.



                        • #13
                          If Morad is attempting to run Clyde's code in the command line rather than using a do-file, then he can run 4 separate commands as follows:
                          Code:
                          label define era 1 "1jan1999-10dec2004", modify
                          label define era 2 "11dec2004-29dec2008", modify
                          label define era 3 "30dec2008-08dec2014", modify
                          label define era 4 "09dec2014-11apr2016", modify
                          label values era era
                          If he wants to create different date variables by era, then he can use:
                          Code:
                          separate date, by(era)
                          This will be similar to #6, but not exactly.
                          Stata/MP 14.1 (64-bit x86-64)
                          Revision 19 May 2016
                          Win 8.1
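
As a rough illustration of what separate produces, here is a sketch on four made-up dates around the first boundary. The two-era assignment is an assumption made only for this example; with the real data, era would come from the code in #12.

```stata
* Minimal sketch with made-up dates; the two-era split is for illustration only
clear
set obs 4
gen date = td(09dec2004) + _n - 1   // 09dec2004 through 12dec2004
format date %td
gen byte era = cond(date <= td(10dec2004), 1, 2)

* separate creates date1 and date2, one new variable per value of era
separate date, by(era)
list date era date1 date2
```

Each new variable holds the date for observations in its own era and is missing (".") elsewhere, which is why the result resembles #6 without matching it exactly: the values stay on their original rows instead of being stacked at the top.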



                          • #14
                            @Clyde Schechter Thank you, it works! And yes, it was because I tried to copy and paste it without the do-file :/
                            @Carole J. Wilson Thank you too, the code:
                            Code:
                            separate date, by(era)
                            puts everything exactly the way I want.

                            Now, I would like to drop the "." values in the period variables that I created (date1, date2, date3 and date4). Otherwise I think they will be a problem for graphs and calculations, no?

                            I need to drop them without dropping the main "date".

                            Finally, I need the data to be sorted from the oldest to the most recent. I guess I have to do:
                            Code:
                            format date %tdDD/NN/CCYY
                            sort date
                            list in 1/5
                            desc date date
                            for each date, right?

                            Thank you !!



                            • #15
                              Oh, I realise that the "." values aren't a problem for the graphs. So I guess they aren't counted as "0" in the calculations, right?

                              In that case, the last thing I need to do is sort the dates in order, oldest to most recent.
                              Last edited by Morad Bali; 29 Apr 2016, 14:35.
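
On these two closing questions, a minimal sketch may help. Stata's summarize skips missing values (".") rather than treating them as 0, and sort on a numeric daily date orders observations from the oldest to the most recent. The four rows below reuse values from the data example in #1, but the era assignment is made up for illustration.

```stata
* Minimal sketch; the era values here are hypothetical, not the real periods
clear
input float(date crsoil) byte era
14250 10.67 1
14248 10.94 1
14255 12.07 2
14249 10.30 1
end
format date %td
separate date, by(era)

* sort orders numeric daily dates from oldest to most recent
sort date
list date crsoil era date1 date2

* date1 is missing for the era 2 row; summarize skips it rather than counting a 0
summarize date1
display r(N)
```

Sorting only the main date variable is enough, since date1 and date2 sit on the same rows and move with it.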

