Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Working with dates and loop

    Is it possible to get a loop that works with every string variable which are in date format (CCYY-NN-DD but string) to destring them?

    For example :

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 Da byte Bod str10 Pre
    "2020-04-17" 12 "2020-04-17"
    "2020-05-22" 10 "2020-05-22"
    "2020-09-17" 23 "2020-09-17"
    "2020-04-21" 24 "2020-04-21"
    end
    I would like to get my 2 dates variables in this format : %tdDD-NN-CCYY (but i need a loop because i've tons of date variables in my database)

    Thanks you !

  • #2
    Install the user contributed -numdate-, and try the following:

    Code:
    . foreach var of varlist Da Pre {
      2. numdate daily `var'Date = `var', pattern(YMD)
      3. }
    
    . list, sep(0)
    
         +-------------------------------------------------------+
         |         Da   Bod          Pre      DaDate     PreDate |
         |-------------------------------------------------------|
      1. | 2020-04-17    12   2020-04-17   17apr2020   17apr2020 |
      2. | 2020-05-22    10   2020-05-22   22may2020   22may2020 |
      3. | 2020-09-17    23   2020-09-17   17sep2020   17sep2020 |
      4. | 2020-04-21    24   2020-04-21   21apr2020   21apr2020 |
         +-------------------------------------------------------+

    Comment


    • #3
      It's possible, but it is better to use other commands to do the loop for you. The "format" you say the variables are now in is (not yet) known to Stata but just discernible by an intelligent user. That is the small problem here.

      Note that destring is quite definitely not what to do here. It will at best just make the job of processing these dates more time-consuming.

      https://www.stata-journal.com/articl...article=dm0098 says much more.

      Unfortunately, one of the examples in the manual entry for destring is misleading. It shows the command being used to remove spaces from a string variable containing dates, but all that does is make the later conversion to a numeric date variable more difficult. (I am credited fairly as an original author of destring but I didn't write that example.)

      There are various ways to do it. Here I use
      findname from the Stata Journal.

      If you
      search as here, you will get several results: all that might matter is to install the latest version (from SJ 20(2) as I write but anyone reading this much later might see a later update) and to have a look at the original write-up in 2010 if desired.

      Code:
      . search findname, sj
      
      Search of official help files, FAQs, Examples, and Stata Journals
      
      SJ-20-2 dm0048_4  . . . . . . . . . . . . . . . . Software update for findname
              (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q2/20   SJ 20(2):504
              new options include columns()
      
      SJ-15-2 dm0048_3  . . . . . . . . . . . . . . . . Software update for findname
              (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q2/15   SJ 15(2):605--606
              updated to be able to find strL variables
      
      SJ-12-1 dm0048_2  . . . . . . . . . . . . . . . . Software update for findname
              (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q1/12   SJ 12(1):167
              correction for handling embedded double quote characters
      
      SJ-10-4 dm0048_1  . . . . . . . . . . . . . . . . Software update for findname
              (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q4/10   SJ 10(4):691
              update for not option
      
      SJ-10-2 dm0048  . . . . . . . . . . . . . .  Speaking Stata: Finding variables
              (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q2/10   SJ 10(2):281--296
              produces a list of variable names showing which variables
              have specific properties, such as being of string type, or
              having value labels attached, or having a date format


      Here's one way that you can find your date variables, with a criterion that the first four characters are always between 1980 and 2020. (1980 I just pluck out of the air; you may need an earlier date). Then you can loop over those variables.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str10 Da byte Bod str10 Pre
      "2020-04-17" 12 "2020-04-17"
      "2020-05-22" 10 "2020-05-22"
      "2020-09-17" 23 "2020-09-17"
      "2020-04-21" 24 "2020-04-21"
      end
      
      findname, all(inrange(substr(@, 1, 4), "1980", "2020")) local(wanted)
      
      edit `wanted'
      
      foreach v of local wanted { 
          gen n_`v' = daily(`v', "YMD")
          format n_`v' %tdDD-NN-CCYY 
      }
       
      list
      Your problem could be simpler than this. For example, if you had no string variables other these date variables. then a command you've already been told about


      Code:
      ds, has(type string)
      is going to be sufficient to get all the names together.

      Even simpler: if you only had two date variables, as in your example, then go directly to a loop


      Code:
      foreach v in Da Pre { 
          gen n_`v' = daily(`v', "YMD")
          format n_`v' %tdDD-NN-CCYY 
      }
      .

      Comment


      • #4
        Ty !
        I found this solution. It's a bit longer but it works. I gen a new variable *_2 for each variable, then replace those new variable with the function date. Every nondate variable has his *_2 with all missing values so i can drop if _2 has no value. It work but i'll try yours ! Thanks !

        Code:
         
        ds, has(type string)
        foreach x in `r(varlist)'{
        gen `x'_2=.
        order `x'_2,after(`x')
        }
        
        foreach x in `r(varlist)'{
        replace `x'_2=date(`x',"YMD")
        format `x'_2 %tdDD-NN-CCYY date_modif_2
        }
        
        ds *_2
        foreach var in `r(varlist)' {
        capture assert mi(`var')
        if !_rc {
        drop `var'
        }
        }

        Comment


        • #5
          Even with that brute force approach in #4 you only need one loop, something more like this.


          Code:
          ds, has(type string)
          foreach x in `r(varlist)'{
              gen `x'_2 = date(`x',"YMD")
              quietly count if missing(`x'_2)
              if r(N) == _N drop `x'_2
              else { 
                  order `x'_2, after(`x')
                  format `x'_2 %tdDD-NN-CCYY
              }
          }
          Last edited by Nick Cox; 14 Oct 2020, 04:03.

          Comment


          • #6
            Thanks that's perfect !!

            Comment


            • #7
              Thanks for the thanks, but #5 is not perfect at all. Let's back up: you got conflicting advice because of some vagueness in your problem.

              Joro Kolev correctly saw two date variables in your precise example so wrote code for that set-up. The onus was on you to come back with a better statement of your problem if it was really more complicated..

              My answer was a different style, which doesn't make mine right and @Joro's wrong. In fact, it is good discipline to answer the question asked and let the OP come back. But you seemed to be hinting that you have many more string variables than date variables, and perhaps hinting that you have many date variables than two, so I gave you one way of being smart about homing in on the date variables.

              If #5 or something like it "worked", good. But it is not good style. It's brute force.

              Also, a good way to pose the question might have been (say) I have about 10 date variables which are string, but I have many more string variables. (And so on.)

              Comment


              • #8
                Yes, you're right.
                In my database there is something like 20 date variables and 100 other variables (with many strings variables)
                I put an example but yes I should have said more about my database.
                I tried your solution with findname and it didn't work. I try to see why

                Ok, i think that's because i've some missings values in my date variables (sorry i didn't see that)
                like that :

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input str10 Da byte Bod str10 Pre str2 var4
                "2020-04-17" 12 "2020-04-17" "10"
                "2020-05-22" 10 "."          "." 
                "2020-09-17"  . "2020-09-17" "34"
                "."          24 "2020-04-21" "20"
                end
                Last edited by Raph Selenite; 14 Oct 2020, 06:26.

                Comment


                • #9
                  OK, but I can't comment helpfully without more detail on "didn't work". The possibilities range from (a) you applied it wrongly through (b) there are complications in your data my example doesn't cope with to (c) you found a bug in findname.

                  Comment


                  • #10

                    findname doesn't generate an error but just worked on only 3 variables. And i guess that's because i don't have any missings values on thoses 3 variables (and all others date variables have it).
                    But i really don't know
                    Last edited by Raph Selenite; 14 Oct 2020, 06:59.

                    Comment


                    • #11
                      The code in #3 won't fall over just because there are missing values.

                      Comment


                      • #12
                        So i really don't know why it doesn't work
                        Thanks you ! and sorry for the inconvenience

                        Comment

                        Working...
                        X