  • closest date to another date variable

    Hi, I have a table with an announcement date (anndat) and a statistical period (statper), which counts as a forecast period, and I need to keep only the statper that is closest to the announcement date (anndat). How can I do this? I am a bit confused about the code I need to use.

    Thank you in advance.

    Olga
    id anndat statper
    1 14Feb2014 17apr2014
    1 7May2014 19Sep2014
    1 06Aug2014 17Nov2014
    2 30Jan2015 17Sep2015
    2 28Oct2015 18Oct2015

  • #2
    Welcome to the Stata Forum/ Statalist,

    Please provide data under code delimiters, as recommended in the FAQ.

    That said, you may generate a variable which is the difference between both dates. After that, you can - keep if - according to the appropriate range.
    Best regards,

    Marcos



    • #3
      Olga:
      as per Marcos' wise recommendation, please post more effectively, giving useful details to those interested in helping you out.
      Admittedly, I'm not perfectly clear about what you're after. Hence, please consider what follows a guess-work offspring:
      Code:
      . gen double anndat_num = date( anndat , "DMY")
      
      . format anndat_num %td
      
      . gen double statper_num = date( statper , "DMY")
      
      . format statper_num %td
      
      . g diff= anndat_num- statper_num
      
      . bysort id: egen min_diff=max(diff)
      
      . list
      
           +----------------------------------------------------------------------+
           | id      anndat     statper   anndat_~m   statper~m   diff   min_diff |
           |----------------------------------------------------------------------|
        1. |  1   14Feb2014   17apr2014   14feb2014   17apr2014    -62        -62 |
        2. |  1    7May2014   19Sep2014   07may2014   19sep2014   -135        -62 |
        3. |  1   06Aug2014   17Nov2014   06aug2014   17nov2014   -103        -62 |
        4. |  2   30Jan2015   17Sep2015   30jan2015   17sep2015   -230         10 |
        5. |  2   28Oct2015   18Oct2015   28oct2015   18oct2015     10         10 |
           +----------------------------------------------------------------------+
      
      .
      Kind regards,
      Carlo
      (Stata 18.0 SE)



      • #4
        Carlo Lazzaro naturally meant to use the min() function of egen to get the minimum.



        • #5
          Nick:
          in all likelihood, I've reversed the terms in the subtraction reported in my previous reply.
          As you spotted, it makes more sense to go as follows:
          Code:
          . g diff_bis= statper_num-anndat_num
          
          . bysort id: egen min_diff_bis=min(diff_bis)
          
          . list
          
               +--------------------------------------------------------------------------------------------+
               | id      anndat     statper   anndat_~m   statper~m   diff   min_diff   diff_bis   min_di~s |
               |--------------------------------------------------------------------------------------------|
            1. |  1   14Feb2014   17apr2014   14feb2014   17apr2014    -62        -62         62         62 |
            2. |  1    7May2014   19Sep2014   07may2014   19sep2014   -135        -62        135         62 |
            3. |  1   06Aug2014   17Nov2014   06aug2014   17nov2014   -103        -62        103         62 |
            4. |  2   30Jan2015   17Sep2015   30jan2015   17sep2015   -230         10        230        -10 |
            5. |  2   28Oct2015   18Oct2015   28oct2015   18oct2015     10         10        -10        -10 |
               +--------------------------------------------------------------------------------------------+
          
          .
          Kind regards,
          Carlo
          (Stata 18.0 SE)



          • #6
            Sorry, I was just looking at the bottom line. That said, on a second look I have to suggest that "closest" implies working with min(abs(difference)) unless there is a substantive reason otherwise.
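
            A minimal sketch of the min(abs(difference)) suggestion, using the example data from the first post. The variable names absdiff and min_absdiff are assumptions introduced here for illustration; they do not appear elsewhere in the thread.

            ```stata
            clear
            input id str9 anndat str9 statper
            1 14Feb2014 17apr2014
            1 7May2014 19Sep2014
            1 06Aug2014 17Nov2014
            2 30Jan2015 17Sep2015
            2 28Oct2015 18Oct2015
            end

            * convert the string dates to numeric daily dates
            gen double anndat_num = date(anndat, "DMY")
            gen double statper_num = date(statper, "DMY")
            format anndat_num statper_num %td

            * absdiff and min_absdiff are hypothetical names used for illustration
            gen absdiff = abs(statper_num - anndat_num)

            * smallest absolute gap (in days) within each id
            bysort id: egen min_absdiff = min(absdiff)

            list id anndat statper absdiff min_absdiff, sepby(id)
            ```

            Working with the absolute value means the sign convention of the subtraction no longer matters: a statper 10 days before anndat and one 10 days after it count as equally close.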



            • #7
              As someone else said in England some years ago: "I agree with Nick".
              Kind regards,
              Carlo
              (Stata 18.0 SE)



              • #8
                You're referring to a television debate before the 2010 election in Britain. Agreeing with Nick [Clegg] didn't have very good long-term consequences for anybody concerned.



                • #9
                  Nick:
                  yes, you're correct.
                  Luckily enough, the quantitative Nick is a source of everlasting inspiration and good consequences in the short run, too!
                  Kind regards,
                  Carlo
                  (Stata 18.0 SE)



                  • #10
                    Carlo and Nick, thank you both for your help; the code worked perfectly! Just one more question: how can I keep only the lines with the smallest difference between dates for each id?
                    So it should look like this:
                    id anndat statper diff
                    1 14Feb2014 17apr2014 -62
                    2 28Oct2015 18Oct2015 10



                    • #11
                      Olga:
                      Code:
                       . sort id diff
                      
                      . bysort id: keep if _n==_N
                      should do the trick.
                      Last edited by Carlo Lazzaro; 11 Jan 2020, 02:55.
                      Kind regards,
                      Carlo
                      (Stata 18.0 SE)



                      • #12
                        The smallest difference will be sorted to the first observation in each panel, not the last.
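
                        A minimal sketch of keeping the first observation in each panel, assuming "smallest" means smallest in absolute value as suggested in #6. The variable name absdiff is an assumption introduced here for illustration.

                        ```stata
                        clear
                        input id str9 anndat str9 statper
                        1 14Feb2014 17apr2014
                        1 7May2014 19Sep2014
                        1 06Aug2014 17Nov2014
                        2 30Jan2015 17Sep2015
                        2 28Oct2015 18Oct2015
                        end

                        * convert the string dates to numeric daily dates
                        gen double anndat_num = date(anndat, "DMY")
                        gen double statper_num = date(statper, "DMY")
                        format anndat_num statper_num %td

                        * signed difference as defined earlier in the thread
                        gen diff = anndat_num - statper_num

                        * absdiff is a hypothetical name used for illustration
                        gen absdiff = abs(diff)

                        * sort within id by absolute gap, then keep the first observation
                        bysort id (absdiff): keep if _n == 1

                        list id anndat statper diff
                        ```

                        If two observations in a panel tie for the smallest gap, keep if _n == 1 retains only one of them, chosen arbitrarily; replacing that condition with keep if absdiff == absdiff[1] would retain all tied observations.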



                        • #13
                          Nick:
                          the complete code is reported below:
                          Code:
                          input id str20 anndat str20 statper
                          1 14Feb2014 17apr2014
                          1 7May2014 19Sep2014
                          1 06Aug2014 17Nov2014
                          2 30Jan2015 17Sep2015
                          2 28Oct2015 18Oct2015
                          end
                          gen double anndat_num = date(anndat, "DMY")
                          format anndat_num %td
                          gen double statper_num = date(statper, "DMY")
                          format statper_num %td
                          gen diff = anndat_num - statper_num
                          bysort id: egen min_diff=min(abs(diff))
                          sort id diff
                          list
                          
                               +----------------------------------------------------------------------+
                               | id      anndat     statper   anndat_~m   statper~m   diff   min_diff |
                               |----------------------------------------------------------------------|
                            1. |  1    7May2014   19Sep2014   07may2014   19sep2014   -135         62 |
                            2. |  1   06Aug2014   17Nov2014   06aug2014   17nov2014   -103         62 |
                            3. |  1   14Feb2014   17apr2014   14feb2014   17apr2014    -62         62 |
                            4. |  2   30Jan2015   17Sep2015   30jan2015   17sep2015   -230         10 |
                            5. |  2   28Oct2015   18Oct2015   28oct2015   18oct2015     10         10 |
                               +----------------------------------------------------------------------+
                          
                          .
                          I'm not clear whether the original poster wants to keep, for each panel, the observation with the smallest difference between the two time variables (and, if that were the aim of her research, I fail to see what the calculation of the minimum per panel is useful for) or something else.
                          Obviously, I could well have missed something as the thread went on.
                          Kind regards,
                          Carlo
                          (Stata 18.0 SE)
