Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Drop conditional on a date DDMMYYYY format

    Hi all,

    I have patent data where the date is given in DDMMYYY form. For example, 06jun2006. Any idea how to drop observations which, for example, are dated before 01jan2001? It is currently in int storage type (not sure if that is relevant).


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(uniquetardeal_id deal_id) double patent_tar int gdate_tar
    1 1 7056957 16958
    1 1 6491950 15684
    1 1 6746691 16230
    1 1 6020305 14641
    1 1 6676967 16083
    1 1 6596308 15908
    1 1 7056494 16958
    1 1 6155251 14949
    1 1 6406715 15509
    1 1 6818229 16391
    end
    format %d gdate_tar
    Kind regards,
    Chris

  • #2
    Code:
    drop if gdate_tar < mdy(1, 1, 2001)
    is one way to do it.

    Strictly the data are given to Stata as integers. A display format is neither here nor there for solving this, as the same display format applies to all the values in the variable, whereas you want to drop some of them.

    Comment


    • #3
      try this (and many other ways also)
      Code:
      drop if gdate_tar<date("01jan2001","DMY")
      in general, reviewing
      Code:
      help datetime
      would be very useful

      Comment


      • #4
        Thank you both very much!

        Comment


        • #5
          A quick second question. If I wanted to see how many patents were granted to a firm in the 5 years following a M&A deal date, say deal_date, what would be the best method? I thought of maybe creating a dummy variable for each patent to equal 1 if the patent was issued (gdate_tar) in the following 5 years by
          Code:
          gen 5years = .
          replace 5years = 1 if gdate_tar > deal_date & gdate_tar < deal_date5
          where deal_date5 would be the date 5 years after the deal, which, for now, I can only see if I were to manually change the year of each deal_date to 5 years later. I would then have to sum all of the 5years variables which equal 1 to find out how many patents were issued.
          I feel like there is surely an easier way to do this?

          Thanks!
          Chris

          Comment


          • #6
            Let's assume you have a variable deal_date that is constant within deals. You can create a variable sum5yr (note that your proposed variable name 5years is illegal in Stata) with the following code. I don't know whether these calculations should be done within both uniquetardeal_id and deal_id or just one of them--adjust as needed.

            The code uses -egen total- with two -cond()- functions to count the number of cases that where ((gdate_tar - deal_date)/365) is less than 5 and greater than 0. See help egen and help cond()

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(uniquetardeal_id deal_id) double patent_tar int gdate_tar float deal_date
            1 1 6020305 14641 14976
            1 1 6155251 14949 14976
            1 1 6406715 15509 14976
            1 1 6491950 15684 14976
            1 1 6596308 15908 14976
            1 1 6676967 16083 14976
            1 1 6746691 16230 14976
            1 1 6818229 16391 14976
            1 1 7056494 16958 14976
            1 1 7056957 16958 14976
            end
            format %d gdate_tar
            format %td deal_date
            
            bysort uniquetardeal_id deal_id: egen sum5yr=total( cond( ((gdate_tar-deal_date)/365)<5 , 1, 0, 0) & cond( ((gdate_tar-deal_date)/365)>0 , 1, 0, 0))
            Stata/MP 14.1 (64-bit x86-64)
            Revision 19 May 2016
            Win 8.1

            Comment


            • #7
              Hi Carole,

              Thanks for your reply! That seems to be a good idea. I'm getting a type mismatch with that command. I've tried figuring out if there was an error with the syntax but can't seem to figure it out. Any ideas?

              Comment


              • #8
                Possibly one of the variables gdate_tar or deal_date is a string variable in your dataset.
                If that doesn't help, please post a self-contained example leading to the error message, with dataex example and exact code used.

                Comment


                • #9
                  Hi Jorrit,

                  You're right. Both are string variables. If I destring them, they become missing values. Any idea how to do this while still using dates? Maybe convert them to 3 seperate variables with day, month, year, and then do another conditional egen?

                  Comment


                  • #10

                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input long patent_acq str9(gdate_acq deal_date)
                    7418285 "26aug2008" "15dec2006"
                    5679573 "21oct1997" "15dec2006"
                    8063151 "22nov2011" "15dec2006"
                    8226681 "24jul2012" "15dec2006"
                    5654466 "05aug1997" "15dec2006"
                    5712250 "27jan1998" "15dec2006"
                    8148384 "03apr2012" "15dec2006"
                    7458483 "02dec2008" "15dec2006"
                    6204431 "20mar2001" "15dec2006"
                    5658893 "19aug1997" "15dec2006"
                    7303922 "04dec2007" "15dec2006"
                    6569157 "27may2003" "15dec2006"
                    8465758 "18jun2013" "15dec2006"
                    7612181 "03nov2009" "15dec2006"
                    5789219 "04aug1998" "15dec2006"
                    7011939 "14mar2006" "15dec2006"
                    8187320 "29may2012" "15dec2006"
                    7169888 "30jan2007" "15dec2006"
                    5915583 "29jun1999" "15dec2006"
                    5914328 "22jun1999" "15dec2006"
                    5932545 "03aug1999" "15dec2006"
                    7115591 "03oct2006" "15dec2006"
                    8597309 "03dec2013" "15dec2006"
                    5817798 "06oct1998" "15dec2006"
                    8382817 "26feb2013" "15dec2006"
                    7905999 "15mar2011" "15dec2006"
                    7891510 "22feb2011" "15dec2006"
                    6232455 "15may2001" "15dec2006"
                    8034926 "11oct2011" "15dec2006"
                    6818772 "16nov2004" "15dec2006"
                    5849336 "15dec1998" "15dec2006"
                    6124341 "26sep2000" "15dec2006"
                    7820812 "26oct2010" "15dec2006"
                    8193367 "05jun2012" "15dec2006"
                    6444817 "03sep2002" "15dec2006"
                    5654411 "05aug1997" "15dec2006"
                    6710185 "23mar2004" "15dec2006"
                    5512581 "30apr1996" "15dec2006"
                    5821074 "13oct1998" "15dec2006"
                    5616714 "01apr1997" "15dec2006"
                    6720166 "13apr2004" "15dec2006"
                    5989222 "23nov1999" "15dec2006"
                    5645838 "08jul1997" "15dec2006"
                    6201001 "13mar2001" "15dec2006"
                    7846170 "07dec2010" "15dec2006"
                    5869252 "09feb1999" "15dec2006"
                    6274385 "14aug2001" "15dec2006"
                    6080844 "27jun2000" "15dec2006"
                    5858986 "12jan1999" "15dec2006"
                    6080874 "27jun2000" "15dec2006"
                    end
                    That's my data. What I want to do is count how many of the patents were granted after the deal date. NB: I changed the name of gdate_tar to gdate_acq.

                    Comment


                    • #11
                      Somethign like:
                      Code:
                      ren deal_date deal_date_str
                      ren gdate_acq gdate_acq_str
                      gen gdate_acq = date(gdate_acq_str, "DMY")
                      gen deal_date = date(deal_date_str, "DMY")
                      format %d gdate_acq
                      format %d deal_date
                      gen grantedafter = 0
                      replace grantedafter = 1 if gdate_acq>deal_date
                      I'm not exactly clear on how you'd want the count. There's no group over which to sum in this example. If it is just what you wanted before, and what was answered by Carole, then just transform your string variables to Stata dates as shown here, and use what Carole advised

                      Comment


                      • #12
                        That worked perfectly! Much appreciated.

                        Comment

                        Working...
                        X