Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Need to count with several conditions

    I want to count how many contracts the company already had in the inventory when the company signs the new contract.

    Here is a part of the sample.
    id start date end date
    1 4/29/2008 7/10/2008
    1 11/8/2007 11/2/2008
    1 7/11/2008 12/11/2008
    1 4/10/2008 10/21/2008
    1 1/13/2009 4/9/2009
    1 9/6/2007 5/6/2008
    1 1/22/2009 6/11/2009
    2 7/21/2008 12/31/2008
    2 2/13/2008 2/13/2009
    2 5/17/2008 10/27/2009
    2 3/20/2008 9/11/2008
    2 2/13/2013 6/24/2013
    In the first row, the company A signed the contract on 4/29/2008 and finished the contract 7/10/2008. So, as of 4/29/2008, the company A already had 3 contracts: 1. 11/8/2007 ~ 11/2/2008, 2. 4/10/2008 ~ 10/21/2018, and 3. 9/6/2007 ~ 5/6/2008.

    So, I created two criteria to count how many contract the company already had when the company signs the new contract.
    1. other contracts' start dates must be earlier than the contract that the company wants to sign.
    2. other contracts' end date must be later than the contract that the company wants to sign. If other contracts' end dates are earlier than the current contract, those contracts were not in the inventory.

    I can count the number of contract, using excel function like "=COUNTIFS(B:B,"<"&B2,C:C,">"&B2,A:A,A2)".
    id start date end date count
    1 4/29/2008 7/10/2008 3
    1 11/8/2007 11/2/2008 1
    1 7/11/2008 12/11/2008 2
    1 4/10/2008 10/21/2008 2
    1 1/13/2009 4/9/2009 0
    1 9/6/2007 5/6/2008 0
    1 1/22/2009 6/11/2009 1
    2 7/21/2008 12/31/2008 3
    2 2/13/2008 2/13/2009 0
    2 5/17/2008 10/27/2009 2
    2 3/20/2008 9/11/2008 1
    2 2/13/2013 6/24/2013 0

    But, I don't know how to count in STATA. Please help me solve this problem.

    Thanks!

  • #2
    Not the best, but this help:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte id str9 startdate str10 enddate
    1 "4/29/2008" "7/10/2008"
    1 "11/8/2007" "11/2/2008"
    1 "7/11/2008" "12/11/2008"
    1 "4/10/2008" "10/21/2008"
    1 "1/13/2009" "4/9/2009"  
    1 "9/6/2007" "5/6/2008"  
    1 "1/22/2009" "6/11/2009"
    2 "7/21/2008" "12/31/2008"
    2 "2/13/2008" "2/13/2009"
    2 "5/17/2008" "10/27/2009"
    2 "3/20/2008" "9/11/2008"
    2 "2/13/2013" "6/24/2013"
    end
    Code:
    gen sd=date(startdate, "MDY",2008)
    format sd %td
    gen ed=date(enddate, "MDY",2008)
    format ed %td
    Code:
    gen count=0
    local N=_N
    forvalue i=1/`N' {
        count if (sd<sd[`i'])&(ed>sd[`i'])&(id==id[`i'])
        replace count=r(N) in `i'
        }
    I dont know why you have to do it in Stata when Excel is more convenient this case.

    Comment


    • #3
      See https://www.stata-journal.com/sjpdf....iclenum=dm0068 for some technique. That reveals an otherwise unpredictable search term as dm0068 will find discussions in this forum.

      Comment


      • #4
        Thank you for your help, Leon.

        The reason why I want to do it in Stata is that the sample size is so huge like about 100,000. I just copied a very small part of the full sample. When I tried to do it in Excel, it took forever.
        So, I want to do it in Stata.

        Comment


        • #5
          Thank you, Nick.
          That article is very useful !!

          Comment

          Working...
          X