  • Set up a dummy based on the values in the first three years for each ID

    I have a dataset containing id, revenue, and year, covering 2010 to 2015.
    I want to create a dummy variable. For each id, the dummy should be 1 if REVENUE is greater than 0 in at least one of that id's first three years, and 0 otherwise.
    I have posted an example below and filled in the dummy variable by hand to show what I mean. My actual sample is large, so I would like to know which Stata commands can do this. Thank you in advance.
    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte id long revenue int year float dummy
    1      0 2010 0
    1      0 2012 0
    2      0 2010 0
    2      0 2011 0
    2      0 2012 0
    2   9000 2013 0
    2      0 2014 0
    2      0 2015 0
    3  13000 2010 1
    3  16500 2011 1
    3      0 2012 1
    3      0 2013 1
    3  18100 2014 1
    3      0 2015 1
    4  15000 2010 1
    4  39000 2011 1
    5      0 2014 0
    6 434573 2010 1
    6 210080 2011 1
    6 404003 2012 1
    6 474938 2013 1
    6 294145 2014 1
    6 319562 2015 1
    end
    ------------------ copy up to and including the previous line ------------------

    Listed 23 out of 23 observations

  • #2
    Code:
    bys id (year) : gen byte dummy2 = (max(revenue[1],revenue[2],revenue[3]) > 0)
    assert dummy2 == dummy
    Last edited by Hemanshu Kumar; 01 Sep 2022, 12:49.
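
    A note on ids with fewer than three observations (ids 1, 4, and 5 in the example): for those, revenue[2] or revenue[3] is out of range and evaluates to missing. The sketch below shows how max() and the comparison handle missing values; the numbers are illustrative and not taken from the data.

    ```stata
    * max() ignores missing arguments unless all of them are missing
    display max(0, ., .)
    * displays 0
    display max(., ., .)
    * displays .
    * a missing value compares as larger than any number, so this is true
    display (. > 0)
    * displays 1
    ```

    So an id whose first three revenue values were all missing would get dummy2 = 1. The example data contain no missing revenue, so the assert above is unaffected.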


    • #3
      Originally posted by Hemanshu Kumar
      Code:
      bys id (year) : gen byte dummy2 = (max(revenue[1],revenue[2],revenue[3]) > 0)
      assert dummy2 == dummy
      Thank you for your help!


      • #4
        As you have only 3 periods, what Hemanshu proposes is the better way. If you have many periods and you do not want to refer to them explicitly, the following would work too:

        Code:
        . bysort id (year): gen ofinterest = _n<4
        
        . by id: egen dummy3 = max(revenue>0/ofinterest)
        
        . assert dummy3==dummy
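
        The expression revenue>0/ofinterest is evaluated as revenue > (0/ofinterest), because division is performed before the comparison. In the first three observations of each id this reduces to revenue > 0; elsewhere 0/0 is missing, and revenue > . is false. Below is a sketch of the same idea with the condition written out. The names early_pos and dummy5 are hypothetical, and the !missing() guard is an addition that the code above does not have.

        ```stata
        * flag positive, nonmissing revenue within the first three years of each id
        bysort id (year): gen byte early_pos = _n <= 3 & revenue > 0 & !missing(revenue)

        * spread the flag to every observation of the id
        by id: egen byte dummy5 = max(early_pos)

        assert dummy5 == dummy
        ```

        Keeping _n inside generate and giving egen an ordinary variable avoids relying on how egen handles _n.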


        • #5
          And we have here an interesting case confirming that the Stata manual is right to warn us against using explicit subscripting with -egen- functions. The following one-step solution, which should logically work, does not:

          Code:
          . bysort id (year): egen dummy4 = max(revenue>0/_n<4)
          
          . assert dummy4 == dummy
          9 contradictions in 23 observations
          assertion is false
          r(9);


          • #6
            Originally posted by Joro Kolev
            And we have here an interesting case confirming that the Stata manual is right to warn us against using explicit subscripting with -egen- functions. The following one-step solution, which should logically work, does not:

            Code:
            . bysort id (year): egen dummy4 = max(revenue>0/_n<4)
            
            . assert dummy4 == dummy
            9 contradictions in 23 observations
            assertion is false
            r(9);
            Thank you for your suggestion!


            • #7
              Originally posted by Joro Kolev
              And we have here an interesting case confirming that the Stata manual is right to warn us against using explicit subscripting with -egen- functions. The following one-step solution, which should logically work, does not:

              Code:
              . bysort id (year): egen dummy4 = max(revenue>0/_n<4)
              
              . assert dummy4 == dummy
              9 contradictions in 23 observations
              assertion is false
              r(9);
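              Actually, I think this is only because of the order in which logical and arithmetic operators are evaluated. Without parentheses, revenue>0/_n<4 is read as (revenue > 0/_n) < 4, which is true in every observation. This works fine: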
              Code:
              bysort id (year): egen dummy4 = max(revenue>0/(_n<4))
              assert dummy4 == dummy
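
              The precedence point can be checked with literal numbers. This is a minimal sketch; the values 5, 0, 2 and 4 are arbitrary stand-ins for revenue and _n.

              ```stata
              * division first, then the comparisons: (5 > 0/2) < 4 is 1 < 4
              display (5 > 0/2 < 4)
              * displays 1
              * a zero "revenue" gives the same answer: (0 > 0) < 4 is 0 < 4
              display (0 > 0/2 < 4)
              * displays 1
              * with parentheses, inside the window: 5 > 0/1
              display (5 > 0/(2 < 4))
              * displays 1
              * with parentheses, outside the window: 0/0 is missing, and 5 > . is false
              display (5 > 0/(5 < 4))
              * displays 0
              ```

              The first two results explain the 9 contradictions: the unparenthesized expression is 1 everywhere, and 9 observations in the example have dummy equal to 0.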
