Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Commands of counting

    Dear,
    I have a dataset, id is number of the specific firm, if it failed within the observed period, the failure equals 1, otherwise 0.
    I need to count in each year, how many firms fail not observations.

    I use this command:
    codebook id if failure==1 & year==2010
    codebook id if failure==0 & year==2010

    Are there any more convenient way to count? because actually I have 10-years data and also other variables as conditions, it costs too much time if I count year by year.

    Thank you in advance!

    Best
    Josh

    * Example generated by -dataex-. For more info, type help dataex clear id revenue year failure 1 0 2010 0 1 0 2012 0 2 0 2010 0 2 0 2011 0 2 0 2012 0 2 9000 2013 0 2 0 2014 0 2 0 2015 0 3 13000 2010 1 3 16500 2011 1 3 0 2012 1 3 0 2013 1 3 18100 2014 1 3 0 2015 1 4 15000 2010 1 4 39000 2011 1 5 0 2014 0 6 434573 2010 1 6 210080 2011 1 6 404003 2012 1 6 474938 2013 1 6 294145 2014 1 6 319562 2015 1 end

  • #2
    Sample of dataset:

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id year failure)
    1 2010 0
    1 2011 0
    1 2012 0
    1 2013 0
    1 2014 0
    1 2015 0
    2 2010 1
    2 2011 1
    2 2012 1
    2 2013 1
    2 2014 1
    3 2010 1
    3 2011 1
    3 2012 1
    3 2013 1
    3 2014 1
    3 2015 1
    4 2010 0
    4 2011 0
    4 2012 0
    4 2013 0
    4 2014 0
    4 2015 0
    end
    ------------------

    Comment


    • #3
      Code:
      tab year failure
      might help.

      Comment


      • #4
        Originally posted by Nick Cox View Post
        Code:
        tab year failure
        might help.
        Hi Nick,
        Thank you for your suggestion, but what I want is the number of firms that failed or not rather than the observations.
        By using "tab year failure", I still got the observations.

        Best
        Josh

        Comment


        • #5
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input float(id year failure)
          1 2010 0
          1 2011 0
          1 2012 0
          1 2013 0
          1 2014 0
          1 2015 0
          2 2010 1
          2 2011 1
          2 2012 1
          2 2013 1
          2 2014 1
          3 2010 1
          3 2011 1
          3 2012 1
          3 2013 1
          3 2014 1
          3 2015 1
          4 2010 0
          4 2011 0
          4 2012 0
          4 2013 0
          4 2014 0
          4 2015 0
          end
          
          tab year failure 
          
                   |        failure
                year |         0          1 |     Total
          -----------+----------------------+----------
                2010 |         2          2 |         4 
                2011 |         2          2 |         4 
                2012 |         2          2 |         4 
                2013 |         2          2 |         4 
                2014 |         2          2 |         4 
                2015 |         2          1 |         3 
          -----------+----------------------+----------
               Total |        12         11 |        23
          From the data I see that two firms failed in 2010. That is what the table says too. What is the correct number that you want?

          Comment


          • #6
            Well, in the example data you show, there is only one observation per id in any year, so the number of firms meeting any condition in a given year will be the same as the number of observations meeting that condition in that year. On the assumption that in your full data set, some firms may appear more than once in the same year, the following code will do what you want:
            Code:
            by year failure id, sort: gen byte wanted = cond(failure, _n == 1, 0)
            by year: replace wanted = sum(wanted)
            by year: replace wanted = wanted[_N]

            Comment


            • #7
              Originally posted by Nick Cox View Post
              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input float(id year failure)
              1 2010 0
              1 2011 0
              1 2012 0
              1 2013 0
              1 2014 0
              1 2015 0
              2 2010 1
              2 2011 1
              2 2012 1
              2 2013 1
              2 2014 1
              3 2010 1
              3 2011 1
              3 2012 1
              3 2013 1
              3 2014 1
              3 2015 1
              4 2010 0
              4 2011 0
              4 2012 0
              4 2013 0
              4 2014 0
              4 2015 0
              end
              
              tab year failure
              
              | failure
              year | 0 1 | Total
              -----------+----------------------+----------
              2010 | 2 2 | 4
              2011 | 2 2 | 4
              2012 | 2 2 | 4
              2013 | 2 2 | 4
              2014 | 2 2 | 4
              2015 | 2 1 | 3
              -----------+----------------------+----------
              Total | 12 11 | 23
              From the data I see that two firms failed in 2010. That is what the table says too. What is the correct number that you want?

              Sorry Nick, I made a mistake.

              The sample of dataset should be like this:
              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input float(id year failure birth)
              1 2010 0 2010
              1 2011 0 2010
              1 2012 0 2010
              1 2013 0 2010
              1 2014 0 2010
              1 2015 0 2010
              2 2010 1 2012
              2 2011 1 2012
              2 2012 1 2012
              2 2013 1 2012
              2 2014 1 2012
              3 2010 1 2010
              3 2011 1 2010
              3 2012 1 2010
              3 2013 1 2010
              3 2014 1 2010
              3 2015 1 2010
              4 2010 0 2011
              4 2011 0 2011
              4 2012 0 2011
              4 2013 0 2011
              4 2014 0 2011
              4 2015 0 2011
              end
              ------------------ copy up to and including the previous line ------------------

              The variable "birth" the establishment year of the firm, I need to count in each birth year, how many firms fail.

              what I used are:
              codebook id if failure==1 & birth==2010
              codebook id if failure==0 & birth==2010

              I'd like to know some more convenient commands.

              Thank you!

              Best
              Josh

              Comment


              • #8
                See #6. Just substitute birth for year in the code and you'll have what you're asking for.

                Comment


                • #9
                  Originally posted by Clyde Schechter View Post
                  See #6. Just substitute birth for year in the code and you'll have what you're asking for.
                  Thank you Clyde!
                  But finally, I need to make a crosstab where the first row is failure=1 or 0, the first column is the list of birth, how can I get this through your commands?
                  Like this:
                  Birth number of firms that failure=1 number of firms that failure=0
                  2010 1 1
                  2011 0 1
                  2012 1 0
                  Best
                  Josh
                  Last edited by Josh Wang; 03 Mar 2023, 06:06.

                  Comment


                  • #10
                    So the whole thing, reflecting what I said in #6, and #8 would be
                    Code:
                    by birth failure id, sort: gen byte wanted = cond(failure, _n == 1, 0)
                    by birth: replace wanted = sum(wanted)
                    by birth: replace wanted = wanted[_N]
                    
                    egen tag = tag(birth)
                    tab birth wanted if tag

                    Comment


                    • #11
                      Originally posted by Clyde Schechter View Post
                      So the whole thing, reflecting what I said in #6, and #8 would be
                      Code:
                      by birth failure id, sort: gen byte wanted = cond(failure, _n == 1, 0)
                      by birth: replace wanted = sum(wanted)
                      by birth: replace wanted = wanted[_N]
                      
                      egen tag = tag(birth)
                      tab birth wanted if tag
                      Thank you for your help!

                      Best
                      Josh

                      Comment

                      Working...
                      X