Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Need to Sumifs command in Stata to add a variable values with respect to other variable

    Hi dear Professors,
    I produce a Table in Excel with the help of the Sumifs formula but now I want to produce it in Stata. I am attaching an image to show the desired output table.
    Actually, I want to add all the values of a variable (invention_citation_application) with respect to year and industryid. And in the final column of the table consists of citations per firm (total citation divided by total firms).
    Example data and the output image is as follows;
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(stockcode year invention_citation_application ipatent) byte industryid
      4 2010     1   3 27
      4 2011     2   0 27
      4 2012     0   4 27
      4 2013     0   0 27
      4 2014     0   0 27
      4 2015    11   0 27
      4 2016     1   0 27
     17 2003     .   0 37
     17 2004     .   0 37
     17 2005     .   0 37
     17 2006     .   0 37
     17 2007     .   0 37
     17 2008     .   0 37
     17 2009     .   0 37
     17 2010     .   0 37
     17 2011     .   0 37
     17 2012     .   0 37
     17 2013     .   0 37
     17 2014     .   0 37
     17 2015     .   0 37
     17 2016     .   0 37
     35 2003     .   0 39
     35 2004     7   1 39
     35 2005     .   4 39
     35 2006     .   0 39
     35 2007     .   0 39
     35 2008     .   0 39
     35 2009     .   0 39
     35 2010     .   0 39
     35 2011     .   0 39
     35 2012     .   0 39
     35 2013     .   0 39
     48 2003     .   0 13
     48 2004     .   0 13
     48 2005     .   0 13
     48 2006     .   0 13
     48 2007     .   0 13
     48 2008     .   0 13
     48 2009     .   0 13
     48 2010     .   0 13
     48 2011     .   0 13
     48 2012    28   0 13
     48 2013     .   0 13
     48 2014    13   0 13
     48 2015     .   0 13
     48 2016     .   0 13
    333 2013  5241 504 38
    333 2014 10217 742 38
    end
    Click image for larger version

Name:	Statalist immage.png
Views:	1
Size:	370.9 KB
ID:	1635095


  • #2
    Perhaps this will start you in a useful direction.
    Code:
    use patentdata, clear
    collapse (count) Firms=stockcode (sum) Total=invention_citation_application, by(year)
    generate Average = Total/Firms
    format %9.0fc Firms Total
    format %9.2fc Average
    list, noobs abbreviate(30)
    
    use patentdata, clear
    collapse (count) Firms=stockcode (sum) Total=invention_citation_application, by(industryid)
    generate Average = Total/Firms
    format %9.0fc Firms Total
    format %9.2fc Average
    list, noobs abbreviate(30)
    Code:
      +----------------------------------+
      | year   Firms    Total    Average |
      |----------------------------------|
      | 2003       3        0       0.00 |
      | 2004       3        7       2.33 |
      | 2005       3        0       0.00 |
      | 2006       3        0       0.00 |
      | 2007       3        0       0.00 |
      |----------------------------------|
      | 2008       3        0       0.00 |
      | 2009       3        0       0.00 |
      | 2010       4        1       0.25 |
      | 2011       4        2       0.50 |
      | 2012       4       28       7.00 |
      |----------------------------------|
      | 2013       5    5,241   1,048.20 |
      | 2014       4   10,230   2,557.50 |
      | 2015       3       11       3.67 |
      | 2016       3        1       0.33 |
      +----------------------------------+
    Code:
      +----------------------------------------+
      | industryid   Firms    Total    Average |
      |----------------------------------------|
      |         13      14       41       2.93 |
      |         27       7       15       2.14 |
      |         37      14        0       0.00 |
      |         38       2   15,458   7,729.00 |
      |         39      11        7       0.64 |
      +----------------------------------------+

    Comment


    • #3
      If you are using Stata 17, the newly enhanced version of the table command provides a better alternative.
      Code:
      use patentdata, clear
      replace invention_citation_application = 0 if invention_citation_application==.
      table (year) (var result), statistic(frequency) ///
                                 statistic(sum invention_citation_application) ///
                                 statistic(mean invention_citation_application) ///
                                 nformat(%9.2fc mean)
      table (industryid) (var result), statistic(frequency) ///
                                 statistic(sum invention_citation_application) ///
                                 statistic(mean invention_citation_application) ///
                                 nformat(%9.2fc mean)
      Code:
      -------------------------------------------------------
              |  Frequency    invention_citation_application 
              |                       Total              Mean
      --------+----------------------------------------------
      year    |                                              
        2003  |          3                0              0.00
        2004  |          3                7              2.33
        2005  |          3                0              0.00
        2006  |          3                0              0.00
        2007  |          3                0              0.00
        2008  |          3                0              0.00
        2009  |          3                0              0.00
        2010  |          4                1              0.25
        2011  |          4                2              0.50
        2012  |          4               28              7.00
        2013  |          5            5,241          1,048.20
        2014  |          4           10,230          2,557.50
        2015  |          3               11              3.67
        2016  |          3                1              0.33
        Total |         48           15,521            323.35
      -------------------------------------------------------
      
      ----------------------------------------------------------
                 |  Frequency    invention_citation_application 
                 |                       Total              Mean
      -----------+----------------------------------------------
      industryid |                                              
        13       |         14               41              2.93
        27       |          7               15              2.14
        37       |         14                0              0.00
        38       |          2           15,458          7,729.00
        39       |         11                7              0.64
        Total    |         48           15,521            323.35
      ----------------------------------------------------------

      Comment


      • #4
        Originally posted by William Lisowski View Post
        Perhaps this will start you in a useful direction.
        Code:
        use patentdata, clear
        collapse (count) Firms=stockcode (sum) Total=invention_citation_application, by(year)
        generate Average = Total/Firms
        format %9.0fc Firms Total
        format %9.2fc Average
        list, noobs abbreviate(30)
        
        use patentdata, clear
        collapse (count) Firms=stockcode (sum) Total=invention_citation_application, by(industryid)
        generate Average = Total/Firms
        format %9.0fc Firms Total
        format %9.2fc Average
        list, noobs abbreviate(30)
        Code:
        +----------------------------------+
        | year Firms Total Average |
        |----------------------------------|
        | 2003 3 0 0.00 |
        | 2004 3 7 2.33 |
        | 2005 3 0 0.00 |
        | 2006 3 0 0.00 |
        | 2007 3 0 0.00 |
        |----------------------------------|
        | 2008 3 0 0.00 |
        | 2009 3 0 0.00 |
        | 2010 4 1 0.25 |
        | 2011 4 2 0.50 |
        | 2012 4 28 7.00 |
        |----------------------------------|
        | 2013 5 5,241 1,048.20 |
        | 2014 4 10,230 2,557.50 |
        | 2015 3 11 3.67 |
        | 2016 3 1 0.33 |
        +----------------------------------+
        Code:
        +----------------------------------------+
        | industryid Firms Total Average |
        |----------------------------------------|
        | 13 14 41 2.93 |
        | 27 7 15 2.14 |
        | 37 14 0 0.00 |
        | 38 2 15,458 7,729.00 |
        | 39 11 7 0.64 |
        +----------------------------------------+
        Thank you so much, Professor, the codes work really well and you have solved a big problem. Before, we usually used to export the data to excel and then use Sumifs formulas which used to take hours and hours to generate these tables. Once again thank you so much.
        However, there is an issue and generates an incorrect sum of firms. The picture is attached and you can see that in industry 13 there is only one firm having id # 48 but the codes generate value 14 instead of 1. Actually, the code sums the observations having industry id 13. Professor if you can correct it, it will be more valuable.
        Click image for larger version

Name:	1 for statalist.png
Views:	1
Size:	22.9 KB
ID:	1635204
        Click image for larger version

Name:	2 statalist.png
Views:	1
Size:	10.6 KB
ID:	1635205
        Last edited by tauseef ali; 07 Nov 2021, 00:53.

        Comment


        • #5
          Code:
          use patentdata, clear
          egen onefirm = tag(stockcode year)
          collapse (sum) Firms=onefirm Total=invention_citation_application, by(year)
          generate Average = Total/Firms
          format %9.0fc Firms Total
          format %9.2fc Average
          list, noobs abbreviate(30)
          
          use patentdata, clear
          egen onefirm = tag(stockcode industryid)
          collapse (sum) Firms=onefirm Total=invention_citation_application, by(industryid)
          generate Average = Total/Firms
          format %9.0fc Firms Total
          format %9.2fc Average
          list, noobs abbreviate(30)
          Code:
            +----------------------------------+
            | year   Firms    Total    Average |
            |----------------------------------|
            | 2003       3        0       0.00 |
            | 2004       3        7       2.33 |
            | 2005       3        0       0.00 |
            | 2006       3        0       0.00 |
            | 2007       3        0       0.00 |
            |----------------------------------|
            | 2008       3        0       0.00 |
            | 2009       3        0       0.00 |
            | 2010       4        1       0.25 |
            | 2011       4        2       0.50 |
            | 2012       4       28       7.00 |
            |----------------------------------|
            | 2013       5    5,241   1,048.20 |
            | 2014       4   10,230   2,557.50 |
            | 2015       3       11       3.67 |
            | 2016       3        1       0.33 |
            +----------------------------------+
          
            +-----------------------------------------+
            | industryid   Firms    Total     Average |
            |-----------------------------------------|
            |         13       1       41       41.00 |
            |         27       1       15       15.00 |
            |         37       1        0        0.00 |
            |         38       1   15,458   15,458.00 |
            |         39       1        7        7.00 |
            +-----------------------------------------+

          Comment


          • #6
            Originally posted by William Lisowski View Post
            Code:
            use patentdata, clear
            egen onefirm = tag(stockcode year)
            collapse (sum) Firms=onefirm Total=invention_citation_application, by(year)
            generate Average = Total/Firms
            format %9.0fc Firms Total
            format %9.2fc Average
            list, noobs abbreviate(30)
            
            use patentdata, clear
            egen onefirm = tag(stockcode industryid)
            collapse (sum) Firms=onefirm Total=invention_citation_application, by(industryid)
            generate Average = Total/Firms
            format %9.0fc Firms Total
            format %9.2fc Average
            list, noobs abbreviate(30)
            Code:
            +----------------------------------+
            | year Firms Total Average |
            |----------------------------------|
            | 2003 3 0 0.00 |
            | 2004 3 7 2.33 |
            | 2005 3 0 0.00 |
            | 2006 3 0 0.00 |
            | 2007 3 0 0.00 |
            |----------------------------------|
            | 2008 3 0 0.00 |
            | 2009 3 0 0.00 |
            | 2010 4 1 0.25 |
            | 2011 4 2 0.50 |
            | 2012 4 28 7.00 |
            |----------------------------------|
            | 2013 5 5,241 1,048.20 |
            | 2014 4 10,230 2,557.50 |
            | 2015 3 11 3.67 |
            | 2016 3 1 0.33 |
            +----------------------------------+
            
            +-----------------------------------------+
            | industryid Firms Total Average |
            |-----------------------------------------|
            | 13 1 41 41.00 |
            | 27 1 15 15.00 |
            | 37 1 0 0.00 |
            | 38 1 15,458 15,458.00 |
            | 39 1 7 7.00 |
            +-----------------------------------------+
            Thank you so much Professor, the codes are perfect and works well. Thanks for your valuable time and sincere efforts.

            Comment

            Working...
            X