Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • how to tabulate a set of data, by one variable and its frequency? thanks

    Hello everyone,

    I am working on a research project and have problem to tabulate my data. I need your help.




    Here is the pilot test sample.



    Company Name GM PM F2005

    A 1 2 7

    B 1 2 6

    C 2 2 6

    D 3 3 7



    I want to show my results in this way:



    F2005 Number of companies Average of GM Average of PM

    7 2 2 2.5

    6 2 1.5 2



    I tried to use:

    Tabstat GM PM, by (F2005).

    But I could get all the other information on the above table except the number of Firms ( or the Frequency of F2005)



    Thanks very much.



  • #2
    the sample results is
    F_2006 | GM2006 PM2006 ROA2006 RG2006
    ---------+----------------------------------------
    0 | 18 18 18 18
    | -101.9927 -134.0311 -.7365348 -.5183173
    ---------+----------------------------------------
    1 | 68 68 68 68
    | -9.865715 -16.9031 -.4244573 -.1469809
    ---------+----------------------------------------
    2 | 130 130 130 130
    | .1076478 -.9472313 -.1657835 .1018003
    ---------+----------------------------------------
    3 | 145 145 145 145
    | -.1798726 -.8081128 -.065857 .4555298
    ---------+----------------------------------------
    4 | 610 610 610 610
    | .2907559 -.1440296 .0179201 .2546547
    ---------+----------------------------------------
    5 | 400 400 400 400
    | .4856186 .0600523 .0471835 .2564475
    ---------+----------------------------------------
    6 | 416 416 416 416
    | .4227893 .1343961 .1805606 .173869
    ---------+----------------------------------------
    7 | 492 492 492 492
    | .3887568 .127154 .109578 .4223786
    ---------+----------------------------------------
    Total | 2279 2279 2279 2279
    | -.7810739 -1.64443 .037564 .2624044

    based on tabstat GM2006 PM2006 ROA2006 RG2006 ,by(F_2006) statistics(n mean)
    since all the number in each row is the same, I want to add one column, to show the number of each row.

    Comment


    • #3
      The -table- command will do you what you want. The only wrinkle is that you have to provide another variable that it can "count" for the number of observations.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str1 Company_Name float(GM PM F2005)
      "A" 1 2 7
      "B" 1 2 6
      "C" 2 2 6
      "D" 3 3 7
      end
      
      gen long obs_no = _N
      
      table F2005, c(count obs_no mean GM mean PM) format(%2.1f)
      Before your next post here, please read the FAQ, especially #12, for directions on how to show data examples (the -dataex- command) and code or Stata output (code delimiters). Notice that I have posted your data here using -dataex-. To recreate that data example, all you have to do is copy that into the do editor and run it. The way in which you posted your data example was difficult to import into Stata: that part took me much longer than figuring out your solution, coding it, and testing the code. The output that you post in #2 is barely readable because it aligns poorly. Had it been posted between code delimiters (as I have done here) it would align neatly and be very easy for everyone to read. Thank you.

      Comment


      • #4
        Originally posted by Clyde Schechter View Post
        The -table- command will do you what you want. The only wrinkle is that you have to provide another variable that it can "count" for the number of observations.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str1 Company_Name float(GM PM F2005)
        "A" 1 2 7
        "B" 1 2 6
        "C" 2 2 6
        "D" 3 3 7
        end
        
        gen long obs_no = _N
        
        table F2005, c(count obs_no mean GM mean PM) format(%2.1f)
        Before your next post here, please read the FAQ, especially #12, for directions on how to show data examples (the -dataex- command) and code or Stata output (code delimiters). Notice that I have posted your data here using -dataex-. To recreate that data example, all you have to do is copy that into the do editor and run it. The way in which you posted your data example was difficult to import into Stata: that part took me much longer than figuring out your solution, coding it, and testing the code. The output that you post in #2 is barely readable because it aligns poorly. Had it been posted between code delimiters (as I have done here) it would align neatly and be very easy for everyone to read. Thank you.
        Thanks very much, Clyde. this is my first time here, I will read the FAQ first.

        Have a wonderful day.

        Tonni

        Comment


        • #5
          Originally posted by Clyde Schechter View Post
          The -table- command will do you what you want. The only wrinkle is that you have to provide another variable that it can "count" for the number of observations.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str1 Company_Name float(GM PM F2005)
          "A" 1 2 7
          "B" 1 2 6
          "C" 2 2 6
          "D" 3 3 7
          end
          
          
          gen long obs_no = _N
          
          table F2005, c(count obs_no mean GM mean PM) format(%2.1f)
          Before your next post here, please read the FAQ, especially #12, for directions on how to show data examples (the -dataex- command) and code or Stata output (code delimiters). Notice that I have posted your data here using -dataex-. To recreate that data example, all you have to do is copy that into the do editor and run it. The way in which you posted your data example was difficult to import into Stata: that part took me much longer than figuring out your solution, coding it, and testing the code. The output that you post in #2 is barely readable because it aligns poorly. Had it been posted between code delimiters (as I have done here) it would align neatly and be very easy for everyone to read. Thank you.

          Hi Clyde.
          I still get some problems here. I need your help. thanks again!

          Problem 1: when I try to put seven variables using your method, Stata shows "too many stats"
          Problem 2: I want to add one row at the bottom of the tabulated table, and show the total number of all the samples in the "N" (frequency) column, and the mean value of each variable.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float(F_2005 GM2005 PM2005 ROA2005 RG2005 deltaGM2005 deltaPM2005 deltaROA2005)
          7  .4223719  .1035839  .13868746 .06405103    .02458623  .011790313  .03209211
          7  .4882729  .0970657   .0751159  .0357556     .0146164   .05626224  .04768959
          7  .4162511  .2146647  .06826299  .2962829    .04287511    .0776531   .0241012
          7  .4517104 .01225262 .010102886 .04624564   .000719577   .03978623 .031179626
          6 .24383886 .04037718  .10297008  .1660827 -.0014713854 .0003058575 .005100206
          7 .25617433 .09313928  .10886706 .26609364   .023646235    .0797439   .0933786
          7  .9377011 .06634097  .01920784 .09537964     .1468569   .05499785 .016470917
          7    .68093 .07946688  .06451876 .07128288    .01855743   .07764187  .06299094
          7  .6470634 .07586633  .11563747  .2473369   .015321076  .010145523  .01003854
          7  .1876313 .03978259  .03572454 .08145221   .002597973   .01918491 .018459603
          end
          Last edited by Tonni Xia; 25 Aug 2017, 07:25.

          Comment


          • #6
            I wonder whether you wish something like this:

            Code:
            . foreach var of varlist GM2005-deltaROA2005 {
              2. by F_2005, sort : egen float mean`var' = mean(`var')
              3. by F_2005, sort : egen float myn`var' = count(GM2005)
              4. }
            Best regards,

            Marcos

            Comment


            • #7
              Originally posted by Marcos Almeida View Post
              I wonder whether you wish something like this:

              Code:
              . foreach var of varlist GM2005-deltaROA2005 {
              2. by F_2005, sort : egen float mean`var' = mean(`var')
              3. by F_2005, sort : egen float myn`var' = count(GM2005)
              4. }
              Hi Marcos,
              Thanks very much! appreciated it.

              Can you elaborate it a little bit? I am still confused about it. I followed your instructions, but Stata showed "invalid syntax"

              Thanks again

              Best,

              Tonni

              Comment


              • #8
                -table- will only allow you to display four statistics. There are a number of user-written programs out there that do various kinds of tables and perhaps one of them is suitable. The problem is there are so many that I find it difficult to keep track of which one does what, so I don't really know much about any of them.

                You can get very close to what you want without too much difficulty using basic Stata commands as follows:

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input float(F_2005 GM2005 PM2005 ROA2005 RG2005 deltaGM2005 deltaPM2005 deltaROA2005)
                7  .4223719  .1035839  .13868746 .06405103    .02458623  .011790313  .03209211
                7  .4882729  .0970657   .0751159  .0357556     .0146164   .05626224  .04768959
                7  .4162511  .2146647  .06826299  .2962829    .04287511    .0776531   .0241012
                7  .4517104 .01225262 .010102886 .04624564   .000719577   .03978623 .031179626
                6 .24383886 .04037718  .10297008  .1660827 -.0014713854 .0003058575 .005100206
                7 .25617433 .09313928  .10886706 .26609364   .023646235    .0797439   .0933786
                7  .9377011 .06634097  .01920784 .09537964     .1468569   .05499785 .016470917
                7    .68093 .07946688  .06451876 .07128288    .01855743   .07764187  .06299094
                7  .6470634 .07586633  .11563747  .2473369   .015321076  .010145523  .01003854
                7  .1876313 .03978259  .03572454 .08145221   .002597973   .01918491 .018459603
                end
                
                
                preserve
                gen long obs_no = _n
                collapse (count) n_companies = obs_no (mean) GM2005-deltaROA2005, by(F_2005)
                
                list, noobs clean sum(n_companies) mean(GM2005-deltaROA2005)
                restore
                The Sum and Mean lines at the end are separate from each other, but it definitely has the information you want.

                Comment


                • #9
                  Originally posted by Clyde Schechter View Post
                  -table- will only allow you to display four statistics. There are a number of user-written programs out there that do various kinds of tables and perhaps one of them is suitable. The problem is there are so many that I find it difficult to keep track of which one does what, so I don't really know much about any of them.

                  You can get very close to what you want without too much difficulty using basic Stata commands as follows:

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input float(F_2005 GM2005 PM2005 ROA2005 RG2005 deltaGM2005 deltaPM2005 deltaROA2005)
                  7 .4223719 .1035839 .13868746 .06405103 .02458623 .011790313 .03209211
                  7 .4882729 .0970657 .0751159 .0357556 .0146164 .05626224 .04768959
                  7 .4162511 .2146647 .06826299 .2962829 .04287511 .0776531 .0241012
                  7 .4517104 .01225262 .010102886 .04624564 .000719577 .03978623 .031179626
                  6 .24383886 .04037718 .10297008 .1660827 -.0014713854 .0003058575 .005100206
                  7 .25617433 .09313928 .10886706 .26609364 .023646235 .0797439 .0933786
                  7 .9377011 .06634097 .01920784 .09537964 .1468569 .05499785 .016470917
                  7 .68093 .07946688 .06451876 .07128288 .01855743 .07764187 .06299094
                  7 .6470634 .07586633 .11563747 .2473369 .015321076 .010145523 .01003854
                  7 .1876313 .03978259 .03572454 .08145221 .002597973 .01918491 .018459603
                  end
                  
                  
                  preserve
                  gen long obs_no = _n
                  collapse (count) n_companies = obs_no (mean) GM2005-deltaROA2005, by(F_2005)
                  
                  list, noobs clean sum(n_companies) mean(GM2005-deltaROA2005)
                  restore
                  The Sum and Mean lines at the end are separate from each other, but it definitely has the information you want.
                  Thanks very much , Clyde!! Greatly appreciated it.

                  I've already get what I want, expect that the table looks a little bit ugly.

                  Have a beautiful day.

                  Best,

                  Tonni

                  Comment


                  • #10
                    Hi Marcos,
                    Thanks very much! appreciated it.

                    Can you elaborate it a little bit? I am still confused about it. I followed your instructions, but Stata showed "invalid syntax"
                    Hello Tony,

                    I just copied and pasted your toy example, hence you may copy and paste mine and it shall work accordingly.

                    Shall you wish to tyype, please make sure you typed verbatim, for example, " ` " and " ' " are rather tricky to tell apart, at first glance.

                    Edited: by the way, please make sure you didn't copy and paste "2.", "3," and "4.", for Stata will do that for you.
                    Last edited by Marcos Almeida; 25 Aug 2017, 14:30.
                    Best regards,

                    Marcos

                    Comment


                    • #11
                      Originally posted by Clyde Schechter View Post
                      -table- will only allow you to display four statistics. There are a number of user-written programs out there that do various kinds of tables and perhaps one of them is suitable. The problem is there are so many that I find it difficult to keep track of which one does what, so I don't really know much about any of them.

                      You can get very close to what you want without too much difficulty using basic Stata commands as follows:

                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input float(F_2005 GM2005 PM2005 ROA2005 RG2005 deltaGM2005 deltaPM2005 deltaROA2005)
                      7 .4223719 .1035839 .13868746 .06405103 .02458623 .011790313 .03209211
                      7 .4882729 .0970657 .0751159 .0357556 .0146164 .05626224 .04768959
                      7 .4162511 .2146647 .06826299 .2962829 .04287511 .0776531 .0241012
                      7 .4517104 .01225262 .010102886 .04624564 .000719577 .03978623 .031179626
                      6 .24383886 .04037718 .10297008 .1660827 -.0014713854 .0003058575 .005100206
                      7 .25617433 .09313928 .10886706 .26609364 .023646235 .0797439 .0933786
                      7 .9377011 .06634097 .01920784 .09537964 .1468569 .05499785 .016470917
                      7 .68093 .07946688 .06451876 .07128288 .01855743 .07764187 .06299094
                      7 .6470634 .07586633 .11563747 .2473369 .015321076 .010145523 .01003854
                      7 .1876313 .03978259 .03572454 .08145221 .002597973 .01918491 .018459603
                      end
                      
                      
                      preserve
                      gen long obs_no = _n
                      collapse (count) n_companies = obs_no (mean) GM2005-deltaROA2005, by(F_2005)
                      
                      list, noobs clean sum(n_companies) mean(GM2005-deltaROA2005)
                      restore
                      The Sum and Mean lines at the end are separate from each other, but it definitely has the information you want.
                      Hi Clyde,
                      I got a new problem related this topic here. last time I tried to tabulate F_2005 and other variables. all stuff is in one year. But now I have another similar columns in year 2006.

                      my question is, I want to show the table that has such functions.
                      no matter F_2005 or F_2006, if they both have the same number, calculate the mean value of all other variables.

                      in other words, last time I wanted to tabulate the result within one year data, but this time I want to tabulate them in two years data.

                      here is some sample data.
                      Thanks very very much again

                      Tonni

                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input float(GM2006 PM2006 ROA2006 dGM2006 dPM2006 dROA2006 F_2006 GM2005 PM2005 ROA2005 dGM2005 dPM2005 dROA2005 F_2005)
                      .42764595  .10512872   .1513479   .005274028   .001544811    .01266046 7  .4223719   .1035839  .13868746    .02458623  .011790313    .03209211 7
                       .5179225  .13461168   .0951246   .029649585    .03754599   .020008713 7  .4882729   .0970657   .0751159     .0146164   .05626224    .04768959 7
                       .5505673  .34846795     .15959    .13431627    .13380326      .091327 7  .4162511   .2146647  .06826299    .04287511    .0776531     .0241012 7
                       .2442427  .04337969  .10244846  .0004038215   .003002517 -.0005216226 6 .24383886  .04037718  .10297008 -.0014713854 .0003058575   .005100206 6
                       .9381364   .0810686  .02543153  .0004352927   .014727622   .006223688 7  .9377011  .06634097  .01920784     .1468569   .05499785   .016470917 7
                       .4878532  .04240376  .04266287   .036142796   .030151136   .032559983 7  .4517104  .01225262 .010102886   .000719577   .03978623   .031179626 7
                       .6945034  .12584378  .10466795   .013573408     .0463769    .04014919 7    .68093  .07946688  .06451876    .01855743   .07764187    .06299094 7
                      .26810515  .09531061  .11710104   .011930823  .0021713302   .008233987 7 .25617433  .09313928  .10886706   .023646235    .0797439     .0933786 7
                      .18315934  .03899354  .05179102   .004796326   .003911104  -.005408023 6   .178363 .035082433  .05719904  .0011811256 .0003181659  .0045148693 7
                      .25874156 .030692413  .04562467   .001915276   .005441682   .006356943 7 .25682628  .02525073  .03926772   .003569305   .09407928     .1403097 7
                      .21361578  .04064085  .07047139   .003116205    .00652735    .01163359 7 .21049957 .034113497   .0588378   .003255129  .007642988   .013663903 7
                       .6483468   .0791054   .1173261  .0012834668  .0032390654  .0016886443 7  .6470634  .07586633  .11563747   .015321076  .010145523    .01003854 7
                       .2291368  .07402143  .13173556  -.007338285    .01212991   .030843005 6 .23647507  .06189152  .10089256    .04490945  .018484522   .029418506 7
                        .272165  .06640737  .06732167    .01146996   .006558526   .016073763 7 .26069504  .05984884  .05124791   .015118688  .009811733    .01000781 7
                       .4394805  .10997683  .14716731   .028430283    .04372971    .05431064 7  .4110502  .06624712  .09285667    .02731222   .02289649   .023175493 7
                        .850734  .13661098    .066227   -.01038456   .068538375     .0344602 6  .8611186  .06807261 .031766795    .05633384  .016603027   .008936852 7
                       .2928491  .14358795  .16492324    .05427231    .07903132    .08582815 7  .2385768  .06455663  .07909509    .02559304   .03580318    .04748476 7
                      .26453975   .1030941  .04398192    .04580356     .0275307    .01517788 7  .2187362   .0755634 .028804043 -.0009117872  .026116006   .011341844 6
                      .04842058 .009819632  .03813223  .0038890615  .0012903875   .002327703 7 .04453152 .008529245  .03580453 .00033332035  .010475475    .04415073 7
                       .2826601  .04803318 .068971545  -.004409671   .012766507   .020247165 6 .28706977  .03526667  .04872438    .01160115  .009205934   .016574226 7
                       .3455108  .09698156   .0750921    .00846222   .010543086   .008024551 7  .3370486  .08643848 .067067556  -.005596846   .01307463   .012303136 6
                      .11730388 .017254207  .04147106   .007686771  .0045187315    .00771952 7  .1096171 .012735476  .03375154   .010354906  .005741407    .01548555 6
                      .57622826  .34437135  .17103183     .1231471    .13115548    .09885478 7  .4530812  .21321587  .07217705    .17856854    .2221067    .07432034 7
                       .3584587 .066303805  .06735055    .07819462   -.13390084    -.1882487 5 .28026405  .20020464  .25559926   .062581316   .16574493    .19747338 7
                       .4973337 .010944987   .0363991   .011041254   .008232568    .02783055 7  .4862924 .002712418 .008568549  .0014169216  .002857846    .00900457 7
                       .4164328 .071403496  .05951921   .021962494   .000625737 -.0031202734 6  .3944703  .07077776  .06263948   -.00249511  .010036148   .015350826 6
                       .3681691  .09105712  .06312367  .0031074584   .002471626    .00737156 7  .3650616   .0885855  .05575211  .0031843185 .0007379279   .002466265 7
                      .27776712  .04930269  .03829532   .017648071   .003913354   .002887439 7 .26011905  .04538933 .035407882   .012068614  .028790567    .02322081 7
                      .20201075     .06323  .05033149   .014379442   .023447413    .01460695 6  .1876313  .03978259  .03572454   .002597973   .01918491   .018459603 7
                        .299287 .031778753  .05683815   .002704382 -.0008632466 -.0017444976 5 .29658264    .032642  .05858264    .01883432 .0010719001   .005911987 7
                       .6482201   .2901905   .1665884    .02017635    .05147049   .023926437 7  .6280437     .23872  .14266196    .04210818   .11357385   .022204153 7
                      .26834816  .07906239  .04675502    .02506687   .037036292    .02273082 7  .2432813   .0420261   .0240242  .0014488846 -.012441628  -.006890574 5
                       .2318032  .02992645  .01114785     -.592749  -.011959743  -.004084952 4  .8245522   .0418862 .015232803    .10966456   .04300979    .01559113 7
                      .19712304  .06055265  .03386478    .03227231  -.073177196   -.05776659 5 .16485073  .13372985  .09163138    .04978259   .04321771  -.008951694 5
                      .28709164 .028984766  .06513888 -.0021636784  .0016557947   .005086634 6 .28925532  .02732897  .06005224  .0019952357 .0008852072 -.0002425164 6
                      .28196806  .04972673  .06384952  -.005801648   .011067376    .02095928 6  .2877697  .03865935  .04289024  -.010281146 .0012454167   .009565227 6
                       .6806391  .29044408  .19116354    .14477515    .14271587     .0899191 7 .53586394   .1477282  .10124444     .3708187 -.003627807   -.02264562 5
                      .29634243   .0972459  .10023494   .037071675    .03447444   .036894172 7 .25927076 .062771454  .06334077   .006224334  .011116773    .01417428 7
                      .14721547  .04974679  .11807605    .01414621   .008793823   .016733378 7 .13306926  .04095297  .10134267   .007163182  .007739991    .01330573 7
                       .2819731  .08298635  .10753679    .05297916    .04117188    .05203638 7 .22899394  .04181447   .0555004    .00582628  .010245468   .018092234 7
                      end

                      Comment


                      • #12
                        Thank you for showing the expanded data. But I do not understand how you want the result to look. I don't think it can be explained in words. Please mock up a table and show it so I can see the layout you are looking for.

                        Comment


                        • #13
                          Originally posted by Marcos Almeida View Post

                          Hello Tony,

                          I just copied and pasted your toy example, hence you may copy and paste mine and it shall work accordingly.

                          Shall you wish to tyype, please make sure you typed verbatim, for example, " ` " and " ' " are rather tricky to tell apart, at first glance.

                          Edited: by the way, please make sure you didn't copy and paste "2.", "3," and "4.", for Stata will do that for you.
                          Hi Macros,
                          I see, thanks for very much for detail reply. I got these the average data. but how to tabulate in a table in this way. should I use collapse?

                          by the way, how to solve this problem in two years. I just posted this question under Mr Clyde's post. if you have some idea about this question, please share with me. greatly appreciated it.

                          Have a great weekend.

                          Best,
                          Tonni

                          Comment


                          • #14
                            Originally posted by Clyde Schechter View Post
                            Thank you for showing the expanded data. But I do not understand how you want the result to look. I don't think it can be explained in words. Please mock up a table and show it so I can see the layout you are looking for.
                            Hi Clyde,
                            thanks very much for your quick reply. I ALSO put it in the excel attachment to make it clear.

                            Best,
                            Tonni
                            Question 1 is like this. You've already help me solve this.
                            Table A
                            GM2005 PM2005 ROA2005 F_2005
                            2 2 2 6
                            3 1 1 7
                            4 2 1 6
                            4 4 4 7
                            Then I get the answer from you:
                            F_2005 Numbers Mean(GM2005) Mean(PM2005) Mean(ROA2005)
                            7 2 3.5 2.5 2.5
                            6 2 3 2 1.5
                            My new question is :
                            Table B
                            Mean(GM2005) Mean(PM2005) Mean(ROA2005) F_2005 Mean(PM2006) Mean(ROA2005) Mean(ROA) F_2006
                            2 2 2 6 2 2 2 6
                            3 1 1 7 3 1 1 7
                            4 2 1 6 4 2 1 6
                            4 4 4 7 4 4 4 7
                            I hope the get the results like this
                            F_2005/2006
                            (Probably I need a new variable like Fscore)
                            Numbers MeanGM=Mean(GM2005)+Mean(GM2006) MeanPM=Mean(PM2005)+Mean(PM2006) MeanPM=Mean(ROA2005)+Mean(ROA2006)
                            7 4 4.25 =1/2*((D21+D45)/2+(H21+H23))
                            6 4
                            Attached Files

                            Comment


                            • #15
                              What you're asking for isn't possible. In your example data, sometimes the F_2005 and F_2006 scores are different, whereas your proposed result contains a single value that is supposed to represent both variables. In order to do something like this, you have to create a new variable, and it must be created in a way that somehow resolves discrepancies between F_2005 and F_2006 when the two are different.

                              Comment

                              Working...
                              X