
  • Masoumeh Sanagou
    started a topic egen rowtotal VS. gen

    egen rowtotal VS. gen

    Hi Statalist,

    Could you please explain why the following commands don't give the same results:

    egen A1=rowtotal(B C D E)
    format A1 %15.0gc
    gen B1=B +C+ D +E
    format B1 %15.0gc

    A         B         C         D        E       A1        B1
    19357423  13487791  1468631   3562403  174477  18693300  18693302
    33187208  29843165  949       3341676  1418    33187206  33187208
    41899510  26769131  10932809  4114182  400853  42216972  42216976
    Kind regards,
    Last edited by Masoumeh Sanagou; 16 Dec 2018, 18:16.

  • William Lisowski
    Your problem is one of precision. You do not give us enough information to be able to reconstruct the problem exactly, but with some guesswork I've constructed the following example.
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(A B C D E)
    19357423 13487791  1468631 3562403 174477
    33187208 29843165      949 3341676   1418
    41899510 26769131 10932809 4114182 400853
    end
    egen A1 = rowtotal(B C D E)
    gen  B1 = B + C + D + E
    egen long A2 = rowtotal(B C D E)
    gen  long B2 = B + C + D + E
    format A1-B2 %15.0fc
    list     A1 B1 A2 B2 , clean noobs
    describe A1 B1 A2 B2
    . list     A1 B1 A2 B2 , clean noobs
                A1           B1           A2           B2  
        18,693,300   18,693,302   18,693,302   18,693,302  
        33,187,206   33,187,208   33,187,208   33,187,208  
        42,216,972   42,216,976   42,216,975   42,216,975  
    . describe A1 B1 A2 B2
                  storage   display    value
    variable name   type    format     label      variable label
    A1              float   %15.0fc               
    B1              float   %15.0fc               
    A2              long    %15.0fc               
    B2              long    %15.0fc
    I have assumed your variables A-E were stored as long variables. (See the output of help data types for an explanation of what this means if you are unfamiliar with numeric data types.)

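    Because you did not specify a data type for your variables A1 and B1, they were created as type float. A float cannot represent every integer above 16,777,216 (2^24) exactly, and your totals are larger than that, which led to the loss of precision, most visibly in the calculation of A1.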
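
    To see where the two float results could come from, here is a small sketch using Stata's float() function, which rounds a value to float precision. It rests on an assumption that is not stated above: that egen's rowtotal() adds one variable at a time into the new float variable, so the running total is rounded after every addition, whereas gen evaluates the whole expression in double precision and rounds only once when storing. The numbers are rows 1 and 3 of the example.

    ```stata
    * Row 1: B = 13487791, C = 1468631, D = 3562403, E = 174477
    * one rounding at the end, as assumed for -gen-
    display %15.0fc float(13487791 + 1468631 + 3562403 + 174477)
    * rounding after every addition, as assumed for -egen, rowtotal()-
    display %15.0fc float(float(float(float(13487791) + 1468631) + 3562403) + 174477)

    * Row 3: B = 26769131, C = 10932809, D = 4114182, E = 400853
    display %15.0fc float(26769131 + 10932809 + 4114182 + 400853)
    display %15.0fc float(float(float(float(26769131) + 10932809) + 4114182) + 400853)

    * every integer up to this value is exactly representable in a float
    display %15.0fc 2^24
    ```

    The displays should show 18,693,302 and 18,693,300 for row 1 and 42,216,976 and 42,216,972 for row 3. These match B1 and A1 in the listing above, so the output is consistent with the assumption, although it does not prove how egen works internally.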

    By specifying a long data type for A2 and B2, there was no loss of precision.
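
    If the totals might one day exceed the range of a long, or include non-integer values, type double is another option. The following is a minimal sketch, not part of the original example: the variable names A3 and B3 are made up for illustration, and the data are the same guessed values as above.

    ```stata
    clear
    input long(B C D E)
    13487791  1468631 3562403 174477
    29843165      949 3341676   1418
    26769131 10932809 4114182 400853
    end

    * double holds every integer up to 2^53 exactly, so nothing is rounded here
    egen double A3 = rowtotal(B C D E)
    gen  double B3 = B + C + D + E

    format A3 B3 %15.0fc
    list A3 B3, clean noobs

    * stops with an error if the two totals differ in any observation
    assert A3 == B3
    ```

    One difference remains whatever the storage type: rowtotal() treats missing values as zero, while the sum in gen is missing whenever any of B, C, D or E is missing.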

    With that said, even the best descriptions of data are no substitute for an actual example of the data. Perhaps my answer isn't correct, because your data may not be stored as type long, as I have had to assume.

    You will improve your future posts if you use the dataex command to show your example data. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
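
    As a rough illustration of what that looks like in practice, the sketch below uses the auto dataset that ships with Stata as a stand-in for your own data. The choice of variables and observations is arbitrary.

    ```stata
    * auto.dta is installed with Stata; it stands in here for your own data
    sysuse auto, clear

    * write the first five observations of three variables as an input block
    dataex make price mpg in 1/5
    ```

    The output in the Results window can be copied as is into a post, and it records the storage type of every variable.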

    When asking for help with code, always show example data, as you did, but also, when showing example data, always use dataex.

    And if you haven't yet done so, please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.
