  • egen (sum) is not summing up absolute figures but counts a number of observations

    Greetings All,
    I have a problem in Stata that should be straightforward to resolve, but nothing I have tried so far (including egen with sum() and the collapse command) seems to work. I have firm-level panel data for one country only, and I am trying to calculate total employment over 5-year periods. As a first step, I use egen to sum employment annually: egen yr_employment=sum(employment), by(year). However, instead of total annual employment, I get a count of the firm observations on employment. I have tried every solution I could think of, and nothing has worked so far. 'Employment' has the data storage format 'long'. I read somewhere in a Stata forum discussion that a 'long' format can sometimes cause issues because the variable is wrongly treated as a string even though it is numeric. I generated a new variable with a float format, but this did not resolve the problem, and I am not sure whether the 'long' storage type is the cause in the first place. Any suggestions on how I could resolve this would be highly appreciated.

  • #2
    Please show a data example that backs up the report here. The option by(year) would add over firms, not over years.

    long is a variable or storage type, not a display format.

    This works for me:


    Code:
    . clear
    
    . set obs 10
    
    . gen year = cond(_n <= 5, 2010 + _n, 2010 + _n - 5)
    
    . gen firm = cond(_n <= 5, 1, 2)
    
    . gen long employment = 1e6 + _n
    
    . egen check = sum(employment), by(firm)
    
    . l
    
         +----------------------------------+
         | year   firm   employ~t     check |
         |----------------------------------|
      1. | 2011      1    1000001   5000015 |
      2. | 2012      1    1000002   5000015 |
      3. | 2013      1    1000003   5000015 |
      4. | 2014      1    1000004   5000015 |
      5. | 2015      1    1000005   5000015 |
         |----------------------------------|
      6. | 2011      2    1000006   5000040 |
      7. | 2012      2    1000007   5000040 |
      8. | 2013      2    1000008   5000040 |
      9. | 2014      2    1000009   5000040 |
     10. | 2015      2    1000010   5000040 |
         +----------------------------------+

    Note that the egen function sum() went undocumented in Stata 9. It still works, but total() has been the documented name since 2005.
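
    For the original goal (annual totals, then 5-year totals), a minimal sketch with total() might look like the following. The toy data, the variable names period and period_employment, and the choice of 2011 as the start of the first 5-year bin are assumptions for illustration only, not taken from the poster's dataset. Asking egen for a double is a precaution: a float holds integers exactly only up to 16,777,216, so large totals can be rounded.

    ```stata
    * Toy panel (made-up data): 2 firms observed 2011-2020
    clear
    set obs 20
    gen firm = cond(_n <= 10, 1, 2)
    gen year = 2010 + cond(_n <= 10, _n, _n - 10)
    gen long employment = 1e6 + _n

    * Total over firms within each year
    egen double yr_employment = total(employment), by(year)

    * Hypothetical 5-year bins: 2011-2015 -> 2011, 2016-2020 -> 2016
    gen period = 2011 + 5 * floor((year - 2011) / 5)

    * Total over firms and years within each 5-year period
    egen double period_employment = total(employment), by(period)

    list firm year period employment yr_employment period_employment, sepby(firm)
    ```

    If one observation per period is wanted rather than a repeated total on every row, collapse (sum) employment, by(period) should give the same totals, at the cost of replacing the dataset in memory.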



    • #3
      Dear Nick, many thanks for your prompt response, which is very useful. Everything works fine now. I am not sure what the issue was (perhaps a glitch), as everything worked perfectly this morning with both 'sum' and 'total', and without generating a new 'storage type' variable to replace the one with the 'long' format. Thank you once again.
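
      For anyone arriving with the same doubt about 'long', a rough sketch of how one might check that a variable is numeric, and see its storage type and display format separately, is given below. The toy data are made up; only the variable name employment comes from the thread.

      ```stata
      * Toy data standing in for the real panel (made up)
      clear
      set obs 3
      gen long employment = 1e6 + _n

      * Exits with an error if employment is a string variable
      confirm numeric variable employment

      * Reports the storage type (long) and the display format side by side
      describe employment

      * Changing the display format does not change the stored values
      format employment %12.0fc
      list employment
      ```

      If confirm did report a string variable, destring would be the usual route to a numeric copy; a numeric variable stored as long needs no conversion before egen.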
