
  • Scientific notation

    I have an individual level dataset with around 11 million observations. For each individual, I have data on individual's occupation (occ), hours worked in occupation (hours) and weeks worked (weeks).
    I create a variable which measures labour supply as labour_supply = weeks * hours
    Next, I find the total labour supply by occupation using the command: collapse (sum) labour_supply, by(occ)
    However, in doing so, I lose precision because Stata displays the labour supply by occupation in scientific notation. Moreover, the values are shown with varying levels of precision. For example, I get 1.41e+09 for occupation X and 5.02e+07 for occupation Y.
    Is there a way to get the exact values for this variable?

  • #2
    Please show your dataex example



    • #3
      If you have numbers on the order of 10^9, then you have possibly lost precision in your calculations by not using double or long as the storage type for labour_supply. Assigning an appropriate format will then display the results in their full precision.
      Code:
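      * store the product as double so large values keep full precision
      generate double labour_supply = weeks * hours
      * fixed format with commas and no decimals instead of scientific notation
      format %12.0fc labour_supply
      collapse (sum) labour_supply, by(occ)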
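
      A minimal sketch for checking the result after the collapse, assuming the variable names from #1. The width in %15.0fc is an assumption, chosen because a total such as 1,410,000,000 takes 13 characters once commas are included; if a total does not fit the format width, Stata may fall back to a general format and show scientific notation again.
      Code:
      * confirm the collapsed total is stored as double
      describe labour_supply
      * widen the display format (width 15 is an assumption; adjust to your largest total)
      format labour_supply %15.0fc
      * print the totals by occupation without observation numbers
      list occ labour_supply, clean noobs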



      • #4
        Another solution is to just divide by a billion. Your results will be exactly the same, just expressed in billions.

        So if your OLS coefficient is 5.8, a one-unit increase in x is associated with a 5.8 billion increase in the outcome variable, adjusting for other predictors. Then you don't have to deal with this problem.
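
        A minimal sketch of that rescaling, assuming it is run after labour_supply has been summed by occupation; labour_supply_bn is a hypothetical variable name and the %9.3f format is only one reasonable choice.
        Code:
        * express total labour supply in billions (labour_supply_bn is a hypothetical name)
        generate double labour_supply_bn = labour_supply / 1e9
        * three decimals, no scientific notation
        format labour_supply_bn %9.3f
        Rescaling changes only the units shown; it does not recover digits that were lost if labour_supply was first stored as a float.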

