
  • Import csv file

    Hi - I imported the attached csv file (publicly available from 1270.0.55.001 - Australian Statistical Geography Standard (ASGS): Volume 1 - Main Structure and Greater Capital City Statistical Areas, July 2016 (abs.gov.au)) using the following code:

    Code:
    import delimited "CED_2016_AUST.csv", clear
    I then generated another variable from the first numeric variable:


    Code:
    gen sa2= floor(sa1_maincode_2016/100)
    
    format sa1 sa2 %14.0f
    
    list sa1 sa2 in 1/5

    This gave the following strange results:

    Code:
         +-------------------------+
         | sa1_ma~2016         sa2 |
         |-------------------------|
      1. | 11901135961   119011360 |
      2. | 11904137903   119041376 |
      3. | 11903137228   119031376 |
      4. | 11901136026   119011360 |
      5. | 11903136932   119031368 |
         +-------------------------+
    Can someone please explain what the error in my approach is? I'm using Stata Version 17.0.

    Peter.

  • #2
    I get something different. Also using 17.
    Last edited by George Ford; 22 Mar 2023, 15:33.



    • #3
      Code:
                  
          sa1_ma~2016    sa2    
                  
      1.    11901135961    119011359    
      2.    11904137903    119041379    
      3.    11903137228    119031372    
      4.    11901136026    119011360    
      5.    11903136932    119031369



      • #4
        George - My understanding of the floor() function is that floor(11901135961/100) should be 119011359, not 119011360.
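A quick way to check this expectation, offered as a minimal sketch using only the numbers quoted in the thread: Stata evaluates expressions in double precision, so floor() itself returns the expected value. Wrapping the result in float(), which rounds its argument to float precision, reproduces the figure from the original listing, which suggests the storage type of the new variable rather than floor() is responsible.

```stata
* Expressions are evaluated in double precision, so floor() is exact here
display %14.0f floor(11901135961/100)
* displays 119011359

* float() rounds its argument to float (4-byte) precision,
* mimicking what happens when the result is stored in a float variable
display %14.0f float(floor(11901135961/100))
* displays 119011360
```

At this magnitude (between 2^26 and 2^27), consecutive float values are 8 apart, so 119011359 is stored as the nearest multiple of 8, which is 119011360.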



        • #5
          I was thinking of ceiling.

          When I run your code, it appears to be correct.



          • #6
            Did you try a reboot?



            • #7
              Thanks, George Ford. I have rebooted and installed updates. I also tried on another PC, with the same results. Very strange.



              • #8
                Originally posted by Peter Baade

                I then generated another variable from the first numeric variable


                Code:
                gen sa2= floor(sa1_maincode_2016/100)
                
                format sa1 sa2 %14.0f
                
                list sa1 sa2 in 1/5

                This gave the following strange results

                Code:
                     +-------------------------+
                     | sa1_ma~2016         sa2 |
                     |-------------------------|
                  1. | 11901135961   119011360 |
                  2. | 11904137903   119041376 |
                  3. | 11903137228   119031376 |
                  4. | 11901136026   119011360 |
                  5. | 11903136932   119031368 |
                     +-------------------------+
                Can someone please explain what the error in my approach is? I'm using Stata Version 17.0.

                Peter.
                Under

                Code:
                help precision
                you have

                "Floats can store up to 16,777,215 exactly. If you stored your data in pennies, that would
                correspond to $167,772.15."
                So even after the division, the values (about 119 million) are too large to be stored exactly as floats, and float is the default storage type for new variables. You need

                Code:
                gen double sa2= floor(sa1_maincode_2016/100)
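The difference between the two storage types can be reproduced without the CSV file. The sketch below types in the five sa1_maincode_2016 values from the listing above and creates the new variable twice; the names sa2_float and sa2_double are hypothetical and used only for illustration.

```stata
clear
* The five values from the original listing, held as double
input double sa1_maincode_2016
11901135961
11904137903
11903137228
11901136026
11903136932
end

* Stored as float: values are rounded to the nearest representable float
generate float sa2_float = floor(sa1_maincode_2016/100)

* Stored as double: values are held exactly
generate double sa2_double = floor(sa1_maincode_2016/100)

format sa1_maincode_2016 sa2_float sa2_double %14.0f
list, noobs

* Number of observations where float storage changed the value
count if sa2_float != sa2_double
```

With these five observations the count should be 4: only 119011360 in the fourth row is a multiple of 8 and so survives storage as a float unchanged.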



                • #9
                  Andrew Musau, thanks for your guidance. That solved the issue, and it is something for me to keep in mind.



                  • #10
                    You can use "set type double" to switch the default storage type for new variables from float to double.
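As a rough illustration of this suggestion, the sketch below changes the default storage type for the session, creates a variable without naming a type, and then restores the shipped default. The one-observation dataset and the variable name x are assumptions made so the example runs on its own.

```stata
* Show the current default storage type for new variables
display c(type)

* Make double the default for the rest of this session
set type double

clear
set obs 1
* No explicit type given, so x takes the session default (now double)
generate x = floor(11901135961/100)
display %14.0f x[1]
* displays 119011359
describe x

* Restore Stata's shipped default
set type float
```

Adding the permanently option (set type double, permanently) keeps the setting across sessions. A machine with that setting would explain why the same code gives different results for different users.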



                    • #11
                      Thanks, George Ford. That explains why we had different results initially. I appreciate the advice.
