  • Strings losing precision in STATA

    Hi all,

    I am working with a large data set with more than 6,000 observations. I am having difficulty working out why a string variable is losing precision. The variable of interest (v2) is a string because its entries mix numbers with special characters.

    Issue 1: when I import the "X_ar" file, variable v2 loses precision. I want to import v2 exactly as it is in the Excel file X_ar.
    Issue 2: when I export the imported data set to Excel, v2 again comes out without full precision. I want v2 in the exported Excel file "X" to be the same as in file X_ar. I think that if issue 1 is solved, issue 2 will most probably be resolved automatically.

    Please see the Excel file and do-file attached.

    Note: I am using STATA 12
    Last edited by Abbas Raza; 12 Jun 2016, 16:57.

  • #2
    I am not sure what you mean by precision, but when I follow your commands I get the outcome below. The contents are exactly the same. What changes is how the observations are aligned. In Excel you can left-align, right-align, or center observations. There are ways to do that in Stata too, but importing from Excel doesn't transfer the alignment information to Stata. You can use spaces instead of alignment in Excel to recreate the same look in Stata after importing.

    [Image: Capture.PNG]

    • #3
      Thanks Cyrus.


      I have attached a screenshot of how the data look in my STATA Data Editor.


      I also tried the command mentioned below to increase the size of the string, but it is not working. So maybe the issue is that I am not able to increase the size of the string variable v2. Could you please check why I am not able to increase the size of the string variable, or guide me further if I am missing something?

      [Image: Capture.JPG]


      I changed the size of the string from 23 to 29 characters:
      format v2 %29s

      Alignment is definitely not a concern.

      Thanks,
      Abbas
      Last edited by Abbas Raza; 12 Jun 2016, 18:29.

      • #4
        Mine looks different. Yours looks more like it was imported as a numeric variable in scientific notation and then converted to a string. You can try this with your file:

        Code:
        import excel X_ar.xls, sheet("GVI") firstrow allstring clear
        export excel using test.xls, firstrow(variables) replace

        [Image: Capture2.PNG]

        Last edited by Cyrus Levy; 12 Jun 2016, 19:32.

        • #5
          Thanks, Cyrus, for your generous help. However, it's not working; I still have the same issue. Can I assume there is a bug in my STATA?

          FYI: you mistakenly typed a comma after X_ar.xls in your code :-)

          • #6
            This looks to me like a simple problem with your display format; see "h format".

            Please read the FAQ so you can see why you should not be uploading Excel (or any other binary) data files.

            • #7
              You are right about the comma: there shouldn't be a comma before the clear option. The one right after the file name is fine.

              Can you go to:

              File -> Import -> Excel Spreadsheet -> Browse yourfile

              and then take a picture and post it here? Something like this:

              [Image: Capture3.PNG]


              • #8
                Abbas Raza:

                Please do study the FAQ Advice all posters are asked to read before posting, especially http://www.statalist.org/forums/help#stata and incidentally http://www.statalist.org/forums/help#spelling

                Your .do file, for example, would have been better and more simply presented as CODE:

                Code:
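                * Import the sheet with every column read as a string, then save it
                foreach i in GVI {
                    clear
                    import excel using "X_ar.xls", sheet("`i'") firstrow allstring

                    saveold "`i'.dta", replace
                }

                * Write the data back out to Excel with variable names in the first row
                export excel using "X.xls", firstrow(variables) replace nolabel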
                That would save people clicking on an extension, opening it on their machine and then comparing it with your problem.

                Note that

                Code:
                format v2 %29s
                just changes the display format and doesn't do anything to change the storage type of the variable.
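
                The difference between display format and storage type can be seen in a small sketch. It is not taken from the attached files: the data are made up, and only the variable name v2 is borrowed from the thread. describe reports the storage type and the display format separately, and recast is the command that changes the storage type.

```stata
* Toy data; the value below is made up for illustration.
clear
set obs 1
generate str29 v2 = "12345678901234567890-ab/cd"

* Storage type is str29; the display format is reported alongside it.
describe v2

* Changing the display format affects only how v2 is shown.
format v2 %10s
describe v2

* The stored value is still complete.
display v2[1]

* recast, not format, changes the storage type.
recast str40 v2
describe v2
```

                After format v2 %10s the storage type reported by describe is still str29, and display v2[1] prints the whole stored string.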

                I agree with Cyrus: at some point your variable was temporarily changed to numeric and then back again.
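
                A minimal sketch of how such a round trip can lose digits, using a made-up 23-digit value rather than the poster's data (all variable names here are hypothetical). A double holds only about 15 to 17 significant decimal digits, so a long digit string that passes through a numeric variable cannot be recovered exactly.

```stata
* Toy data; the 23-digit value is made up for illustration.
clear
set obs 1
generate str25 original = "12345678901234567890123"

* String -> numeric: the value is rounded to the nearest double.
generate double as_number = real(original)

* Numeric -> string with the default format, then with a wide fixed format.
generate str25 shown_default = string(as_number)
generate str25 shown_wide    = string(as_number, "%25.0f")

* Compare the three versions side by side.
list original shown_default shown_wide
```

                Without an explicit format the converted value typically appears in scientific notation, and even with a wide fixed format the trailing digits need not match the original, which is why importing the column as a string from the start matters.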

                • #9
                  Hi Cyrus,

                  Please find the picture attached. I am still facing the same problem. However, if I copy and paste my data into the Data Editor, the problem is solved completely. Could you explain any possible reason for this?

                  Thanks,
                  Abbas

                  • #10
                    Help from any other STATA experts on this matter is also appreciated.

                    Thanks,
                    Abbas

                    • #11
                      Stata is giving you a view of how it will treat the data. In column B (variable v2), you have non-numeric characters such as "o" and "-", and in some cases there is more than one value. You do not want such data imported as numeric. So, select "Import all data as strings" (as advised by Cyrus in #4) and then destring inside Stata.
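
                      The "strings first, destring afterwards" workflow might look like the sketch below. The data are typed in with input so that the example runs on its own; the values and the variable v1 are hypothetical and stand in for whatever import excel ... allstring would deliver. Only the column known to be purely numeric is converted, and v2 stays a string.

```stata
* Toy data standing in for an all-strings import; values are made up.
clear
input str4 v1 str25 v2
"10"  "1234-5678 o 99"
"20"  "98765432109876543210"
end

* Convert only the column known to hold plain numbers.
destring v1, replace

* v1 is now numeric; v2 is still a string with every character intact.
describe
list
```

                      Naming the variables to convert, rather than running destring on everything, keeps long identifiers made only of digits from being turned into numbers by accident.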

                      If your data are messed up, that is a separate issue which import cannot resolve. But none of us can advise on that. We don't know what is in column B and what would be good or bad data.

                      Please note the spelling Stata as already underlined to you in #8.
