export delimited csv format

martin wall

Join Date: Jul 2014

Posts: 7
#1

export delimited csv format

18 Nov 2014, 20:55

I am exporting data into a csv file with the command "export delimited variables using test.csv"
For some reason some of the variables are appearing in the csv file with 14 decimal places.

An example is var1 which has a value (when I list it) of 529.2 in the 9th observation

I have rounded it to 1 decimal place using the round function. However in the .csv file it appears as 529.2000000000001
This doesn't happen again with this variable until the 16th and the the 33rd and some subsequent observations but it does occur with other variables. About every 3rd or 4th observation has at least one variable that is written in the .csv file with 13 '0's and a '1' trailing the 'true' value even though none display that way in stata or in the data window. The pattern seems random.

This doesn't effect how the data are read into the subsequent program but I would like to know how I can control the formats in the export file. All the variables are stored as double and displayed as %12.0g or %9.0f
Martin
Tags: None
ben earnhart

Join Date: May 2014

Posts: 1027
#2

18 Nov 2014, 21:01

Try storing them as int before exporting?
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35467

19 Nov 2014, 03:04

However they are presented, all numbers are held within Stata as binary numbers; so decimals can only be held exactly if they have binary equivalents that can be held exactly within the storage type being used. That's a main message of numerous discussions of precision, for which

Code:

search precision

provides leading examples. It follows that round(whatever,0.1) will work as you want if and only if the numbers concerned round to integers or half-integers. Here is a demonstration:

Code:

. clear 

. set obs 100
obs was 0, now 100

. set seed 2803

. gen y = round(runiform(), 0.1)

. clonevar Y = y

. format y %21x

. groups y Y

  +----------------------------------------------+
  |                     y    Y   Freq.   Percent |
  |----------------------------------------------|
  | +0.0000000000000X-3ff    0       6      6.00 |
  | +1.99999a0000000X-004   .1      13     13.00 |
  | +1.99999a0000000X-003   .2       9      9.00 |
  | +1.3333340000000X-002   .3       9      9.00 |
  | +1.99999a0000000X-002   .4      13     13.00 |
  |----------------------------------------------|
  | +1.0000000000000X-001   .5      11     11.00 |
  | +1.3333340000000X-001   .6       8      8.00 |
  | +1.6666660000000X-001   .7       6      6.00 |
  | +1.99999a0000000X-001   .8       7      7.00 |
  | +1.cccccc0000000X-001   .9      15     15.00 |
  |----------------------------------------------|
  | +1.0000000000000X+000    1       3      3.00 |
  +----------------------------------------------+

Converting to strings using a %2.1f display format is another possibility.

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 29967
#4

19 Nov 2014, 08:22

While Nick is, as ever, right and crystal clear about the precision issue underlying Martin's problem, it doesn't get to the question Martin has raised, which is how to control the formatting of the results exported to the .csv file. -export delimited- does not have any options that allow the user to force it to honor the display formats in the data set. The only way I know to do this is to convert the variables to strings (which, in a situation like this, will require the -force- option of -tostring-) and do the export that way. See the following examples:

Code:

. clear* . set obs 3 obs was 0, now 3 . gen x = _n/3 . . list +----------+ | x | |----------| 1. | .3333333 | 2. | .6666667 | 3. | 1 | +----------+ . . export delimited using unformatted.csv, replace file unformatted.csv saved . . type unformatted.csv, showtabs x .33333334 .66666669 1 . . format x %3.2f . . export delimited using formatted.csv, replace file formatted.csv saved . . type formatted.csv, showtabs x .33333334 .66666669 1 . . tostring x, replace usedisplayformat force x was float now str4 x was forced to string; some loss of information . . export delimited using as_strings.csv, replace file as_strings.csv saved . . type as_strings.csv, showtabs x 0.33 0.67 1.00

Actually, a -format- option on -export delimited- and also on -export excel- is part of my (as yet unposted) wish list for Stata 14.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35467
#5

19 Nov 2014, 08:46

That's correct. As usual, changing a display format changes only how something is displayed and has precisely zero effect on the value stored.
Comment

Announcement