Question

Since upgrading from R 3.0.3 to 3.1.0 I have had trouble with read.csv, as something seems to have changed in the underlying behaviour of read.table.

More precisely, I have a lot of CSV files that were once written using numpy. In general, these CSV files contain nothing more than a few columns of real values, e.g.:

foo,bar,baz
 1.162372390042962556e+00, 2.578863142444774326e+00, 9.740731078696458098e+02
-1.162361054912456337e+00, 6.006949912541799108e-01, 9.740731078696458098e+02
 1.327779088525234963e+00, 2.448484270423362030e+00, 9.664414899055957449e+02

Up to R 3.0.3, everything worked just fine when reading these files. Now I get this:

> tmp <- read.csv("foo.csv")
> str(tmp)
'data.frame':   3 obs. of  3 variables:
 $ foo: Factor w/ 3 levels " 1.162372390042962556e+00",..: 1 3 2
 $ bar: Factor w/ 3 levels " 2.448484270423362030e+00",..: 2 3 1
 $ baz: Factor w/ 2 levels " 9.664414899055957449e+02",..: 2 2 1

Will I have to change all of my codebase? Or is this merely a bug in 3.1.0?


Solution 2

The NEWS file explains a change to the default behaviour for unrepresentable decimal numbers:

type.convert() (and hence by default read.table()) returns a character vector or factor when representing a numeric input as a double would lose accuracy. Similarly for complex inputs.

If a file contains numeric data with unrepresentable numbers of decimal places that are intended to be read as numeric, specify colClasses in read.table() to be "numeric".

Your numbers have 18 decimal places (19 significant digits), but a double can accurately represent only about 15 significant digits.
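As a minimal sketch of the colClasses fix described in the NEWS entry, the following writes the sample rows from the question to a temporary file (the temporary file stands in for foo.csv and is an assumption of this example) and reads them back with every column forced to numeric:

```r
# Sample data copied from the question, written to a temporary file
# (a stand-in for the real foo.csv).
csv_lines <- c(
  "foo,bar,baz",
  " 1.162372390042962556e+00, 2.578863142444774326e+00, 9.740731078696458098e+02",
  "-1.162361054912456337e+00, 6.006949912541799108e-01, 9.740731078696458098e+02",
  " 1.327779088525234963e+00, 2.448484270423362030e+00, 9.664414899055957449e+02"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# colClasses = "numeric" is recycled across all columns, so each one is
# read as a double instead of going through type.convert().
tmp <- read.csv(path, colClasses = "numeric")
str(tmp)
```

The values are still rounded to the nearest double, exactly as in R 3.0.3; the only difference is that the rounding is now requested explicitly.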

Other suggestions

This is not a bug and yes, you have to change your code.

From the CRAN website:

type.convert() (and hence by default read.table()) returns a character vector or factor when representing a numeric input as a double would lose accuracy. Similarly for complex inputs.

If a file contains numeric data with unrepresentable numbers of decimal places that are intended to be read as numeric, specify colClasses in read.table() to be "numeric".
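If many scripts read such files, one way to keep the change in a single place is a thin wrapper around read.csv. This is only a sketch: read_numeric_csv is a hypothetical helper name, not part of R, and it assumes every column in the file is meant to be numeric.

```r
# Hypothetical helper: read a CSV in which every column holds real values.
# Extra arguments are passed through to read.csv(); do not pass colClasses
# again, since the wrapper already sets it.
read_numeric_csv <- function(file, ...) {
  read.csv(file, colClasses = "numeric", ...)
}

# Usage with a small temporary file standing in for a real data file.
path <- tempfile(fileext = ".csv")
writeLines(c("foo,bar",
             " 1.162372390042962556e+00, 2.578863142444774326e+00"),
           path)
df <- read_numeric_csv(path)
str(df)
```

Existing calls to read.csv for these files could then be replaced by the wrapper, rather than adding colClasses at every call site.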

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow