Question

I have a data file with this format:

Weight    Industry Type  
251,787   Kellogg  h  
253,9601  Kellogg  a  
256,0758  Kellogg  h  
....

I read the data and try to draw a histogram with these commands:

 ce <- read.table("file.txt", header = TRUE)

 we = ce[,1]
 ind = ce[,2]   # renamed from "in", which is a reserved word in R
 ty = ce[,3]

 hist(we)

But I get this error:

Error in hist.default(we) : 'x' must be numeric

What do I need to do to draw histograms for my three variables?

Solution

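The commas keep read.table from parsing Weight as a number, so the column is read as non-numeric (character, or a factor in R before 4.0). If the comma is a decimal mark, as a value like 253,9601 suggests, swap it for "." with sub() instead. For a thousands separator, strip it and convert: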

 we <- gsub(",", "", we)   # remove comma
 we <- as.numeric(we)      # turn into numbers

and now you can do

 hist(we)

and other numeric operations.
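
As a rough, self-contained sketch of the conversion, the snippet below reads the three sample rows from the question through read.table's text argument instead of file.txt. The names txt, w_thousands and w_decimal are made up for illustration, and which of the two readings is right depends on what the commas mean in your file.

```r
# Inline copy of the sample rows from the question (stand-in for file.txt)
txt <- "Weight Industry Type
251,787 Kellogg h
253,9601 Kellogg a
256,0758 Kellogg h"

ce <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
str(ce$Weight)   # a character vector, which is why hist() rejects it

# Reading 1: the comma is a thousands separator, so drop it
w_thousands <- as.numeric(gsub(",", "", ce$Weight))

# Reading 2: the comma is a decimal mark, so swap it for a period
w_decimal <- as.numeric(sub(",", ".", ce$Weight, fixed = TRUE))

hist(w_decimal)  # either converted vector is numeric and can be plotted
```

Printing both vectors side by side is a quick way to check which reading gives plausible weights before plotting.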

OTHER TIPS

Note that you can also plot directly from ce by column name, provided the Weight column in ce itself has been converted to numeric:

hist(ce$Weight)

(By contrast, hist(ce[1]) fails with the same "must be numeric" error, because ce[1] is a one-column data frame rather than a numeric vector.)

The same applies to a data frame returned by a database query.
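
To make the difference between the indexing forms concrete, here is a small sketch on a hand-built data frame that mimics the question's columns, with Weight already numeric. The variable one_col_fails is just an illustrative name.

```r
# Hand-built stand-in for the data frame in the question
ce <- data.frame(Weight   = c(251.787, 253.9601, 256.0758),
                 Industry = "Kellogg",
                 Type     = c("h", "a", "h"))

class(ce[1])       # "data.frame": a one-column data frame
class(ce[[1]])     # "numeric": the column itself
class(ce$Weight)   # "numeric": the same column, selected by name
class(ce[, 1])     # "numeric": a single column is dropped to a vector

# hist() on the one-column data frame raises the "must be numeric" error
one_col_fails <- inherits(try(hist(ce[1], plot = FALSE), silent = TRUE),
                          "try-error")
one_col_fails      # TRUE
```

So ce$Weight, ce[[1]] and ce[, 1] are all fine to pass to hist(), while ce[1] is not.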

If the comma is a decimal mark, tell read.table so with the dec argument; Weight is then read as numeric directly:

 ce <- read.table("file.txt", header = TRUE, dec = ",")
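
A minimal sketch of that approach, again using the question's sample rows inline instead of file.txt. It also touches on the other two columns: Industry and Type are categorical, so hist() does not apply to them, and counting the categories with table() and drawing a bar chart is one common substitute. That last part is my own suggestion rather than something from the answers above, and txt and type_counts are illustrative names.

```r
# Inline copy of the sample rows from the question (stand-in for file.txt)
txt <- "Weight Industry Type
251,787 Kellogg h
253,9601 Kellogg a
256,0758 Kellogg h"

# dec = "," makes read.table treat the comma as the decimal mark
ce <- read.table(text = txt, header = TRUE, dec = ",")
is.numeric(ce$Weight)   # TRUE
hist(ce$Weight)

# Categorical columns: count each level, then draw the counts as bars
type_counts <- table(ce$Type)
barplot(type_counts)
```

The same table() plus barplot() pattern would work for ce$Industry.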
Licensed under: CC-BY-SA with attribution