Question

I have a .csv file with data like this:

         RI    Na   Mg   Al    Si    K    Ca   Ba   Fe Type
1   1.51793 12.79 3.50 1.12 73.03 0.64  8.77 0.00 0.00  BWF
2   1.51643 12.16 3.52 1.35 72.89 0.57  8.53 0.00 0.00  VWF
3   1.51793 13.21 3.48 1.41 72.64 0.59  8.43 0.00 0.00  BWF
4   1.51299 14.40 1.74 1.54 74.55 0.00  7.59 0.00 0.00  TBL
5   1.53393 12.30 0.00 1.00 70.16 0.12 16.19 0.00 0.24 BWNF
6   1.51655 12.75 2.85 1.44 73.27 0.57  8.79 0.11 0.22 BWNF

I want to create histograms for the distribution of each of the columns. I've tried this:

data<-read.csv("glass.csv")
names<-(attributes(data)$names)
for(name in names)
{
    dev.new()
    hist(data$name)
}

But I keep getting this error: Error in hist.default(data$name) : 'x' must be numeric

I'm assuming that this error is because attributes(data)$names returns a set of strings, "RI" "Na" "Mg" "Al" "Si" "K" "Ca" "Ba" "Fe" "Type"

But I'm unable to convert them to the necessary format.

Any help is appreciated!


Solution

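You were close. Two things go wrong: data$name looks for a column literally called "name" rather than using the value of the loop variable, and the loop also reaches Type at the end, which is not numeric.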

data<-read.csv("glass.csv")
# names<-(attributes(data)$names)
names<-names(data)
classes<-sapply(data,class)

for(name in names[classes == 'numeric'])
{
    dev.new()
    hist(data[,name]) # subset with [] not $
}

You could also just loop through the columns directly:

for (column in data[classes == 'numeric']) {
    dev.new()
    hist(column)
}
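
As a variant of the loops above, here is a minimal sketch that picks columns with is.numeric(), which is also TRUE for integer columns that a class == 'numeric' test would skip, and lays the histograms out on a single device with par(mfrow) instead of calling dev.new() each time. The glass data frame is a small stand-in built from the first rows in the question, not the real file.

```r
# Stand-in data frame; in the question this would come from read.csv("glass.csv")
glass <- data.frame(
  RI   = c(1.51793, 1.51643, 1.51793, 1.51299),
  Na   = c(12.79, 12.16, 13.21, 14.40),
  Mg   = c(3.50, 3.52, 3.48, 1.74),
  Type = c("BWF", "VWF", "BWF", "TBL")
)

# is.numeric() is TRUE for both double and integer columns
numeric_cols <- names(glass)[sapply(glass, is.numeric)]

# One row of panels on a single device, one histogram per numeric column
old_par <- par(mfrow = c(1, length(numeric_cols)))
for (col in numeric_cols) {
  hist(glass[[col]], main = col, xlab = col)
}
par(old_par)  # restore the previous layout
```

With the full data set you would probably want a grid such as par(mfrow = c(3, 3)) rather than a single row.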

But ggplot2 is designed for multiple plots. Try it like this:

library(ggplot2)
library(reshape2)
ggplot(melt(data),aes(x=value)) + geom_histogram() + facet_wrap(~variable)

Other tips

Rather than drawing lots of histograms, a better solution is to draw one plot with histograms in panels.

For this, you'll need the reshape2 and ggplot2 packages.

library(reshape2)
library(ggplot2)

First, you'll need to convert your data from wide to long form.

long_data <- melt(data, id.vars = "Type", variable.name = "Element")
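
To get a feel for what "long form" means without installing anything, here is a rough sketch using base R's stack() in place of melt. It is a different function, shown only for illustration: it produces columns called values and ind rather than value and Element, and it drops Type. The glass data frame is a small stand-in built from the first rows in the question.

```r
# Stand-in for the wide data: one column per measurement
glass <- data.frame(
  RI   = c(1.51793, 1.51643, 1.51793, 1.51299),
  Na   = c(12.79, 12.16, 13.21, 14.40),
  Mg   = c(3.50, 3.52, 3.48, 1.74),
  Type = c("BWF", "VWF", "BWF", "TBL")
)

# stack() puts all numeric values in one column ("values") and records
# which original column each came from in a factor ("ind")
long_glass <- stack(glass[c("RI", "Na", "Mg")])

head(long_glass)
nrow(long_glass)  # 4 rows x 3 stacked columns
```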

Then plot the value column (you can rename it by passing value.name = "whatever" in the call to melt above) with a histogram in each panel, one panel per element.

(histograms <- ggplot(long_data, aes(value)) +
  geom_histogram() +
  facet_wrap(~ Element)
)

hist(data$name) looks for a column named name, which isn't there. Use hist(data[,name]) instead.
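
A small sketch of the difference, using a stand-in data frame built from the first rows in the question (glass is a hypothetical name for it): $ takes the text after it literally, while [[ ]] and [ , ] evaluate the variable first.

```r
# Stand-in for the data read from glass.csv
glass <- data.frame(
  RI   = c(1.51793, 1.51643, 1.51793),
  Na   = c(12.79, 12.16, 13.21),
  Type = c("BWF", "VWF", "BWF")
)

name <- "Na"

# `$` looks for a column literally called "name"; there is none, so the
# result is NULL, which is what hist() then rejects as non-numeric
is.null(glass$name)

# `[[` and `[ , ]` use the value stored in `name`, so both return the Na column
glass[[name]]
glass[, name]
```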

License: CC-BY-SA with attribution
Not affiliated with StackOverflow