Question

Hello, I am learning R and am currently working with the "iris" data set, which ships with R. I want to apply the "mean" function to only part of this data frame.

My question is not a complicated one; I am asking because I am still quite new to R.

The data I am using are:

library(datasets)
data(iris)

head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

What I want to do is apply the mean function to only one species, e.g. "setosa" in Species, and calculate it only for "Sepal.Length". This is how I did it (I actually apply it to the "virginica" species here, but only as an example):

virgin <- iris[101:150, 1]
virgin

and then

mean(virgin)

It gives me the correct mean, but this method relies on hard-coded row numbers, so it is probably not suitable when you don't want to search through the data frame manually.

So my question is how to do the same with other functions, such as apply or others I do not know about.

If you like, you can also suggest some sources where I could read more about this. It could be this site as well (I found only more advanced questions, though).

Thank you.


Solution

Your question is really about how to subset a data frame.

Here is one way:

mean(iris$Sepal.Length[iris$Species=="virginica"])
[1] 6.588
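
To see why this works, the same expression can be unpacked into steps. This is only a sketch; the variable names is_virginica and virginica_lengths are made up for illustration and do not appear in the question.

```r
# Logical vector: TRUE for every row whose Species is "virginica"
is_virginica <- iris$Species == "virginica"

# Number of matching rows, and the first and last matching row numbers
sum(is_virginica)            # 50
range(which(is_virginica))   # 101 150, the rows picked by hand in the question

# Keep only the matching Sepal.Length values, then average them
virginica_lengths <- iris$Sepal.Length[is_virginica]
mean(virginica_lengths)      # 6.588
```

The logical test selects the rows for you, so there is no need to know in advance that virginica occupies rows 101 to 150.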

You can rewrite this with less duplication by using the function with():

mean(with(iris, Sepal.Length[Species=="virginica"]))
[1] 6.588

And another way:

mean(with(iris, iris[Species=="virginica", "Sepal.Length"]))
[1] 6.588
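
Since you asked about apply-style functions: if you want the mean for every species at once rather than one at a time, base R's tapply() and aggregate() are two possible options. The following is a minimal sketch of that idea and not the only way to do it; the name species_means is made up for illustration.

```r
# Mean Sepal.Length for each level of Species, returned as a named vector
species_means <- tapply(iris$Sepal.Length, iris$Species, mean)
species_means

# Pick out a single group by name
species_means["virginica"]

# The same grouping through the formula interface, returned as a data frame
aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
```

tapply() is convenient when you only need the numbers; aggregate() is handy when you want the result as a data frame for further processing.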
Licensed under: CC-BY-SA with attribution