Question
I have a data frame named "test":
> test
V1 V2
1 INS01 1
2 INS01 1
3 INS02 1
4 INS03 2
5 INS03 3
6 INS04 4
> class(test)
[1] "data.frame"
I want a count of the rows for each of "INS01", "INS02", "INS03", and "INS04". I tried using by(), but it does not give the desired output:
> agg <- by(test, test$V1, function(x) length(x))
> agg
test$V1: INS01
[1] 2
------------------------------------------------------------
test$V1: INS02
[1] 2
------------------------------------------------------------
test$V1: INS03
[1] 2
------------------------------------------------------------
test$V1: INS04
[1] 2
I am stuck here. Any help is appreciated. Thanks.
Solution 2
Convert column V1 to a factor and use the default factor summary method, which returns frequencies.
> summary(as.factor(test$V1))
INS01 INS02 INS03 INS04
2 1 2 1
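Why the conversion matters can be sketched as follows. This assumes V1 is stored as a character vector (the default for data.frame() since R 4.0.0); the data frame is rebuilt from the question purely for illustration, and `counts` is a name introduced here.

```r
# Rebuild the question's data frame; V1 is kept as character (assumption).
test <- data.frame(
  V1 = c("INS01", "INS01", "INS02", "INS03", "INS03", "INS04"),
  V2 = c(1, 1, 1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# On a character vector, summary() only describes the vector itself
# (length, class, mode), not the frequency of each value.
summary(test$V1)

# On a factor, summary() returns one count per level.
counts <- summary(as.factor(test$V1))
counts
```

If V1 is already a factor, the as.factor() call is harmless and leaves it unchanged.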
Other tips
Use table()
First, let's build the test data frame (please include similar code in your future questions so that others can reproduce the problem):
zz <- textConnection("
V1 V2
1 INS01 1
2 INS01 1
3 INS02 1
4 INS03 2
5 INS03 3
6 INS04 4
")
Data <- read.table(zz)
And then:
> table(Data$V1)
INS01 INS02 INS03 INS04
2 1 2 1
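If the counts are needed as a data frame rather than a table object, one possible follow-up is as.data.frame(). This is a minimal sketch: the `text =` form of read.table() is used here instead of the text connection above, and the names `freq` and `n` are chosen for illustration only.

```r
# Recreate the example data (row numbers omitted for brevity).
Data <- read.table(text = "
V1 V2
INS01 1
INS01 1
INS02 1
INS03 2
INS03 3
INS04 4
", header = TRUE)

# table() counts each distinct value; as.data.frame() turns the result
# into one row per value, with default columns Var1 and Freq.
freq <- as.data.frame(table(Data$V1), stringsAsFactors = FALSE)
names(freq) <- c("V1", "n")   # rename the default columns (illustrative names)
freq
```

The result has one row per distinct value of V1, which is convenient for merging back onto other data.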
Joris shows the way I would go about doing this, but I thought I would explain why your attempt gives the wrong result:
Using length on a data.frame tells you how many columns the data.frame has, not how many rows (which is what you're actually after). Each subset that by passes to your function still has both columns, so every group reports 2.
Example:
x <- data.frame(matrix(1:100, ncol = 25))
length(x)
# [1] 25
If you want to use by, use nrow instead:
by(test, test$V1, function(x) nrow(x))
# test$V1: INS01
# [1] 2
# ---------------------------------------------------------------------------
# test$V1: INS02
# [1] 1
# ---------------------------------------------------------------------------
# test$V1: INS03
# [1] 2
# ---------------------------------------------------------------------------
# test$V1: INS04
# [1] 1
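The difference between length and nrow can be checked directly on the question's data. The following is a small sketch, with the data frame rebuilt by hand and `per_group` being a name introduced here; split() plus sapply() is just one alternative to by that returns a plain named vector.

```r
# Rebuild the question's data frame for a self-contained check.
test <- data.frame(
  V1 = c("INS01", "INS01", "INS02", "INS03", "INS03", "INS04"),
  V2 = c(1, 1, 1, 2, 3, 4)
)

length(test)  # number of columns
nrow(test)    # number of rows

# Split the data frame by V1 and count the rows in each piece.
per_group <- sapply(split(test, test$V1), nrow)
per_group
```

Here `per_group` carries the group labels as names, so individual counts can be looked up with, for example, per_group["INS03"].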