Question

I have a data frame called "test":

> test
     V1 V2
1 INS01  1
2 INS01  1
3 INS02  1
4 INS03  2
5 INS03  3
6 INS04  4
> class(test)
[1] "data.frame"

I want a count of the rows for each value of V1 ("INS01", "INS02", "INS03", "INS04"). I tried using by(), but it does not give the desired output:

> agg <- by(test, test$V1, function(x) length(x))
> agg
test$V1: INS01
[1] 2
------------------------------------------------------------ 
test$V1: INS02
[1] 2
------------------------------------------------------------ 
test$V1: INS03
[1] 2
------------------------------------------------------------ 
test$V1: INS04
[1] 2

I am stuck here. Any help is appreciated. Thanks.


Solution

Convert column V1 to a factor and use the default factor summary method, which returns frequencies.

> summary(as.factor(test$V1))
INS01 INS02 INS03 INS04
    2     1     2     1
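
The conversion to a factor matters when V1 is stored as a character vector. The sketch below rebuilds the question's data by hand (the explicit stringsAsFactors = FALSE is an assumption about how the column is stored, not something the question states) and compares the two summaries:

```r
# Rebuild the question's data frame; V1 is deliberately kept as character
test <- data.frame(
  V1 = c("INS01", "INS01", "INS02", "INS03", "INS03", "INS04"),
  V2 = c(1, 1, 1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# On a character vector, summary() only reports Length / Class / Mode
summary(test$V1)

# On a factor, summary() returns a named vector with one count per level
counts <- summary(as.factor(test$V1))
counts

# Individual counts can be looked up by name
counts[["INS03"]]
```

If V1 is already a factor, as.factor() leaves it unchanged, so the call is safe either way.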

Other tips

Use table()

First, let's build the test data frame (please include similar code in your future questions, see here):

zz <- textConnection("
V1 V2
1 INS01  1
2 INS01  1
3 INS02  1
4 INS03  2
5 INS03  3
6 INS04  4
")
Data <- read.table(zz)

And then:

> table(Data$V1)

INS01 INS02 INS03 INS04 
    2     1     2     1 
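
If the counts are needed as a data frame rather than a printed table, one possible follow-up is as.data.frame() on the table. This is a minimal sketch; the column name "count" and the object names freq and freq_df are illustrative choices, not part of the original answer:

```r
# Same data as in the question, built directly
test <- data.frame(
  V1 = c("INS01", "INS01", "INS02", "INS03", "INS03", "INS04"),
  V2 = c(1, 1, 1, 2, 3, 4)
)

# Naming the argument (V1 = ...) carries the name into the result
freq <- table(V1 = test$V1)

# One row per distinct value of V1, with the frequency in "count"
freq_df <- as.data.frame(freq, responseName = "count")
freq_df
```

The resulting data frame has one row per distinct V1 value, which makes it easy to merge back onto other data or to sort by count.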

Joris shows the way I would go about doing this, but I thought I would explain why your attempt gives the wrong result:

Calling length() on a data.frame returns the number of columns, not the number of rows (which is what you're actually after). Every subset that by() passes to your function still has the two columns V1 and V2, so the result is 2 for each group.

Example:

x <- data.frame(matrix(1:100, ncol = 25))
length(x)
# [1] 25

If you want to use by, use nrow instead:

by(test, test$V1, function(x) nrow(x))
# test$V1: INS01
# [1] 2
# --------------------------------------------------------------------------- 
# test$V1: INS02
# [1] 1
# --------------------------------------------------------------------------- 
# test$V1: INS03
# [1] 2
# --------------------------------------------------------------------------- 
# test$V1: INS04
# [1] 1
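
The printed output of by() is fairly verbose. As a rough alternative (a different technique from by(), shown here only as a sketch), split() followed by sapply() returns the same per-group numbers as a plain named vector, and also makes the length/nrow difference easy to see side by side:

```r
# Same data as in the question, built directly
test <- data.frame(
  V1 = c("INS01", "INS01", "INS02", "INS03", "INS03", "INS04"),
  V2 = c(1, 1, 1, 2, 3, 4)
)

# split() yields one data frame per value of V1
groups <- split(test, test$V1)

# length() of each piece is its column count, so always 2 here
cols_per_group <- sapply(groups, length)

# nrow() of each piece is the row count, which is the wanted answer
rows_per_group <- sapply(groups, nrow)

cols_per_group
rows_per_group
```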
License: CC-BY-SA with attribution
Not affiliated with StackOverflow