Question

The following is the head of my data:

dput(head(trucksv[,c(1,5)]))
structure(list(Measur. = c(1L, 2L, 3L, 4L, 5L, 1L), Speed.Mean.Trucks = c(NA, 
NA, 9.5, 4.5, NA, NA)), .Names = c("Measur.", "Speed.Mean.Trucks"
), row.names = c(1L, 2L, 3L, 4L, 5L, 17L), class = "data.frame")

I want to find the cumulative distribution of speeds by 'Measur.', for which I used the following function:

f <- function(x) {
  hi <- hist(x)
  speedmph=round(hi$breaks*0.68,1)
  prob=c(0, round(cumsum(hi$counts)/sum(hi$counts),digits=2))
  cbind(speedmph, prob)
}

But when I try to apply it to my data, R gives me the following error:

tspdistu <- ddply(trucksv, 'Measur.', summarise, trucksspeedmph = f(Speed.Mean.Trucks)) 
Error in hist.default(x) : invalid number of 'breaks'
Called from: top level 
Browse[1]> 

I am not sure how to find the correct number of bins. Please help. Thanks in advance.


Solution

The NAs are throwing it off; the error has nothing to do with the number of bins. When a 'Measur.' group contains only NA values, hist() has no data from which to choose breaks and fails. Here's a slightly modified f() that disables plotting in hist() (it's unlikely you want plots) and handles a column subset that is all NAs:

f <- function(x) {
  # drop missing values before binning
  y <- x[!is.na(x)]
  if (length(y) > 0) {
    # plot=FALSE computes the bins without drawing anything
    hi <- hist(y, plot=FALSE)
    speedmph <- round(hi$breaks*0.68,1)
    prob <- c(0, round(cumsum(hi$counts) / sum(hi$counts), digits=2))
    cbind(speedmph, prob)
  } else { # still need to return proper sized values
    cbind(rep(NA, length(x)), rep(NA, length(x)))
  }
}
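
To see the failure and the fix side by side, here is a minimal sketch using only the six rows shown in the question. The names trucks_head and f_safe are made up for this illustration, and base R's split() plus lapply() stands in for the ddply() call so the snippet runs without plyr; it is not meant as the definitive way to do the grouping.

```r
# First six rows from the question (columns 1 and 5 of trucksv).
trucks_head <- data.frame(
  Measur. = c(1L, 2L, 3L, 4L, 5L, 1L),
  Speed.Mean.Trucks = c(NA, NA, 9.5, 4.5, NA, NA)
)

# An all-NA numeric vector leaves hist() with no finite data to bin,
# so it stops with an error; capture the message instead of aborting.
all_na <- c(NA_real_, NA_real_)
err <- tryCatch(hist(all_na, plot = FALSE),
                error = function(e) conditionMessage(e))
print(err)

# Hypothetical name: same logic as the modified f() in the answer,
# with column names added to the all-NA branch.
f_safe <- function(x) {
  y <- x[!is.na(x)]
  if (length(y) > 0) {
    hi <- hist(y, plot = FALSE)
    cbind(
      speedmph = round(hi$breaks * 0.68, 1),
      prob = c(0, round(cumsum(hi$counts) / sum(hi$counts), digits = 2))
    )
  } else {
    # One NA row per input value, mirroring the answer's else branch.
    cbind(speedmph = rep(NA_real_, length(x)),
          prob = rep(NA_real_, length(x)))
  }
}

# Assumption: split() + lapply() replaces ddply() for this demo only.
by_group <- lapply(
  split(trucks_head$Speed.Mean.Trucks, trucks_head$Measur.),
  f_safe
)
print(by_group[["1"]])  # group with only NA speeds
print(by_group[["3"]])  # group with one observed speed
```

Groups whose speeds are all NA come back as NA rows instead of raising an error, so the per-group loop completes.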
Licensed under: CC-BY-SA with attribution