Question

The following is the head of my data:

dput(head(trucksv[,c(1,5)]))
structure(list(Measur. = c(1L, 2L, 3L, 4L, 5L, 1L), Speed.Mean.Trucks = c(NA, 
NA, 9.5, 4.5, NA, NA)), .Names = c("Measur.", "Speed.Mean.Trucks"
), row.names = c(1L, 2L, 3L, 4L, 5L, 17L), class = "data.frame")

I want to find the cumulative distribution of speeds by 'Measur.', for which I wrote the following function:

f <- function(x) {
  hi <- hist(x)
  speedmph=round(hi$breaks*0.68,1)
  prob=c(0, round(cumsum(hi$counts)/sum(hi$counts),digits=2))
  cbind(speedmph, prob)
}

But when I try to apply it to my data, R gives me the following error:

tspdistu <- ddply(trucksv, 'Measur.', summarise, trucksspeedmph = f(Speed.Mean.Trucks)) 
Error in hist.default(x) : invalid number of 'breaks'
Called from: top level 
Browse[1]> 

I am not sure how to find the correct number of bins. Please help. Thanks in advance.


Solution

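The NAs are throwing it off; it has nothing to do with the number of bins. hist() drops missing values before binning, so a 'Measur.' group whose speeds are all NA leaves it with nothing to compute breaks from. Here's a slightly modified f() that disables plotting in hist() (it's unlikely you want plots) and handles a column subset that is all NA: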

f <- function(x) {
  y <- x[!is.na(x)]  # drop missing speeds
  if (length(y) > 0) {
    hi <- hist(y, plot = FALSE)  # compute the bins without drawing
    speedmph <- round(hi$breaks * 0.68, 1)
    prob <- c(0, round(cumsum(hi$counts) / sum(hi$counts), digits = 2))
    cbind(speedmph, prob)
  } else { # still need to return proper sized values
    cbind(rep(NA, length(x)), rep(NA, length(x)))
  }
}
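
To see the failure and the fix in isolation, here is a minimal base R sketch. It makes a few assumptions that are not in the original post: the extra rows in the toy data frame are made up, the per-group step uses split() and lapply() instead of ddply, and all-NA groups return NULL rather than an NA matrix. Treat it as an illustration, not a drop-in replacement.

```r
# Toy data modelled on the question's dput() output; rows 7 and 8 are made up.
trucksv <- data.frame(
  Measur. = c(1L, 2L, 3L, 4L, 5L, 1L, 3L, 4L),
  Speed.Mean.Trucks = c(NA, NA, 9.5, 4.5, NA, NA, 12.0, 6.5)
)

# Same idea as f() above, but all-NA groups return NULL (an assumption made
# here to keep the list output easy to filter).
f <- function(x) {
  y <- x[!is.na(x)]
  if (length(y) == 0) {
    return(NULL)  # nothing to summarise for this group
  }
  hi <- hist(y, plot = FALSE)
  cbind(speedmph = round(hi$breaks * 0.68, 1),
        prob = c(0, round(cumsum(hi$counts) / sum(hi$counts), digits = 2)))
}

# An all-NA vector on its own reproduces the original error; capture the message.
err <- tryCatch(hist(c(NA_real_, NA_real_), plot = FALSE),
                error = function(e) conditionMessage(e))

# One result per value of Measur.; groups with no usable speeds come back NULL.
by_group <- lapply(split(trucksv$Speed.Mean.Trucks, trucksv$Measur.), f)
```

Here by_group is a named list keyed by the 'Measur.' value, so by_group[["3"]] holds the two-column matrix for measurement 3, and Filter(Negate(is.null), by_group) drops the empty groups.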
Licensed under: CC-BY-SA with attribution