Question

I have two related questions -- I'm trying to learn R properly, so I'm doing some homework problems from an R course. They have us writing a function to return a vector of correlations:

example.function <- function(threshold = 0) {
  example.vector <- vector()
  example.vector <- sapply(1:30, function(i) {
    complete.record.count <- # ... counts the complete records in each of the 30 files.
    ## Cutting for space and to avoid giving away answers.
    ## a few lines get the complete records in each 
    ## file and count them. 
    if(complete.record.count > threshold) {
      new.correlation <- cor(complete.record$val1, complete.record$val2)
      print(new.correlation)
      example.vector <- c(new.correlation, example.vector)
    }  
  })
  # more null value handling#
  return(example.vector)
}

As the function runs, it prints each correlation value to stdout. The printed values are accurate to six decimal places, so I know I'm getting a good value for new.correlation. The returned vector doesn't include those values, though. Instead, it contains whole numbers in sequence.

> tmp <- example.function()
> head(tmp)
[1] 2 3 4 5 6 7

I can't figure out why sapply is pushing integers into the vector. What am I missing here?

I actually don't understand the core structure, which is more or less:

some.vector <- vector()
some.vector <- sapply(range, function(i) {
  some.vector <- c(new.value,some.vector)
})

That seems awfully un-R-like in its redundancy. Tips?


Solution

If you use sapply, you don't need to create the vector yourself or grow it: sapply collects the return value of each call into the result for you. You probably want something like this:

example.function <- function(threshold = 0) {
  example.vector <- sapply(1:30, function(i) {
    ## Cutting for space and to avoid giving away answers.
    ## a few lines get the complete records in each 
    ## file and count them. 
    if (complete.record.count > threshold) {
      new.correlation <- cor(complete.record$val1, complete.record$val2)
    } else {
      new.correlation <- NA
    }
    new.correlation # return value of anonymous function
  })
  # more null value handling#
  example.vector #return value of example.function
}
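
To illustrate why neither the pre-created vector nor the inner assignment is needed, here is a minimal sketch on toy data. The names `outer.vector` and `squares` are made up for illustration and are not part of the original homework. It shows two things: an assignment with `<-` inside the anonymous function only creates a local variable, and what sapply collects is the value each call returns.

```r
outer.vector <- vector()

# sapply() collects the value returned by each call of the anonymous function.
squares <- sapply(1:5, function(i) {
  # This assignment creates a variable local to this call;
  # the outer.vector defined above is left untouched.
  outer.vector <- c(i^2, outer.vector)
  i^2  # last expression evaluated = return value for this i
})

print(squares)               # 1 4 9 16 25
print(length(outer.vector))  # 0: the outer vector never grew
```

So the result has to come from the function's return value, not from side effects on a vector defined outside it.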

However, it is unclear how the index i factors into the anonymous function and the question is not reproducible ...
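
Since the original code cannot be run as posted, the following is only a sketch under stated assumptions: the 30 files are replaced by a hypothetical list of simulated data frames (`fake.files`) with columns `val1` and `val2`, and `i` is assumed to select one of them. `complete.cases()` and `cor()` are standard R functions; everything else is invented for illustration and is not the course's actual solution.

```r
set.seed(1)  # make the simulated data repeatable

# Hypothetical stand-in for the 30 files: a list of data frames with some NAs.
fake.files <- lapply(1:30, function(i) {
  d <- data.frame(val1 = rnorm(20), val2 = rnorm(20))
  d$val1[sample(20, i %% 5)] <- NA  # knock out 0 to 4 values per "file"
  d
})

example.function <- function(threshold = 0) {
  sapply(1:30, function(i) {
    d <- fake.files[[i]]                       # assumed use of the index i
    complete.record <- d[complete.cases(d), ]  # keep rows without any NA
    if (nrow(complete.record) > threshold) {
      cor(complete.record$val1, complete.record$val2)
    } else {
      NA_real_  # placeholder so the result stays a numeric vector
    }
  })
}

tmp <- example.function(threshold = 17)
length(tmp)      # 30, one entry per "file"
sum(is.na(tmp))  # number of "files" at or below the threshold
```

The `else` branch matters here: an `if` without `else` evaluates to `NULL` when the condition is false, and sapply would then return a list instead of a vector.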

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow