R: Creating lapply() type test cases
Question
I've been working on code to create a parallel lapply()-style function that uses Amazon's Elastic Map Reduce engine as the 'grid' for processing (yes, it's a mapper with no reducer). After I get the code stable I'll abstract it as a foreach backend. But first I need to build tests for the code I have.
What would be some good test cases for this function?
My canonical test case right now is the following:
myList <- vector("list", 10)
set.seed(1)
for (i in 1:10) {
  # 999 random draws plus one NA, so the result depends on na.rm
  myList[[i]] <- c(rnorm(999), NA)
}
outputLocal <- lapply(myList, mean, na.rm = TRUE)
outputEmr <- emrlapply(myList, mean, myCluster, na.rm = TRUE)
all.equal(outputEmr, outputLocal)
This test case makes sure the optional argument na.rm is passed properly to the remote machines. What are some other test cases that I could be using? I don't currently support simplify or USE.NAMES arguments, although I will in the future.
Solution
What happens if you pass emrlapply:
- A list of character vectors
- An empty list
- A list that is only empty after all the NA values have been removed
- NULL
- A vector (lapply works with vectors)
- A matrix
- A data.frame
- A list of lists
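The inputs listed above could be collected into a small comparison harness along these lines. This is only a sketch: emrlapply_stub and count_present are hypothetical names introduced here, and the stub simply calls lapply so the code runs without a cluster. In a real test the stub body would be replaced by a call to emrlapply with the cluster object.

```r
# Hypothetical stand-in so the sketch runs locally; swap the body for
# emrlapply(X, FUN, myCluster, ...) when a cluster is available.
emrlapply_stub <- function(X, FUN, ...) lapply(X, FUN, ...)

# Hypothetical helper with an optional argument, so each case also checks
# that extra arguments are forwarded.
count_present <- function(x, drop.na = FALSE) {
  if (drop.na) x <- x[!is.na(x)]
  length(x)
}

# One entry per edge case from the list above.
cases <- list(
  character_vectors = list(c("a", "b"), c("c", NA)),
  empty_list        = list(),
  all_na_elements   = list(NA_real_, c(NA_real_, NA_real_)),
  null_input        = NULL,
  plain_vector      = 1:5,
  matrix_input      = matrix(1:6, nrow = 2),
  data_frame        = data.frame(x = c(1, NA, 3), y = c(4, 5, 6)),
  list_of_lists     = list(list(1, 2), list(3, list(4, 5)))
)

# Compare the remote-style result against plain lapply for every case.
results <- vapply(names(cases), function(nm) {
  expected <- lapply(cases[[nm]], count_present, drop.na = TRUE)
  actual   <- emrlapply_stub(cases[[nm]], count_present, drop.na = TRUE)
  isTRUE(all.equal(actual, expected))
}, logical(1))

print(results)
```

Using lapply itself as the reference keeps the expected values from being hand-written, which matters for cases such as a matrix or data.frame where the shape of the result is easy to get wrong.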
You also need a test to see whether your function fails gracefully when EMR is unavailable or when required packages are missing.
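A failure-mode test might look roughly like the following. Everything named here is an assumption for illustration: check_packages, emrlapply_unavailable, fails_with and the error messages are hypothetical, and the real emrlapply may report problems differently. The idea is only that the test asserts a clear error is raised rather than a hang or a silent wrong answer.

```r
# Hypothetical pre-flight check: stop with a readable message if any
# required package cannot be loaded.
check_packages <- function(required) {
  available <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- required[!available]
  if (length(missing_pkgs) > 0) {
    stop("Missing required packages: ",
         paste(missing_pkgs, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical stand-in that simulates an unreachable cluster.
emrlapply_unavailable <- function(X, FUN, cluster, ...) {
  stop("EMR cluster is not reachable", call. = FALSE)
}

# TRUE if evaluating expr raises an error whose message matches pattern.
fails_with <- function(expr, pattern) {
  tryCatch({
    force(expr)  # the lazily evaluated argument is forced inside tryCatch
    FALSE
  }, error = function(e) grepl(pattern, conditionMessage(e)))
}

pkg_test <- fails_with(check_packages("aPackageThatDoesNotExist123"),
                       "Missing required packages")
emr_test <- fails_with(emrlapply_unavailable(list(1), mean, NULL),
                       "not reachable")

print(c(missing_package = pkg_test, emr_unavailable = emr_test))
```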