R: Creating lapply() type test cases

https://stackoverflow.com/questions/3301078

26-09-2020

Question

I've been working on code to create a parallel lapply()-type function that uses Amazon's Elastic MapReduce engine as the 'grid' for processing (yes, it's a mapper with no reducer). Once the code is stable I'll abstract it as a foreach backend. But first I need to write tests for the code I have.

What would be some good test cases for this function?

My canonical test case right now is the following:

set.seed(1)
myList <- vector("list", 10)
for (i in seq_along(myList)) {
  # 999 random draws plus one NA, so na.rm must reach the remote machines
  myList[[i]] <- c(rnorm(999), NA)
}
outputLocal <- lapply(myList, mean, na.rm = TRUE)
outputEmr   <- emrlapply(myList, mean, myCluster, na.rm = TRUE)
all.equal(outputEmr, outputLocal)

This test case makes sure the optional argument na.rm=T is passed properly to the remote machines. What are some other test cases that I could be using? I don't currently support simplify or USE.NAMES arguments, although I will in the future.

Solution

What happens if you pass emrlapply each of the following?

A list of character vectors
An empty list
A list whose elements are empty only after all the NA values have been removed
NULL
A vector (lapply works with vectors)
A matrix
A data.frame
A list of lists
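
The inputs listed above could be collected into a single comparison against base lapply(). This is only a sketch: candidate_lapply is a hypothetical local stand-in so the code runs without a cluster, and the names in edge_cases are made up for illustration. In a real test the stand-in would be replaced by the emrlapply(X, FUN, myCluster, ...) call from the question.

```r
# Hypothetical stand-in so the sketch runs locally; swap in the real
# emrlapply(X, FUN, myCluster, ...) call when a cluster is available.
candidate_lapply <- function(X, FUN, ...) lapply(X, FUN, ...)

# One entry per input type suggested above (names are illustrative).
edge_cases <- list(
  character_vectors = list(c("a", "b"), c("c", "d", "e")),
  empty_list        = list(),
  all_na_elements   = list(c(NA_real_, NA_real_), NA_real_),
  null_input        = NULL,
  atomic_vector     = 1:5,
  matrix_input      = matrix(1:6, nrow = 2),
  data_frame        = data.frame(x = 1:3, y = c(2.5, NA, 4)),
  list_of_lists     = list(list(1, 2), list(3, list(4, 5)))
)

# length() is defined for every input above, so it is a safe probe function.
results <- vapply(edge_cases, function(x) {
  isTRUE(all.equal(candidate_lapply(x, length), lapply(x, length)))
}, logical(1))

# The all-NA case is the one where na.rm = TRUE leaves nothing to average.
na_case <- edge_cases$all_na_elements
na_ok <- isTRUE(all.equal(
  candidate_lapply(na_case, mean, na.rm = TRUE),
  lapply(na_case, mean, na.rm = TRUE)
))

print(results)
print(na_ok)
```

Any FALSE entry in results names the input type on which the two implementations disagree, which makes a failure easier to track down than a single combined check would.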

You also need a test to see whether your function gracefully handles EMR not being available or required packages being missing.
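
One way to phrase "fails gracefully" as a test is to check that the call raises an ordinary R error with a non-empty message. The sketch below assumes that is the desired behaviour. Both check_prerequisites and fails_cleanly are hypothetical helpers, not part of emrlapply, and the cluster_available flag stands in for whatever connectivity check the real code performs.

```r
# Hypothetical precondition check; the real function would probe EMR itself.
check_prerequisites <- function(cluster_available, packages) {
  if (!cluster_available) {
    stop("EMR cluster is not reachable", call. = FALSE)
  }
  # requireNamespace() returns FALSE (no error) when a package is absent.
  installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  absent <- packages[!installed]
  if (length(absent) > 0) {
    stop("Missing required packages: ", paste(absent, collapse = ", "),
         call. = FALSE)
  }
  invisible(TRUE)
}

# TRUE if evaluating expr raises an error that carries a readable message.
fails_cleanly <- function(expr) {
  tryCatch({
    force(expr)   # expr is lazy, so it is evaluated inside tryCatch
    FALSE
  }, error = function(e) nzchar(conditionMessage(e)))
}

no_cluster  <- fails_cleanly(check_prerequisites(FALSE, "stats"))
no_package  <- fails_cleanly(check_prerequisites(TRUE, "aPackageThatDoesNotExist0"))
all_present <- fails_cleanly(check_prerequisites(TRUE, "stats"))
```

Here no_cluster and no_package should be TRUE and all_present should be FALSE. The same fails_cleanly wrapper could be pointed at the real emrlapply call with the cluster stopped.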

Licensed under: CC-BY-SA with attribution
