Question

The issue of dropping unused factor levels when subsetting has come up before. Common solutions include using character vectors where possible by declaring

options(stringsAsFactors = FALSE)

Sometimes, though, ordered factors are necessary for plotting, in which case we can use convenience functions like droplevels to create a wrapper for subset:

subsetDrop <- function(...){droplevels(subset(...))}
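To make the effect concrete, here is a small sketch comparing plain subset with the wrapper. The data frame df and its columns are made up for illustration; only subsetDrop comes from the text above.

```r
# Wrapper from the question: subset, then drop unused factor levels.
subsetDrop <- function(...) droplevels(subset(...))

# Hypothetical toy data with one ordered factor.
df <- data.frame(
  size  = factor(c("low", "mid", "high", "low"),
                 levels = c("low", "mid", "high"), ordered = TRUE),
  value = c(1, 2, 3, 4)
)

plain   <- subset(df, size != "mid")      # still carries the unused level "mid"
dropped <- subsetDrop(df, size != "mid")  # unused level removed

levels(plain$size)
levels(dropped$size)
is.ordered(dropped$size)
```

The remaining levels keep their original relative order, which is the property that matters for plotting.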

I realize that subsetDrop mostly solves this problem, but there are some situations where subsetting via [ is more convenient (and less typing!).

My question is how much further, for the sake of convenience, we can push this toward being the 'default' behavior of R by overriding [ for data frames so that it automatically drops unused factor levels. For instance, the Hmisc package contains dropUnusedLevels, which overrides [.factor for subsetting a single factor (this is no longer necessary, since the default [.factor appears to have a drop argument for dropping unused levels). I'm looking for a similar solution that would let me subset data frames using [ while automatically dropping unused factor levels (and, of course, preserving order in the case of ordered factors).
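For reference, the kind of override being asked about might look roughly like the sketch below. It is only an illustration under stated assumptions, not a recommended or complete solution: it masks [.data.frame by defining a method of the same name in the workspace and delegates the real work to the base method. The toy data frame df is hypothetical, and as far as I understand S3 dispatch, the masking mainly affects [ calls made from the workspace, not calls made inside package code.

```r
# Hypothetical override defined in the workspace; it delegates to the base
# method and then drops unused levels from the result.
`[.data.frame` <- function(x, ...) {
  res <- base::`[.data.frame`(x, ...)
  # Only data frames and bare factors carry levels worth dropping.
  if (is.data.frame(res) || is.factor(res)) droplevels(res) else res
}

# Made-up data with an ordered factor.
df <- data.frame(
  size  = factor(c("low", "mid", "high", "low"),
                 levels = c("low", "mid", "high"), ordered = TRUE),
  value = c(1, 2, 3, 4)
)

sub <- df[df$size != "mid", ]  # row subset via [ instead of subset()
levels(sub$size)
is.ordered(sub$size)
```

Removing the definition again with rm("[.data.frame") should restore the base behavior.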

No correct solution

Licensed under: CC-BY-SA with attribution