Filtering df columns, systemIds + character + numeric, sapply, grepl to filter result

https://stackoverflow.com/questions/21763606

11-10-2022

Question

Building an example df for a question resulted in a second question. First, Q2:

Q2: Is there a more efficient way to generate a df of mixed data types? Here is my attempt:

 a<-seq(2218,2221,1)
 b<-rep(58,4)
 s<-rep(22,4)
 d<-sample((100:220),4)
 e<-letters[seq(1:4)]
 f<-gl(4,1,labels="F")
 g<-factor(rep("INSTRUMENT NOT CALIBRATED",4))
 i<-factor(rep("org / initials",4))
 t<-data.frame(a,b,s,d,e,f,g,i)
 colnames(t)<-c("bSystemId","cSystemId","lengthdecimal","heightquantity","desc","code","notes","createdBy"); head(t)
 sapply(t,class)

Q1: I'm filtering data frame columns, but combining the filter statements partially reverses the filtering.

These two statements give me the result I want:

 a<-head(t[sapply(t,is.numeric)]);a
 b<-a[,!grepl("SystemId",names(a))];b

Can these statements be combined to produce the same result? I've tried a few things but none of them work. For example:

 head(t[,!grepl("SystemId",names(t[sapply(t,is.numeric)]))])

Thanks for any comments.

Solution

You can do this (a very minor change to your code):

t[sapply(t,is.numeric) & !grepl("SystemId",names(t))]
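
To illustrate why the combined attempt in the question misbehaves while the line above works, here is a minimal sketch. The data frame demo is a hypothetical stand-in that reuses the question's column names, and is_num, short_mask, full_mask, wrong and right are names introduced only for this example. The point is that a logical index shorter than the number of columns is recycled, so a mask built from the numeric subset alone does not line up with the full data frame.

```r
# Hypothetical stand-in for the question's data frame (same column names)
demo <- data.frame(
  bSystemId      = 2218:2221,
  cSystemId      = 58,
  lengthdecimal  = 22,
  heightquantity = c(101, 150, 180, 220),
  desc           = letters[1:4],
  code           = gl(4, 1, labels = "F"),
  notes          = factor("INSTRUMENT NOT CALIBRATED"),
  createdBy      = factor("org / initials"),
  stringsAsFactors = FALSE
)

is_num <- sapply(demo, is.numeric)

# Question's attempt: the name test runs on the 4 numeric columns only,
# so the mask has length 4 and is recycled across all 8 columns.
short_mask <- !grepl("SystemId", names(demo[is_num]))
wrong <- names(demo[short_mask])

# Approach from the answer: both tests have one entry per column of demo,
# so the combined mask lines up with the full data frame.
full_mask <- is_num & !grepl("SystemId", names(demo))
right <- names(demo[full_mask])
```

Here wrong picks up notes and createdBy in addition to the two wanted columns, which matches the "partially reversed" filtering described in the question, while right contains only lengthdecimal and heightquantity.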

As for Q2, I don't have great suggestions. You could try using replicate to create a list of random vectors, and then use mapply to pair each one with a conversion function from a list of as.* functions. For example (untested):

df <- as.data.frame(
  mapply(
    # apply the i-th conversion function to the i-th random vector
    function(fun, col) fun(col),
    list(as.character, as.numeric, as.factor, as.logical, as.numeric),
    replicate(5, sample(1:10), simplify = FALSE),
    SIMPLIFY = FALSE
  ),
  stringsAsFactors = FALSE
)
names(df) <- paste0("V", 1:ncol(df))
sapply(df, class)
#          V1          V2          V3          V4          V5 
# "character"   "numeric"    "factor"   "logical"   "numeric"
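
A simpler alternative for Q2, not part of the original answer, is to pass named columns straight to data.frame and let it recycle the constant values. The sketch below assumes the same column names as the question; n and t2 are names introduced only for illustration, and set.seed is there just to make the sampled column reproducible.

```r
set.seed(1)  # only so the sampled heights are reproducible
n <- 4       # hypothetical number of rows

t2 <- data.frame(
  bSystemId      = 2218:2221,
  cSystemId      = 58,                      # length-1 values are recycled to n rows
  lengthdecimal  = 22,
  heightquantity = sample(100:220, n),
  desc           = letters[seq_len(n)],
  code           = gl(n, 1, labels = "F"),  # levels F1..F4, as in the question
  notes          = factor("INSTRUMENT NOT CALIBRATED"),
  createdBy      = factor("org / initials"),
  stringsAsFactors = FALSE                  # keep desc as character
)

sapply(t2, class)
```

This avoids the intermediate one-letter variables and the separate colnames call, at the cost of being less general than the mapply approach above.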

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow