Question

I'm new to R and I'd like to use it to perform Feature Selection on a dataset I have. I've found the FSelector package. I had a look at the manual but I have some doubts.

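library(FSelector)
data(iris)
# Score every predictor of Species with the Relief algorithm
weights <- relief(Species~., iris, neighbours.count = 5, sample.size = 20)
# Keep the names of the 2 highest-scoring features
subset <- cutoff.k(weights, 2)
# Build a formula with Species as the response and the selected features as predictors
f <- as.simple.formula(subset, "Species")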

This example calculates the importance of each variable using the Relief method. The last line creates a formula of the form "class ~ feature1 + feature2 + ... + featureN". Now, given the subset of selected features (a character vector), how can I create a new dataset from iris that contains only those variables (i.e. a matrix with 2 columns)?

Answer

If I understand it correctly, you can just take a subset of iris using the results from cutoff.k, since that returns a vector with the names of the variables you want to keep:

newdata <- iris[, cutoff.k(weights, 2)]

Here the [ operator is used to take a subset of iris, in this case only the columns whose names appear in the result of cutoff.k. Rows and columns are given in the order [rows, columns], so leaving the rows position empty keeps every row.
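As a minimal sketch of the indexing itself, the snippet below selects columns by name without needing FSelector installed. The vector `selected` is a hypothetical stand-in for whatever cutoff.k(weights, 2) returns on your data; the two names in it are chosen only for illustration.

```r
data(iris)

# Hypothetical stand-in for the character vector returned by cutoff.k(weights, 2)
selected <- c("Petal.Width", "Petal.Length")

# Empty rows position keeps all rows; the columns position picks columns by name
newdata <- iris[, selected]

# With a single name, [ would return a plain vector; drop = FALSE keeps a data.frame
one_col <- iris[, selected[1], drop = FALSE]

# Append the class column if the new dataset should still carry the labels
newdata_with_class <- iris[, c(selected, "Species")]

str(newdata)
```

Using drop = FALSE is worth considering whenever the number of selected features could be 1, since the result type otherwise changes silently.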

To get a matrix instead of a data.frame: as.matrix(iris[,cutoff.k(weights, 2)])
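One caveat with the matrix conversion, sketched below under the same assumption that `selected` is a hypothetical stand-in for the output of cutoff.k: as.matrix gives a numeric matrix only when every selected column is numeric. If the factor column Species is included, the whole matrix becomes character.

```r
data(iris)

# Hypothetical stand-in for the character vector returned by cutoff.k(weights, 2)
selected <- c("Petal.Width", "Petal.Length")

# All selected columns are numeric, so the result is a numeric matrix
feature_matrix <- as.matrix(iris[, selected, drop = FALSE])

# Including the factor column coerces every value to character
mixed_matrix <- as.matrix(iris[, c(selected, "Species")])

dim(feature_matrix)
is.numeric(feature_matrix)
is.character(mixed_matrix)
```

If both the numeric features and the labels are needed, keeping the labels in a separate vector such as iris$Species avoids the coercion.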

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow