Question

Instead of trying to remove outliers from a data set, I am trying to create a new data frame consisting only of the rows that have outliers in them.

I was able to column-bind the averages and standard deviations of the different groups onto the end of the data set. Now, I have tried this code to produce a table of outlier data:

Outliers <- Sample[((Sample$x - Sample$Averages)/Sample$StDevs) > 2.00,]

This runs, but produces an empty table for Outliers. I tested some individual values from the data to make sure outliers exist, and they do. If I specify a single row, the calculation above does produce a Boolean value. The problem appears only when I try to collect these outliers in a table. I also tried initializing Outliers as a data.frame or data.table first, but that did not work either (probably just because I am new to R).

For example, when I run

((Sample$x[3] - Sample$Averages[3])/Sample$StDevs[3]) > 2

it returns TRUE. This is good. Why, then, do I get an empty table of outliers when I simply want to KEEP everything in Sample where this condition is true? I do not feel that this should be a difficult problem, but I cannot for the life of me get it to work.

Any suggestions? Thanks in advance!

Solution

Sample[0, ] should give you an empty data frame with no rows and the same column names.
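
As a minimal sketch of both pieces, the snippet below builds a small made-up data frame (the column names mirror the question, but the values are hypothetical), takes the zero-row frame with Sample[0, ], and then applies the row filter from the question. Wrapping the condition in which() is an addition not found in the original post; it only keeps rows where the comparison is TRUE and skips any rows where it evaluates to NA.

```r
# Hypothetical toy data: column names follow the question, values are invented.
Sample <- data.frame(
  x        = c(10, 11, 30, 9, 12),
  Averages = c(10, 10, 10, 10, 10),
  StDevs   = c(5, 5, 5, 5, 5)
)

# Zero-row data frame that keeps the same column names.
empty <- Sample[0, ]

# Standardized distance of each x from its group average.
z <- (Sample$x - Sample$Averages) / Sample$StDevs

# Keep only the rows more than 2 standard deviations above the average.
# which() returns the positions of TRUE values and ignores NA comparisons.
Outliers <- Sample[which(z > 2), ]
```

With this toy data, only the third row (x = 30) exceeds the threshold. Using abs(z) > 2 instead would also catch values far below the average, if that is wanted.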

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow