Outlier Analysis for Ordinal Data

https://stackoverflow.com/questions/22540327

18-06-2023

Question

In short, I conducted a satisfaction survey in which respondents were asked to answer on a satisfaction scale from 1 to 7.

Here is an example of what the (jittered) scatterplot of two variables from the data set looks like (I am working in R):


https://drive.google.com/uc?export=download&id=0Bx2Sns2vaI9ycm1tV2pNSWUxQXc

The data set therefore consists of ordinal data, on which I want to conduct an outlier analysis.

What would you suggest as the best outlier analysis approach for this type of data, and how can it be implemented in R?

Thank you so much in advance,

Deuterium

Solution

Your data look something like this:

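# counts of each x level (1 to 7)
x = rep(1:7, c(3, 4, 17, 21, 48, 118, 93))
# y values, one block per x level; each block's size matches the x count above
y = c(
    rep(1:7, c(1, 2, 0, 0, 0, 0, 0)),
    rep(1:7, c(2, 0, 1, 1, 0, 0, 0)),
    rep(1:7, c(10, 3, 2, 1, 0, 0, 1)),
    rep(1:7, c(15, 3, 1, 1, 1, 0, 0)),
    rep(1:7, c(20, 10, 2, 10, 3, 2, 1)),
    rep(1:7, c(40, 20, 20, 30, 3, 4, 1)),
    rep(1:7, c(50, 25, 10, 5, 3, 0, 0))
)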
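
Because both variables take only seven values, a cross-tabulation is a quick way to check that the two reconstructed vectors line up and to see which answer combinations are rare. The following is a minimal base R sketch; the name `tab` is illustrative and not part of the question.

```r
# Same reconstructed data as above
x <- rep(1:7, c(3, 4, 17, 21, 48, 118, 93))
y <- c(
    rep(1:7, c(1, 2, 0, 0, 0, 0, 0)),
    rep(1:7, c(2, 0, 1, 1, 0, 0, 0)),
    rep(1:7, c(10, 3, 2, 1, 0, 0, 1)),
    rep(1:7, c(15, 3, 1, 1, 1, 0, 0)),
    rep(1:7, c(20, 10, 2, 10, 3, 2, 1)),
    rep(1:7, c(40, 20, 20, 30, 3, 4, 1)),
    rep(1:7, c(50, 25, 10, 5, 3, 0, 0))
)

# Both vectors must describe the same respondents
stopifnot(length(x) == length(y))

# 7 x 7 table of joint frequencies (rows: x, columns: y)
tab <- table(x, y)
print(tab)

# Answer combinations observed exactly once (row/column indices)
which(tab == 1, arr.ind = TRUE)
```

Cells with very low counts are only candidates for a closer look; whether they count as outliers depends on the method chosen.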

The plot:

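library(car)
# car >= 3.0 names these arguments smooth= and regLine=; older versions used smoother= and reg.line=
scatterplot(x, y, jitter = list(x = 0.8, y = 0.8), smooth = FALSE, regLine = FALSE)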


If you just want to know whether a given value is an outlier in your data (i.e. a univariate outlier analysis), you can use Grubbs' test from the outliers package:

library(outliers)
grubbs.test(x)

Or simply use boxplot() to list the values it would plot as outliers:

boxplot(x, plot = FALSE)$out
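
To make the two univariate checks above more concrete, here is a rough base R sketch of what they measure: the Grubbs statistic (largest deviation from the mean in standard deviation units) and the boxplot rule (values more than 1.5 times the hinge spread beyond the hinges). The names `G`, `most_extreme`, `fences` and `flagged` are illustrative, and the sketch computes only the statistic, not the p-value that grubbs.test() reports.

```r
# Same x as in the answer above
x <- rep(1:7, c(3, 4, 17, 21, 48, 118, 93))

# Grubbs statistic: largest absolute deviation from the mean, in SD units
dev <- abs(x - mean(x))
G <- max(dev) / sd(x)
most_extreme <- x[which.max(dev)]  # the value the test would examine

# Boxplot rule: boxplot() uses Tukey's hinges (fivenum), not quantile()
hinges <- fivenum(x)[c(2, 4)]
fences <- hinges + c(-1.5, 1.5) * diff(hinges)
flagged <- x[x < fences[1] | x > fences[2]]

# Which scale values were flagged, and how often
table(flagged)
```

Grubbs' test assumes approximately normal data, so on a 7-point scale its result is best read as a rough indication rather than a firm verdict.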

If you need multivariate outliers, you can use the mvoutlier package (see functions ?chisq.plot and ?pcout):

library(mvoutlier)
res = pcout(x = data.frame(x, y))
# wfinal01 is 0 for observations flagged as potential outliers
which(res$wfinal01 == 0)
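
For intuition about the distance-based idea behind chisq.plot, the sketch below flags points whose squared Mahalanobis distance exceeds a chi-squared quantile. It is not what mvoutlier does internally: it uses the classical mean and covariance from base R instead of robust estimates, and the 0.975 cutoff and the names `dat`, `d2`, `cutoff` and `flagged` are assumptions made for illustration.

```r
# Same reconstructed data as in the answer above
x <- rep(1:7, c(3, 4, 17, 21, 48, 118, 93))
y <- c(
    rep(1:7, c(1, 2, 0, 0, 0, 0, 0)),
    rep(1:7, c(2, 0, 1, 1, 0, 0, 0)),
    rep(1:7, c(10, 3, 2, 1, 0, 0, 1)),
    rep(1:7, c(15, 3, 1, 1, 1, 0, 0)),
    rep(1:7, c(20, 10, 2, 10, 3, 2, 1)),
    rep(1:7, c(40, 20, 20, 30, 3, 4, 1)),
    rep(1:7, c(50, 25, 10, 5, 3, 0, 0))
)
dat <- as.matrix(data.frame(x, y))

# Squared Mahalanobis distance of each row from the (classical) center
d2 <- mahalanobis(dat, center = colMeans(dat), cov = cov(dat))

# Assumed cutoff: 97.5% quantile of a chi-squared with one df per variable
cutoff <- qchisq(0.975, df = ncol(dat))
flagged <- d2 > cutoff

# Distinct answer combinations that exceed the cutoff
unique(dat[flagged, , drop = FALSE])
```

Because the classical estimates are themselves pulled toward extreme points, this version can miss outliers that the robust methods in mvoutlier would flag.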

Licensed under: CC-BY-SA with attribution
