Question

I loaded a data set called gob into R and tried the handy summary function. Note that the 3rd quartile is less than the mean. How can this be? Is it the size of my data or something else like that?

I already tried passing in a large value for the digits parameter (e.g. 10), and that does not resolve the issue.

> summary(gob, digits=10)

   customer_id         100101.D            100199.D            100201.D        
 Min.   :   1083   Min.   :0.0000000   Min.   :0.0000000   Min.   :0.0000000  
 1st Qu.: 965928   1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.0000000  
 Median :2448738   Median :0.0000000   Median :0.0000000   Median :0.0000000  
 Mean   :2660101   Mean   :0.0010027   Mean   :0.0013348   Mean   :0.0000878  
 3rd Qu.:4133368   3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.0000000  
 Max.   :6538193   Max.   :1.0000000   Max.   :1.0000000   Max.   :0.7520278  

Note that for gob$100201.D the mean is 0.0000878 but the 3rd Qu. = 0.


Solution

It is not a bug; your data just contains a lot of 0 values. For example, if I build x from twelve 0s and one 1, the 3rd quartile is smaller than the mean:

x <- c(0,0,0,0,0,0,0,0,0,0,0,0,1)
summary(x)

  Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.00000 0.00000 0.00000 0.07692 0.00000 1.00000 

Try using table() on your column to see the distribution of its values:

table(x)
 x
 0  1 
 12  1 
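The same effect can be checked directly: the 3rd quartile stays at 0 as long as zeros make up well over 75% of the values, while any positive value pulls the mean above 0. A minimal sketch, using a made-up vector y (hypothetical data, not the gob data set from the question):

```r
# Hypothetical data: 990 zeros plus 10 positive values between 0.1 and 1
y <- c(rep(0, 990), seq(0.1, 1, length.out = 10))

# Share of zeros; far above 0.75 here
zero_share <- mean(y == 0)

# The 3rd quartile falls inside the block of zeros
q3 <- unname(quantile(y, 0.75))

# The mean is pulled above zero by the few positive values
m <- mean(y)

print(c(zero_share = zero_share, q3 = q3, mean = m))
```

Running mean(column == 0) on a real column gives a quick indication of whether this is what is happening.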

Other tips

The 3rd quartile can be lower than the mean. It is not 75% of the highest value, but the value found 75% of the way through the vector once it is sorted from lowest to highest. In other words:

Vector <- c(0,0,0,0,0,0,0,1)
mean(Vector)
[1] 0.125
quantile(Vector, 0.75)
[1] 0

To find the 3rd quartile, R sorts the data from lowest to highest, then picks the value at roughly 75% of the length of that vector. So basically:

third_quartile <- sort(Vector)[round(length(Vector) * 0.75)]

(Note that if the position lands between two whole numbers, R's default method interpolates linearly between the two neighbouring values rather than picking one of them. But this is the basic idea.)
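For the case where the position falls between two elements, the calculation can be written out by hand. The sketch below assumes R's default quantile method (type = 7), which places the quantile at position (n - 1) * p + 1 in the sorted vector and interpolates linearly; manual_quantile is a hypothetical helper name used only for illustration.

```r
# Hand-rolled version of the default (type = 7) quantile rule, for one probability p
manual_quantile <- function(v, p) {
  s <- sort(v)                 # order from lowest to highest
  n <- length(s)
  h <- (n - 1) * p + 1         # fractional position in the sorted vector
  lo <- floor(h)
  hi <- ceiling(h)
  # linear interpolation between the two neighbouring values
  s[lo] + (h - lo) * (s[hi] - s[lo])
}

Vector <- c(0, 0, 0, 0, 0, 0, 0, 1)

# Position is 6.25, between the 6th and 7th sorted values, which are both 0
manual_quantile(Vector, 0.75)

# Compare with the built-in function
unname(quantile(Vector, 0.75))

# A case where the interpolation matters: position 3.25 between 3 and 4
manual_quantile(c(1, 2, 3, 4), 0.75)
```

Other values of the type argument to quantile() use different position rules, so the helper only matches the default.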

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow