In R why is factorial(100) displayed differently to prod(1:100)?

Question 1

Your test with all.equal does not do what you expect. all.equal compares only two values; a third positional argument is matched to tolerance, which sets how far apart the two values may be and still count as equal. In your call you therefore pass a tolerance of 100!, which makes the comparison come out TRUE even for absurdly different values:

> all.equal( 0, 1000000000, prod(as.double(1:100)) )
[1] TRUE
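A minimal sketch of the same effect with the tolerance named explicitly, which makes the positional matching easier to see. The values 1e9 and 1e10 are arbitrary choices for illustration; wrapping the call in isTRUE() is the usual way to get a plain logical back, since all.equal returns a character message when the values differ.

```r
# Without a tolerance argument the two values are reported as different:
# all.equal() returns a character message, so isTRUE() gives FALSE.
strict <- isTRUE(all.equal(0, 1e9))

# A huge third argument is taken as the tolerance, so anything passes.
loose_positional <- isTRUE(all.equal(0, 1e9, 1e10))
loose_named      <- isTRUE(all.equal(0, 1e9, tolerance = 1e10))

print(c(strict, loose_positional, loose_named))
```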

But even if you give it two arguments only, e.g.

all.equal( prod(1:100), factorial(100) )

it would still produce TRUE because the default tolerance is .Machine$double.eps ^ 0.5, i.e. the two operands only have to match to about 8 digits, which is definitely the case. On the other hand, if you set the tolerance to 0, then none of the three possible combinations compare equal:

> all.equal( prod(1:100), factorial(100), tolerance=0.0 )
[1] "Mean relative difference: 1.986085e-14"
> all.equal( prod(1:100), prod( as.double(1:100) ), tolerance=0.0 )
[1] "Mean relative difference: 5.22654e-16"
> all.equal( prod(as.double(1:100)), factorial(100), tolerance=0.0 )
[1] "Mean relative difference: 2.038351e-14"
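The figure all.equal reports for two scalars can be reproduced by hand, which shows how far below the default tolerance these differences sit. This is a sketch rather than the actual all.equal code; rel_diff and default_tol are names made up for the example, and the exact size of the difference may vary by platform.

```r
a <- prod(1:100)
b <- factorial(100)

# Relative difference of two scalars, as all.equal reports it
rel_diff <- abs(a - b) / abs(a)

# Default tolerance used by all.equal for numeric values (about 1.5e-8)
default_tol <- sqrt(.Machine$double.eps)

print(rel_diff)
print(rel_diff < default_tol)           # passes the default comparison
print(rel_diff / .Machine$double.eps)   # the gap in units of machine epsilon
```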

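Also note that telling R to print 200 significant digits doesn't make them all correct. The exact decimal expansion of 1/2^53, for example, runs to 53 decimal places, but a double only carries about 16 significant decimal digits, so in general only the first 16 or so digits of a printed double are meaningful.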

This also makes your comparison to the "true" value flawed. Observe this. The ending digits in what R gives you for factorial(100) are:

...01203456

You subtract n from it, where n is the "true" value of 100! so it should have 24 zeroes at the end and hence the difference should also end with the same digits that factorial(100) does. But rather it ends with:

...58520576

This only shows that all those digits are non-significant and one should not really look into their value.
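One way to see how many of the printed digits are noise is to look at the spacing between adjacent doubles near 100!. The sketch below assumes the usual IEEE 754 double with a 53-bit significand; the variable names are illustrative only.

```r
f <- factorial(100)

# Binary exponent of f: f lies between 2^e and 2^(e + 1)
e <- floor(log2(f))

# Gap between f and the next larger double (52 stored fraction bits)
ulp <- 2^(e - 52)

print(log10(ulp))        # the gap itself has well over 100 decimal digits
print(f + ulp / 4 == f)  # adding a quarter of the gap changes nothing
```

Any digits below the size of that gap cannot carry information about 100!, however many of them are printed.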

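Written out as an integer, 100! takes 525 bits. Even after factoring out its 97 trailing zero bits, which a floating-point exponent can absorb, 428 bits of significand remain. That is about 8x the 53 bits a double provides.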

Question 2

This has to do not with the maximum value for a double but with its precision.

100! has 158 decimal digits (134 of them before the 24 trailing zeroes). IEEE doubles (64 bit) have 52 bits of storage space for the mantissa, 53 counting the implicit leading bit, so you get rounding errors once about 16 decimal digits of precision have been exceeded.
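The precision limit can be checked directly in R. A small sketch, assuming standard IEEE 754 doubles:

```r
# Number of significand bits in a double (including the implicit bit)
sig_bits <- .Machine$double.digits
print(sig_bits)

# Above 2^53 not every integer is representable: adding 1 is lost to rounding
lost_increment <- (2^53 + 1 == 2^53)
print(lost_increment)

# Just below 2^53 integers are still exact
still_exact <- (2^53 - 1 != 2^53)
print(still_exact)
```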

Incidentally, 100! is in fact, as you suspected,

93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000

so all of the values R calculated are incorrect.

Now I don't know R, but it seems that all.equal() converts all three of those values to floats before comparing, and so their differences are lost.

Question 3

I will add a third answer just to show graphically the behaviour you are encountering. Essentially, double precision is enough to compute the factorial exactly up to 22!; beyond that the result diverges more and more from the real value.

Around 50! there is a further split between the two methods, factorial(x) and prod(1:x), with the latter yielding, as you indicated, values closer to the "real" factorial.
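The cutoff at 22! can be motivated without a big-integer library by counting how many significand bits x! needs once its trailing zero bits are set aside. This is only a sketch of the bit-counting argument: bits_needed is a hypothetical helper, and it uses lfactorial for the bit length, which is accurate enough here because no value lands close to the 53-bit boundary.

```r
# Significand bits required to hold x! exactly in a floating-point number
bits_needed <- function(x) {
  # Total bit length of x! as an integer
  bit_length <- floor(lfactorial(x) / log(2)) + 1
  # Number of trailing zero bits = power of 2 dividing x! (Legendre's formula)
  k <- seq_len(floor(log2(max(x, 2))))
  trailing_zeros <- sum(floor(x / 2^k))
  bit_length - trailing_zeros
}

sig_bits <- sapply(1:100, bits_needed)

# Largest x whose factorial still fits in the 53-bit significand of a double
largest_exact <- max(which(sig_bits <= .Machine$double.digits))
print(largest_exact)
```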

[Figure: factorial calculation precision in R]

Code attached:

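# Precision of factorial calculation (very important for the Fisher's Exact Test)
library(gmp)  # provides factorialZ() for exact big-integer factorials

# Note: despite their names, both numeric vectors below hold doubles;
# they differ only in how the value is computed.
perfectprecision<-list()        # exact values (bigz)
singleprecision<-numeric(100)   # factorial(x)
doubleprecision<-numeric(100)   # prod(1:x)
for (x in 1:100){
    perfectprecision[x][[1]]<-factorialZ(x)
    singleprecision[x]<-factorial(x)
    doubleprecision[x]<-prod(1:x)
}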


plot(0,col="white",xlim=c(1,100),ylim=c(0,log10(abs(doubleprecision[100]-singleprecision[100])+1)),
        ylab="Log10 Absolute Difference from Big Integer",xlab="x!")
for(x in 1:100) {
    points(x,log10(abs(perfectprecision[x][[1]]-singleprecision[x])+1),pch=16,col="blue")
    points(x,log10(abs(perfectprecision[x][[1]]-doubleprecision[x])+1),pch=20,col="red")
}
legend("topleft",col=c("blue","red"),legend=c("factorial(x)","prod(1:x)"),pch=c(16,20))

Question 4

Well, you can tell from the body of factorial that it calls gamma, which calls .Primitive("gamma"). What does .Primitive("gamma") look like? Like this.

For large inputs, .Primitive("gamma")'s behaviour is on line 198 of that code. It's calling

exp((y - 0.5) * log(y) - y + M_LN_SQRT_2PI +
            ((2*y == (int)2*y)? stirlerr(y) : lgammacor(y)));

which is Stirling's formula plus a correction term, so the result is just an approximation.
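The leading part of that expression is Stirling's formula for the gamma function. A rough R sketch of the idea follows; stirling_gamma is a made-up name, and the single 1/(12*y) term stands in for the more accurate stirlerr/lgammacor corrections that R's C code uses, so this is not a reimplementation of R's gamma.

```r
# Stirling-style approximation of gamma(y) for large y
stirling_gamma <- function(y, correct = TRUE) {
  # log of sqrt(2*pi) plays the role of M_LN_SQRT_2PI in the C source
  log_g <- (y - 0.5) * log(y) - y + 0.5 * log(2 * pi)
  # Leading term of the correction series (an assumption for illustration)
  if (correct) log_g <- log_g + 1 / (12 * y)
  exp(log_g)
}

# Relative error against R's own gamma() at y = 101, i.e. 100!
err_plain     <- abs(stirling_gamma(101, correct = FALSE) / gamma(101) - 1)
err_corrected <- abs(stirling_gamma(101, correct = TRUE)  / gamma(101) - 1)
print(c(err_plain, err_corrected))
```

Even with a good correction term the result is rounded to a double at the end, so it cannot match the exact integer value of 100!.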

By the way, the article on Rmpfr uses factorial as its example. So if you're trying to solve the problem, "just use the Rmpfr library".