I'm looking for the fastest way to obtain the value of π, as a personal challenge. More specifically, I'm using ways that don't involve using #define constants like M_PI, or hard-coding the number in.

The program below tests the various ways I know of. The inline assembly version is, in theory, the fastest option, though clearly not portable. I've included it as a baseline to compare against the other versions. In my tests, with built-ins, the 4 * atan(1) version is fastest on GCC 4.2, because it auto-folds the atan(1) into a constant. With -fno-builtin specified, the atan2(0, -1) version is fastest.

Here's the main testing program (pitimes.c):

#include <math.h>
#include <stdio.h>
#include <time.h>

#define ITERS 10000000
#define TESTWITH(x) {                                                       \
    diff = 0.0;                                                             \
    time1 = clock();                                                        \
    for (i = 0; i < ITERS; ++i)                                             \
        diff += (x) - M_PI;                                                 \
    time2 = clock();                                                        \
    printf("%s\t=> %e, time => %f\n", #x, diff, diffclock(time2, time1));   \

static inline double
diffclock(clock_t time1, clock_t time0)
    return (double) (time1 - time0) / CLOCKS_PER_SEC;

    int i;
    clock_t time1, time2;
    double diff;

    /* Warmup. The atan2 case catches GCC's atan folding (which would
     * optimise the ``4 * atan(1) - M_PI'' to a no-op), if -fno-builtin
     * is not used. */
    TESTWITH(4 * atan(1))
    TESTWITH(4 * atan2(1, 1))

#if defined(__GNUC__) && (defined(__i386__) || defined(__amd64__))
    extern double fldpi();

    /* Actual tests start here. */
    TESTWITH(atan2(0, -1))
    TESTWITH(2 * asin(1))
    TESTWITH(4 * atan2(1, 1))
    TESTWITH(4 * atan(1))

    return 0;

And the inline assembly stuff (fldpi.c) that will only work for x86 and x64 systems:

    double pi;
    asm("fldpi" : "=t" (pi));
    return pi;

And a build script that builds all the configurations I'm testing (

gcc -O3 -Wall -c           -m32 -o fldpi-32.o fldpi.c
gcc -O3 -Wall -c           -m64 -o fldpi-64.o fldpi.c

gcc -O3 -Wall -ffast-math  -m32 -o pitimes1-32 pitimes.c fldpi-32.o
gcc -O3 -Wall              -m32 -o pitimes2-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -fno-builtin -m32 -o pitimes3-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -ffast-math  -m64 -o pitimes1-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall              -m64 -o pitimes2-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall -fno-builtin -m64 -o pitimes3-64 pitimes.c fldpi-64.o -lm

Apart from testing between various compiler flags (I've compared 32-bit against 64-bit too because the optimizations are different), I've also tried switching the order of the tests around. But still, the atan2(0, -1) version still comes out on top every time.

Was it helpful?


The Monte Carlo method, as mentioned, applies some great concepts but it is, clearly, not the fastest, not by a long shot, not by any reasonable measure. Also, it all depends on what kind of accuracy you are looking for. The fastest π I know of is the one with the digits hard coded. Looking at Pi and Pi[PDF], there are a lot of formulae.

Here is a method that converges quickly — about 14 digits per iteration. PiFast, the current fastest application, uses this formula with the FFT. I'll just write the formula, since the code is straightforward. This formula was almost found by Ramanujan and discovered by Chudnovsky. It is actually how he calculated several billion digits of the number — so it isn't a method to disregard. The formula will overflow quickly and, since we are dividing factorials, it would be advantageous then to delay such calculations to remove terms.

enter image description here

enter image description here


enter image description here

Below is the Brent–Salamin algorithm. Wikipedia mentions that when a and b are "close enough" then (a + b)² / 4t will be an approximation of π. I'm not sure what "close enough" means, but from my tests, one iteration got 2 digits, two got 7, and three had 15, of course this is with doubles, so it might have an error based on its representation and the true calculation could be more accurate.

let pi_2 iters =
    let rec loop_ a b t p i =
        if i = 0 then a,b,t,p
            let a_n = (a +. b) /. 2.0 
            and b_n = sqrt (a*.b)
            and p_n = 2.0 *. p in
            let t_n = t -. (p *. (a -. a_n) *. (a -. a_n)) in
            loop_ a_n b_n t_n p_n (i - 1)
    let a,b,t,p = loop_ (1.0) (1.0 /. (sqrt 2.0)) (1.0/.4.0) (1.0) iters in
    (a +. b) *. (a +. b) /. (4.0 *. t)

Lastly, how about some pi golf (800 digits)? 160 characters!

int a=10000,b,c=2800,d,e,f[2801],g;main(){for(;b-c;)f[b++]=a/5;for(;d=0,g=c*2;c-=14,printf("%.4d",e+d/a),e=d%a)for(b=c;d+=f[b]*a,f[b]=d%--g,d/=g--,--b;d*=b);}


I really like this program, because it approximates π by looking at its own area.

IOCCC 1988 : westley.c

#define _ -F<00||--F-OO--;
int F=00,OO=00;main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()

Here's a general description of a technique for calculating pi that I learnt in high school.

I only share this because I think it is simple enough that anyone can remember it, indefinitely, plus it teaches you the concept of "Monte-Carlo" methods -- which are statistical methods of arriving at answers that don't immediately appear to be deducible through random processes.

Draw a square, and inscribe a quadrant (one quarter of a semi-circle) inside that square (a quadrant with radius equal to the side of the square, so it fills as much of the square as possible)

Now throw a dart at the square, and record where it lands -- that is, choose a random point anywhere inside the square. Of course, it landed inside the square, but is it inside the semi-circle? Record this fact.

Repeat this process many times -- and you will find there is a ratio of the number of points inside the semi-circle versus the total number thrown, call this ratio x.

Since the area of the square is r times r, you can deduce that the area of the semi circle is x times r times r (that is, x times r squared). Hence x times 4 will give you pi.

This is not a quick method to use. But it's a nice example of a Monte Carlo method. And if you look around, you may find that many problems otherwise outside your computational skills can be solved by such methods.

In the interests of completeness, a C++ template version, which, for an optimised build, will compute an approximation of PI at compile time, and will inline to a single value.

#include <iostream>

template<int I>
struct sign
    enum {value = (I % 2) == 0 ? 1 : -1};

template<int I, int J>
struct pi_calc
    inline static double value ()
        return (pi_calc<I-1, J>::value () + pi_calc<I-1, J+1>::value ()) / 2.0;

template<int J>
struct pi_calc<0, J>
    inline static double value ()
        return (sign<J>::value * 4.0) / (2.0 * J + 1.0) + pi_calc<0, J-1>::value ();

struct pi_calc<0, 0>
    inline static double value ()
        return 4.0;

template<int I>
struct pi
    inline static double value ()
        return pi_calc<I, I>::value ();

int main ()
    std::cout.precision (12);

    const double pi_value = pi<10>::value ();

    std::cout << "pi ~ " << pi_value << std::endl;

    return 0;

Note for I > 10, optimised builds can be slow, likewise for non-optimised runs. For 12 iterations I believe there are around 80k calls to value() (in the absence of memoisation).

There's actually a whole book dedicated (amongst other things) to fast methods for the computation of \pi: 'Pi and the AGM', by Jonathan and Peter Borwein (available on Amazon).

I studied the AGM and related algorithms quite a bit: it's quite interesting (though sometimes non-trivial).

Note that to implement most modern algorithms to compute \pi, you will need a multiprecision arithmetic library (GMP is quite a good choice, though it's been a while since I last used it).

The time-complexity of the best algorithms is in O(M(n)log(n)), where M(n) is the time-complexity for the multiplication of two n-bit integers (M(n)=O(n log(n) log(log(n))) using FFT-based algorithms, which are usually needed when computing digits of \pi, and such an algorithm is implemented in GMP).

Note that even though the mathematics behind the algorithms might not be trivial, the algorithms themselves are usually a few lines of pseudo-code, and their implementation is usually very straightforward (if you chose not to write your own multiprecision arithmetic :-) ).

The following answers precisely how to do this in the fastest possible way -- with the least computing effort. Even if you don't like the answer, you have to admit that it is indeed the fastest way to get the value of PI.

The FASTEST way to get the value of Pi is:

1) chose your favourite programming language 2) load its Math library 3) and find that Pi is already defined there -- ready for use!

In case you don't have a Math library at hand..

The SECOND FASTEST way (more universal solution) is:

look up Pi on the Internet, e.g. here: (1 million digits .. what's your floating point precision? )

or here:

or here:

It's really fast to find the digits you need for whatever precision arithmetic you would like to use, and by defining a constant, you can make sure that you don't waste precious CPU time.

Not only is this a partly humorous answer, but in reality, if anybody would go ahead and compute the value of Pi in a real application .. that would be a pretty big waste of CPU time, wouldn't it? At least I don't see a real application for trying to re-compute this.

Dear Moderator: please note that the OP asked: "Fastest Way to get the value of PI"

The BBP formula allows you to compute the nth digit - in base 2 (or 16) - without having to even bother with the previous n-1 digits first :)

Instead of defining pi as a constant, I always use acos(-1).

Just came across this one that should be here for completeness:

calculate PI in Piet

It has the rather nice property that the precision can be improved making the program bigger.

Here's some insight into the language itself

If this article is true, then the algorithm that Bellard has created could be one of the speediest available. He has created pi to 2.7 TRILLION digits using a DESKTOP PC!

...and he has published his work here

Good work Bellard, You are a pioneer!

This is a "classic" method, very easy to implement. This implementation, in python (not so fast language) does it:

from math import pi
from time import time

precision = 10**6 # higher value -> higher precision
                  # lower  value -> higher speed

t = time()

calc = 0
for k in xrange(0, precision):
    calc += ((-1)**k) / (2*k+1.)
calc *= 4. # this is just a little optimization

t = time()-t

print "Calculated: %.40f" % calc
print "Costant pi: %.40f" % pi
print "Difference: %.40f" % abs(calc-pi)
print "Time elapsed: %s" % repr(t)

You can find more information here.

Anyway the fastest way to get a precise as-much-as-you-want value of pi in python is:

from gmpy import pi
print pi(3000) # the rule is the same as 
               # the precision on the previous code

here is the piece of source for the gmpy pi method, I don't think the code is as much useful as the comment in this case:

static char doc_pi[]="\
pi(n): returns pi with n bits of precision in an mpf object\n\

/* This function was originally from netlib, package bmp, by
 * Richard P. Brent. Paulo Cesar Pereira de Andrade converted
 * it to C and used it in his LISP interpreter.
 * Original comments:
 *   sets mp pi = 3.14159... to the available precision.
 *   uses the gauss-legendre algorithm.
 *   this method requires time o(ln(t)m(t)), so it is slower
 *   than mppi if m(t) = o(t**2), but would be faster for
 *   large t if a faster multiplication algorithm were used
 *   (see comments in mpmul).
 *   for a description of the method, see - multiple-precision
 *   zero-finding and the complexity of elementary function
 *   evaluation (by r. p. brent), in analytic computational
 *   complexity (edited by j. f. traub), academic press, 1976, 151-176.
 *   rounding options not implemented, no guard digits used.
static PyObject *
Pygmpy_pi(PyObject *self, PyObject *args)
    PympfObject *pi;
    int precision;
    mpf_t r_i2, r_i3, r_i4;
    mpf_t ix;

    ONE_ARG("pi", "i", &precision);
    if(!(pi = Pympf_new(precision))) {
        return NULL;

    mpf_set_si(pi->f, 1);

    mpf_set_ui(ix, 1);

    mpf_init2(r_i2, precision);

    mpf_init2(r_i3, precision);
    mpf_set_d(r_i3, 0.25);

    mpf_init2(r_i4, precision);
    mpf_set_d(r_i4, 0.5);
    mpf_sqrt(r_i4, r_i4);

    for (;;) {
        mpf_set(r_i2, pi->f);
        mpf_add(pi->f, pi->f, r_i4);
        mpf_div_ui(pi->f, pi->f, 2);
        mpf_mul(r_i4, r_i2, r_i4);
        mpf_sub(r_i2, pi->f, r_i2);
        mpf_mul(r_i2, r_i2, r_i2);
        mpf_mul(r_i2, r_i2, ix);
        mpf_sub(r_i3, r_i3, r_i2);
        mpf_sqrt(r_i4, r_i4);
        mpf_mul_ui(ix, ix, 2);
        /* Check for convergence */
        if (!(mpf_cmp_si(r_i2, 0) && 
              mpf_get_prec(r_i2) >= (unsigned)precision)) {
            mpf_mul(pi->f, pi->f, r_i4);
            mpf_div(pi->f, pi->f, r_i3);


    return (PyObject*)pi;

EDIT: I had some problem with cut and paste and identation, anyway you can find the source here.

If by fastest you mean fastest to type in the code, here's the golfscript solution:


Use the Machin-like formula

176 * arctan (1/57) + 28 * arctan (1/239) - 48 * arctan (1/682) + 96 * arctan(1/12943) 

[; \left( 176 \arctan \frac{1}{57} + 28 \arctan \frac{1}{239} - 48 \arctan \frac{1}{682} + 96 \arctan \frac{1}{12943}\right) ;], for you TeX the World people.

Implemented in Scheme, for instance:

(+ (- (+ (* 176 (atan (/ 1 57))) (* 28 (atan (/ 1 239)))) (* 48 (atan (/ 1 682)))) (* 96 (atan (/ 1 12943))))

If you are willing to use an approximation, 355 / 113 is good for 6 decimal digits, and has the added advantage of being usable with integer expressions. That's not as important these days, as "floating point math co-processor" ceased to have any meaning, but it was quite important once.

With doubles:

4.0 * (4.0 * Math.Atan(0.2) - Math.Atan(1.0 / 239.0))

This will be accurate up to 14 decimal places, enough to fill a double (the inaccuracy is probably because the rest of the decimals in the arc tangents are truncated).

Also Seth, it's 3.141592653589793238463, not 64.

Calculate PI at compile-time with D.

( Copied from )

/** Calculate pi at compile time
 * Compile with dmd -c pi.d
module calcpi;

import meta.math;
import meta.conv;

/** real evaluateSeries!(real x, real metafunction!(real y, int n) term)
 * Evaluate a power series at compile time.
 * Given a metafunction of the form
 *  real term!(real y, int n),
 * which gives the nth term of a convergent series at the point y
 * (where the first term is n==1), and a real number x,
 * this metafunction calculates the infinite sum at the point x
 * by adding terms until the sum doesn't change any more.
template evaluateSeries(real x, alias term, int n=1, real sumsofar=0.0)
  static if (n>1 && sumsofar == sumsofar + term!(x, n+1)) {
     const real evaluateSeries = sumsofar;
  } else {
     const real evaluateSeries = evaluateSeries!(x, term, n+1, sumsofar + term!(x, n));

/*** Calculate atan(x) at compile time.
 * Uses the Maclaurin formula
 *  atan(z) = z - z^3/3 + Z^5/5 - Z^7/7 + ...
template atan(real z)
    const real atan = evaluateSeries!(z, atanTerm);

template atanTerm(real x, int n)
    const real atanTerm =  (n & 1 ? 1 : -1) * pow!(x, 2*n-1)/(2*n-1);

/// Machin's formula for pi
/// pi/4 = 4 atan(1/5) - atan(1/239).
pragma(msg, "PI = " ~ fcvt!(4.0 * (4*atan!(1/5.0) - atan!(1/239.0))) );

Pi is exactly 3! [Prof. Frink (Simpsons)]

Joke, but here's one in C# (.NET-Framework required).

using System;
using System.Text;

class Program {
    static void Main(string[] args) {
        int Digits = 100;

        BigNumber x = new BigNumber(Digits);
        BigNumber y = new BigNumber(Digits);
        x.ArcTan(16, 5);
        y.ArcTan(4, 239);
        string pi = x.ToString();

public class BigNumber {
    private UInt32[] number;
    private int size;
    private int maxDigits;

    public BigNumber(int maxDigits) {
        this.maxDigits = maxDigits;
        this.size = (int)Math.Ceiling((float)maxDigits * 0.104) + 2;
        number = new UInt32[size];
    public BigNumber(int maxDigits, UInt32 intPart)
        : this(maxDigits) {
        number[0] = intPart;
        for (int i = 1; i < size; i++) {
            number[i] = 0;
    private void VerifySameSize(BigNumber value) {
        if (Object.ReferenceEquals(this, value))
            throw new Exception("BigNumbers cannot operate on themselves");
        if (value.size != this.size)
            throw new Exception("BigNumbers must have the same size");

    public void Add(BigNumber value) {

        int index = size - 1;
        while (index >= 0 && value.number[index] == 0)

        UInt32 carry = 0;
        while (index >= 0) {
            UInt64 result = (UInt64)number[index] +
                            value.number[index] + carry;
            number[index] = (UInt32)result;
            if (result >= 0x100000000U)
                carry = 1;
                carry = 0;
    public void Subtract(BigNumber value) {

        int index = size - 1;
        while (index >= 0 && value.number[index] == 0)

        UInt32 borrow = 0;
        while (index >= 0) {
            UInt64 result = 0x100000000U + (UInt64)number[index] -
                            value.number[index] - borrow;
            number[index] = (UInt32)result;
            if (result >= 0x100000000U)
                borrow = 0;
                borrow = 1;
    public void Multiply(UInt32 value) {
        int index = size - 1;
        while (index >= 0 && number[index] == 0)

        UInt32 carry = 0;
        while (index >= 0) {
            UInt64 result = (UInt64)number[index] * value + carry;
            number[index] = (UInt32)result;
            carry = (UInt32)(result >> 32);
    public void Divide(UInt32 value) {
        int index = 0;
        while (index < size && number[index] == 0)

        UInt32 carry = 0;
        while (index < size) {
            UInt64 result = number[index] + ((UInt64)carry << 32);
            number[index] = (UInt32)(result / (UInt64)value);
            carry = (UInt32)(result % (UInt64)value);
    public void Assign(BigNumber value) {
        for (int i = 0; i < size; i++) {
            number[i] = value.number[i];

    public override string ToString() {
        BigNumber temp = new BigNumber(maxDigits);

        StringBuilder sb = new StringBuilder();

        int digitCount = 0;
        while (digitCount < maxDigits) {
            temp.number[0] = 0;
            sb.AppendFormat("{0:D5}", temp.number[0]);
            digitCount += 5;

        return sb.ToString();
    public bool IsZero() {
        foreach (UInt32 item in number) {
            if (item != 0)
                return false;
        return true;

    public void ArcTan(UInt32 multiplicand, UInt32 reciprocal) {
        BigNumber X = new BigNumber(maxDigits, multiplicand);
        reciprocal *= reciprocal;


        BigNumber term = new BigNumber(maxDigits);
        UInt32 divisor = 1;
        bool subtractTerm = true;
        while (true) {
            divisor += 2;
            if (term.IsZero())

            if (subtractTerm)
            subtractTerm = !subtractTerm;

This version (in Delphi) is nothing special, but it is at least faster than the version Nick Hodge posted on his blog :). On my machine, it takes about 16 seconds to do a billion iterations, giving a value of 3.1415926525879 (the accurate part is in bold).

program calcpi;



  start, finish: TDateTime;

function CalculatePi(iterations: integer): double;
  numerator, denominator, i: integer;
  sum: double;
  PI may be approximated with this formula:
  4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 .......)
  numerator := 1;
  denominator := 1;
  sum := 0;
  for i := 1 to iterations do begin
    sum := sum + (numerator/denominator);
    denominator := denominator + 2;
    numerator := -numerator;
  Result := 4 * sum;

    start := Now;
    finish := Now;
    WriteLn('Seconds:' + FormatDateTime('hh:mm:ss.zz',finish-start));
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);

Back in the old days, with small word sizes and slow or non-existent floating-point operations, we used to do stuff like this:

/* Return approximation of n * PI; n is integer */
#define pi_times(n) (((n) * 22) / 7)

For applications that don't require a lot of precision (video games, for example), this is very fast and is accurate enough.

If you want to compute an approximation of the value of π (for some reason), you should try a binary extraction algorithm. Bellard's improvement of BBP gives does PI in O(N^2).

If you want to obtain an approximation of the value of π to do calculations, then:

PI = 3.141592654

Granted, that's only an approximation, and not entirely accurate. It's off by a little more than 0.00000000004102. (four ten-trillionths, about 4/10,000,000,000).

If you want to do math with π, then get yourself a pencil and paper or a computer algebra package, and use π's exact value, π.

If you really want a formula, this one is fun:

π = -i ln(-1)

Brent's method posted above by Chris is very good; Brent generally is a giant in the field of arbitrary-precision arithmetic.

If all you want is the Nth digit, the famous BBP formula is useful in hex

Calculating π from circle area :-)

<input id="range" type="range" min="10" max="960" value="10" step="50" oninput="calcPi()">
<div id="cont"></div>

function generateCircle(width) {
    var c = width/2;
    var delta = 1.0;
    var str = "";
    var xCount = 0;
    for (var x=0; x <= width; x++) {
        for (var y = 0; y <= width; y++) {
            var d = Math.sqrt((x-c)*(x-c) + (y-c)*(y-c));
            if (d > (width-1)/2) {
                str += '.';
            else {
                str += 'o';
            str += "&nbsp;" 
        str += "\n";
    var pi = (xCount * 4) / (width * width);
    return [str, pi];

function calcPi() {
    var e = document.getElementById("cont");
    var width = document.getElementById("range").value;
    e.innerHTML = "<h4>Generating circle...</h4>";
    setTimeout(function() {
        var circ = generateCircle(width);
        e.innerHTML  = "<pre>" + "π = " + circ[1].toFixed(2) + "\n" + circ[0] +"</pre>";
    }, 200);

Better Approach

To get the output of standard constants like pi or the standard concepts, we should first go with the builtins methods available the language that you are using. It will return value in the fastest way and best way also. I am using python to get the fastest way to get the value pi

  • pi variable of the math library. Math library store the variable pi as constant.

import math
print math.pi

Run the script with time utility of linux /usr/bin/time -v python


Command being timed: "python"
User time (seconds): 0.01
System time (seconds): 0.01
Percent of CPU this job got: 91%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.03
  • Use arc cos method of math

import math
print math.acos(-1)

Run the script with time utility of linux /usr/bin/time -v python


Command being timed: "python"
User time (seconds): 0.02
System time (seconds): 0.01
Percent of CPU this job got: 94%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.03

from decimal import Decimal, getcontext
print sum(1/Decimal(16)**k * 
          (Decimal(4)/(8*k+1) - 
           Decimal(2)/(8*k+4) - 
           Decimal(1)/(8*k+5) -
           Decimal(1)/(8*k+6)) for k in range(100))

Run the script with time utility of linux /usr/bin/time -v python


Command being timed: "python"
User time (seconds): 0.05
System time (seconds): 0.01
Percent of CPU this job got: 98%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.06

So best way is to use builtins method provided by the language cause they are the fastest and best to get the output. In python use math.pi

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow