Comparing doubles using ULPs (Units in the last place)

Question 1

Well, since my original "insight" wasn't too helpful, here's another answer, that simply takes your existing functions and reorganizes them to minimize repetitive object constructor calls:

var ULPS2 = function(){
  var buf2 = new ArrayBuffer(8);
  var dataView2A = new DataView(buf2);
  var dataView2B = new DataView(buf2);
  var aInt=new Int64(0, 0), bInt=new Int64(0, 0);
  var sub;

  this.IsAlmostEqual=function(A,B){
    if (A==B) return true;
    dataView2A.setFloat64(0, A);
    aInt.lo = dataView2A.getInt32(4) | 0;
    aInt.hi = dataView2A.getInt32(0) | 0;
    dataView2B.setFloat64(0, B);
    bInt.lo = dataView2B.getInt32(4) | 0;
    bInt.hi = dataView2B.getInt32(0) | 0;

    if(aInt.hi < 0) aInt = subtract(Int64_MinValue, aInt);

    if(bInt.hi < 0) bInt = subtract(Int64_MinValue, bInt);
    sub = subtract(aInt, bInt);
    if (sub.hi < 0) sub = negate(sub);
    if (lessthan(sub, maxUlps)) return true;
    return false;
  }

}
var Ulps2=new ULPS2();

Based on a test in Chrome and Firefox at http://jsbin.com/IWoyIDO/2/ , it appears to yield a 30%-50% improvement. Not phenomenal, but at least better than the 5-10% improvement you were originally mentioning.

Question 2

There is one way that would be faster for sure (currently as fast as 1.5 times the speed of native code), which would be to use asm.js, a highly optimizable subset of Javascript.

This is the unreal gaming engine running on it, to have an idea of the type of performance (takes a bit to load but runs very smoothly, use firefox or chrome).

The way it works is that you take your code and write in a statically typed language: C or C++. Then compile it into bytecode of the LLVM C++ virtual machine. Then use the emscripten compiler to generate asm.js Javascript code.

If the browser does not detect and optimize asm.js, it runs it as Javascript. If the browser (like latest versions of Chrome/Firefox) detects asm.js, you will have a near native performance.

Check the blog post of John Resig (creator of jQuery) about it. You can't get faster than that in a browser these days, and it keeps getting faster (see Mozilla team blog post here).

Question 3

Here is the basic gist of how to improve the performance using new browser functionality of typed arrays. Note that because no native 64-bit int view is provided, you will have to rework the logic on your own based on 2 32-bit ints, and ensure endianness does not affect the calculation on different systems. I am just trying to address the javascript speed portion of the problem, not be an expert on float encoding. Hopefully this is clear:

var IsAlmostEqual_Ulps = function(A, B)
{
  var smallBufferA = new ArrayBuffer(8); //8 bytes = 64bits
  var viewAsDoubleA = new Float64Array(smallBufferA); //byteOffset=0, length=8 optional
  var viewAsIntA = new Int32Array(smallBufferA); //byteOffset=0, length=8 optional
  viewAsDoubleA.set([A]);
  console.log(viewAsIntA);
  var smallBufferB = new ArrayBuffer(8); //8 bytes = 64bits
  var viewAsDoubleB = new Float64Array(smallBufferB); //byteOffset=0, length=8 optional
  var viewAsIntB = new Int32Array(smallBufferB); //byteOffset=0, length=8 optional
  viewAsDoubleB.set([B]);
  console.log(viewAsIntB);
  //The logic below is just illustrative and may not be right. Please test & adjust
  var A_lo=viewAsIntA[1];
  var B_lo=viewAsIntB[1];
  var A_hi=viewAsIntA[0];
  var B_hi=viewAsIntB[0];
  if(A_hi<0){A_hi=Int32_MinValue-A_hi;}
  if(A_hi<0){A_hi=Int32_MinValue-A_hi;}
  //You'll have to rewrite these slightly, I think you know how
  var sub = subtract(viewAsIntA, viewAsIntB); 
  if (sub.hi < 0) sub = negate(sub);
  if (lessthan(sub, maxUlps)) return true;
}

For example, if you called IsAlmostEqual_Ulps(3.14159,3.14159001), you would get logged in your console:

[-266631570, 1074340345]
[-244113572, 1074340345]

That should get you the biggest performance gains. If you want to make it even faster, you should reuse existing buffers (from the global scope or a closure) between function calls to not incur the constructor overhead and the garbage collection.