Why implement a Hashtable with a Binary Search Tree?

Question 1

If the elements don't have a total order (i.e. "greater than" and "less than" are not defined for all pairs, or are not consistent between elements), you can't compare all pairs, so you can't use a BST directly. Nothing's stopping you from indexing the BST by the hash value, though - since this is an integral value, it has a total order (although you'd still need to resolve collisions, that is, have a way to handle elements with the same hash value).
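A minimal sketch of that idea, assuming Python's built-in `hash` and a plain unbalanced tree. The names `Node`, `HashKeyedBST` and `hash_fn` are made up for illustration. Nodes are ordered by hash value, and keys that share a hash are kept in a small list on the same node. A real implementation would want a self-balancing tree to actually get O(log n).

```python
class Node:
    """One BST node, keyed by a hash value; colliding keys share the node."""
    def __init__(self, h):
        self.h = h
        self.entries = []   # (key, value) pairs whose hash equals h
        self.left = None
        self.right = None


class HashKeyedBST:
    """Unbalanced BST ordered by hash(key), not by the keys themselves."""
    def __init__(self, hash_fn=hash):
        self.root = None
        self.hash_fn = hash_fn  # hypothetical parameter, handy for forcing collisions

    def put(self, key, value):
        h = self.hash_fn(key)
        if self.root is None:
            self.root = Node(h)
        node = self.root
        # Descend by comparing hash values only; create the node if missing.
        while node.h != h:
            side = "left" if h < node.h else "right"
            if getattr(node, side) is None:
                setattr(node, side, Node(h))
            node = getattr(node, side)
        # Collision handling: scan the entries that share this hash.
        for i, (k, _) in enumerate(node.entries):
            if k == key:
                node.entries[i] = (key, value)
                return
        node.entries.append((key, value))

    def get(self, key, default=None):
        h = self.hash_fn(key)
        node = self.root
        while node is not None and node.h != h:
            node = node.left if h < node.h else node.right
        if node is not None:
            for k, v in node.entries:
                if k == key:
                    return v
        return default


# Complex numbers can be hashed but not ordered, so they cannot key a BST directly.
table = HashKeyedBST()
table.put(1 + 2j, "a")
table.put(3 - 1j, "b")
```

Complex numbers make a convenient test case here: Python refuses to compare them with `<`, yet they can still be stored and found through their hash values.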

However, one of the biggest advantages of a BST over a hash table is the fact that the elements are in order - if we order it by hash value, the elements will have an arbitrary order instead, and this advantage would no longer be applicable.

As for why one might consider implementing a hash table using a BST instead of an array, it would:

Not have the disadvantage of needing to resize the array - with an array, you typically mod the hash value with the array size and resize the array if it gets full, reinserting all elements, but with a BST, you can just directly insert the unchanging hash value into the BST.

This might be relevant if we want any individual operation to never take more than a certain amount of time (which could very well happen if we need to resize the array), with the overall performance being secondary, but there might be better ways to solve this problem.
Have a reduced risk of hash collisions since you don't mod with the array size and thus the number of possible hashes could be significantly bigger. This would reduce the risk of getting the worst-case performance of a hash table (which is when a significant portion of the elements hash to the same value).

What the actual worst-case performance is would depend on how you're resolving collisions. This is typically done with linked lists, for O(n) worst-case performance. But we can also achieve O(log n) performance with balanced BSTs (as Java's HashMap has done since Java 8 once the number of entries in a bucket exceeds a threshold) - that is, have your hash table array where each element points to a BST holding the elements that landed in that bucket.
Possibly use less memory - with an array you'd inevitably have some empty indices, but with a BST, these simply won't need to exist. This is not a clear-cut advantage though, if it's an advantage at all.

If we assume we use the less common array-based BST implementation, this array will also have some empty indices and would also require the occasional resizing, but this is simply a memory copy as opposed to needing to reinsert all elements with updated hashes.

If we use the typical pointer-based BST implementation, the added cost for the pointers would seemingly outweigh the cost of having a few empty indices in an array (unless the array is particularly sparse, which tends to be a bad sign for a hash table anyway).
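To make the resize cost from the first point above concrete, here is a small sketch of an array-backed table with separate chaining that doubles once the load factor exceeds 1. The class name `ChainedTable`, the `reinserted` counter and the growth policy are assumptions chosen for illustration, not a description of any particular library.

```python
class ChainedTable:
    """Array-backed hash table with separate chaining that doubles when full."""
    def __init__(self, capacity=4):
        self.buckets = [[] for _ in range(capacity)]
        self.size = 0
        self.reinserted = 0  # total entries re-placed by resizes so far

    def _index(self, key, capacity):
        # The bucket index depends on the current capacity.
        return hash(key) % capacity

    def put(self, key, value):
        bucket = self.buckets[self._index(key, len(self.buckets))]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.size += 1
        if self.size > len(self.buckets):   # load factor above 1: grow
            self._resize(2 * len(self.buckets))

    def _resize(self, new_capacity):
        old = self.buckets
        self.buckets = [[] for _ in range(new_capacity)]
        for bucket in old:
            for key, value in bucket:
                # Every entry has to be placed again under the new capacity.
                self.buckets[self._index(key, new_capacity)].append((key, value))
                self.reinserted += 1

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key, len(self.buckets))]:
            if k == key:
                return v
        return default


t = ChainedTable()
for n in range(9):
    t.put(n, str(n))
```

The single `put` that triggers `_resize` pays for re-placing every stored entry, which is the latency spike described above. A tree keyed on the full hash value never has to do this.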

But since I have never personally heard of this being done, presumably the benefits are not worth increasing the cost of operations from expected O(1) to O(log n).

Typically the choice is indeed between using a BST directly (without hash values) and using a hash table (with an array).

Question 2

Pros:

Potentially use less space b/c we don't allocate a large array
Can iterate through the keys in order, sometimes useful

Cons:

You'd have O(log N) lookup time (assuming the tree is kept balanced), which is worse than the expected O(1) of a chained hash table.

Question 3

Since a hash table is expected to offer O(1) lookup on average, it arguably isn't a hash table if its lookup times are logarithmic. Granted, collisions are a concern with the array implementation (though rarely a serious one), so using a BST could offer benefits in that regard. Generally, though, it's not worth the tradeoff - I can't think of a situation where you'd want to give up expected O(1) lookup time when using a hash table.

Alternatively, there is the possibility of an underlying structure that guarantees logarithmic insertion and deletion via a BST variant, where each index in the array has a reference to the corresponding node in the BST. A structure like that could get sort of complex, but would give O(1) lookup and O(log n) insertion/deletion.

Question 4

I found this looking to see if anyone had done it. I guess maybe not.

I came up with an idea this morning of implementing a binary tree as an array consisting of rows stored by index. Row 1 has 1, row 2 has 2, row 3 has 4 (yes, powers of two). The advantage of this structure is a bit shift and addition or subtraction can be used to walk the tree instead of using extra memory to store bi- or uni-directional references.
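As a rough illustration of that row layout, the helpers below compute where each row starts and which row an index falls in. They assume 0-based indices with the root at index 0 and rows counted from 0 (so "row 1" above is row 0 here). The function names are invented for this sketch.

```python
def row_start(row):
    """Index of the first slot in a row (row 0 holds only the root)."""
    return (1 << row) - 1


def row_width(row):
    """Number of slots in a row: 1, 2, 4, 8, ..."""
    return 1 << row


def row_of(i):
    """Row that contains array index i."""
    return (i + 1).bit_length() - 1


def total_slots(depth):
    """Array length needed for a tree with `depth` full rows."""
    return (1 << depth) - 1
```

Because every row is exactly twice as wide as the one before it, all of these reduce to shifts and a subtraction, with no stored pointers.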

This would allow you to rapidly search for a hash value based on some sort of hashable input, to discover if the value exists in some other store, or to search for a hash collision (or partial collision). I can't think of many other uses for it, but for these it would be phenomenally fast. Very likely a lot of the rotation operations would happen entirely in CPU cache and be written out in nice linear blobs to main memory.

Its main utility would be with sorting input values of a random nature. If the blobs in the array were two parts, like a hash, and an identifier for another store, you could do the comparisons very fast and insert very fast to discover where an item bearing a hash value is kept in another location (like the UUID of a filesystem node or maybe even the filename, or other short identifiable string).
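A minimal sketch of such a store, assuming a fixed number of rows, 0-based indices and no rebalancing at all (the rotations mentioned above are deliberately left out). `ArrayBST` and its method names are hypothetical. The child arithmetic follows the walk formulas given further down in this answer, where "left" is 2i+2 and "right" is 2i+1.

```python
class ArrayBST:
    """Unbalanced BST stored in one level-order list of (hash, identifier) slots."""
    def __init__(self, depth):
        # A tree with `depth` full rows needs 2^depth - 1 slots.
        self.slots = [None] * ((1 << depth) - 1)

    def insert(self, h, ident):
        """Store ident under hash h; return False if the path runs off the array."""
        i = 0
        while i < len(self.slots):
            slot = self.slots[i]
            if slot is None or slot[0] == h:
                self.slots[i] = (h, ident)
                return True
            # Smaller hashes walk "left" (2i+2), larger ones walk "right" (2i+1).
            i = (i << 1) + 2 if h < slot[0] else (i << 1) + 1
        return False

    def find(self, h):
        """Return the identifier stored under hash h, or None if absent."""
        i = 0
        while i < len(self.slots) and self.slots[i] is not None:
            if self.slots[i][0] == h:
                return self.slots[i][1]
            i = (i << 1) + 2 if h < self.slots[i][0] else (i << 1) + 1
        return None


store = ArrayBST(depth=3)
store.insert(50, "node-a")
store.insert(20, "node-b")
store.insert(70, "node-c")
store.insert(10, "node-d")
```

Without balancing, a run of sorted hashes quickly falls off the bottom row, which is presumably why rotations are part of the real design.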

I'll leave it to others to dream of other ways to use it but I'm using it for a graph theoretic proof of work search table for identifying partial collisions for a variant of Cuckoo Cycle.

I am just now working on the walk formula, and here it is:

i = index of array element

Walk Up (go to parent):

(i>>1)-(i+1)%2

(Obviously you probably need to test if i is zero)

Walk Left (down and left):

(i<<1)+2

(this and the next would also need to test against 2^depth of the structure, so it doesn't walk off the edge and fall back to the root)

Walk Right (down and right):

(i<<1)+1

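As you can see, each walk is a short formula based on the index: a bit shift and an addition for going left or right, and a bit shift, an addition, a modulus and a subtraction for ascending. That is two instructions to move down and four to move up (in assembler, or as above in operator notation). Be careful with precedence when writing these without parentheses: in Go the shifts bind more tightly than + and -, but in C they bind more loosely, so the shift needs its own parentheses there.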
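The three walks can be sketched as small functions. This assumes 0-based indices with the root at 0, and takes `depth` to mean the number of rows, so the array holds 2^depth - 1 slots. The function names and the choice to return `None` at the bottom edge are illustrative only. The shifts are parenthesized explicitly because Python, like C, would otherwise evaluate `i << 1 + 2` as `i << 3`.

```python
def walk_up(i):
    """Parent of slot i; the root (i == 0) has no parent."""
    if i == 0:
        raise ValueError("the root has no parent")
    return (i >> 1) - (i + 1) % 2


def walk_left(i, depth):
    """Down and left: slot 2i + 2, or None when that is below the last row."""
    j = (i << 1) + 2
    return j if j < (1 << depth) - 1 else None


def walk_right(i, depth):
    """Down and right: slot 2i + 1, or None when that is below the last row."""
    j = (i << 1) + 1
    return j if j < (1 << depth) - 1 else None
```

Going down and then back up returns to the starting slot, for example `walk_up(walk_left(2, 3))` gives back 2. The ascent formula is equivalent to `(i - 1) >> 1` for any non-root index.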

edit: I can see from further commentary that slashing the insert time would definitely be a benefit. But I don't think that a conventional vector based binary tree would provide nearly as much benefit as a dense version. In a dense version, all the nodes sit in one contiguous array, so a search always moves forward through memory as it descends. That should help reduce cache misses and thus the latency of searches, on top of the general latency hit that memory takes when accessed randomly compared to streaming through blocks sequentially.

https://github.com/calibrae-project/bast/blob/master/pkg/bast/bast.go

This is my current state of a WiP to implement what I am calling a Bifurcation Array Search Tree. For the purpose of a fast insert/delete and not horribly slow search through a sorted collection of hashes, I think that this would be of quite large benefit for cases where there is a lot of data coming and going through the structure, or more to the point, beneficial for more realtime applications.