Why did the language designers of Java preferred chaining over open addressing for most hash based structures except for some like ThreadLocal? [closed]

StackOverflow https://stackoverflow.com/questions/12019434

  •  27-06-2021
  •  | 
  •  

Question

I know the difference between Open Addressing and Chaining for resolving hash collisions . Most of the basic hash based data structures like HashSet,HashMap in Java primarily use chaining technique. I read that ThreadLocal actually uses a probing scheme . So I want to understand why is open addressing not so much used in Java ? I mean it would be difficult to delete records using that scheme , in the sense that you have to mark those cells with some special handling . However it seems like memory requirement will be low for open addressing scheme.

Edit : I just want to understand the possible major reason/reasons for this design decision . I do not want finer details . Also I would like to know why ThreadLocal uses the lesser common technique of open addressing . I guess the two answers can be related together . So I prefer to ask in the same question itself.

Was it helpful?

Solution

I am currently discussing memory-compact reimplementations of HashMap and HashSet with, among others, Doug Lea. This particular question hasn't come up, but here's my first thoughts on the question...

  • Chained hash tables degrade reasonably gracefully. Whether it's higher load factors or lots of hash collisions, chaining doesn't degrade nearly as quickly as open addressing can.
  • As you've said, remove is...not a pleasant operation on open-addressed tables. As a general rule, remove is the least common operation on hash tables, but there are applications for which it's more common, and bad performance would be noticed.
  • I also suspect -- though I don't have much data -- that implementing a "linked" open-addressed hash table would be noticeably more difficult. LinkedHashMap is written as a subclass of HashMap, and borrows most of the implementation details; it's somewhat easier to implement the linked list of entries when the entries are discrete objects -- and at that point, you're already most of the way to a chained implementation.
  • Nothing in the spec ties them to this implementation -- they're always free to mess around with it later.
  • The JDK collections libraries...don't make memory consumption an especially high priority. Memory is cheap. (You may or may not agree with this, but it's definitely a noticeable trend.)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top