Hacker News
A C++ implementation of a fast hash map and hash set using hopscotch hashing
nly
|next
[-]
The cacheline performance is pretty hard to beat (SIMD optimised linear scan before hopping), which is where all the wins come in the real world.
But basically any of the faster hash maps from absl, boost or folly are going to wreck the standard library in terms of perf
lefty2
|root
|parent
|next
[-]
jll29
|next
|previous
[-]
einpoklum
|next
|previous
[-]
----
An point often missed by people who need to/want to do hashing:
In practice, with your real workloads, you can often make do with actually "giving up" on the hasing of some fraction of the elements, whose buckets, neighborhoods and such are already occupied - and instead put those aside for separate out-of-band handling. hash table implementations such as this one (or std::unordered_map and all the rest), absolutely _must_ succeed in inserting your values - and so must always allow for more collisions, resizing etc.
stevefan1999
|next
|previous
[-]
It was 3 years later when I was in college I learned advanced data structures and came into Cuckoo Hashing, then Robinhood hash, and the combination of both Cuckoo and Robinhood hash => Hopscotch hashing
mgaunard
|next
|previous
[-]
Looks like the benchmarks were last updated in 2019.
compiler-guy
|root
|parent
[-]
Has some older benchmarks, including those two.
jeffbee
|root
|parent
|next
[-]
However, it lacks the newer Boost stuff which is very fast.
The Hopscotch map was interesting at the time but due to unfortunate timing was immediately outshone by absl::unordered_flat_map A.K.A. "Swiss tables", and there's been even more water under the bridge since then.
RossBencina
|root
|parent
|next
[-]
jeffbee
|root
|parent
[-]
quadrature
|root
|parent
|previous
[-]
reinitctxoffset
|root
|parent
|next
[-]
absl::flat_hash_map (or folly::F14) are great defaults if you can eat the invalidation semantics.
But if it's really hot you measure by workload and have infrastructure to flag the right ones in.
This seems promising. I'll start benching it alongside the other likely lads.
szmarczak
|root
|parent
|previous
[-]
infamouscow
|root
|parent
[-]
szmarczak
|root
|parent
[-]
I never claimed so. Please stop stating I said something when I didn't.
> as a general purpose hash table
That's what I claimed. The question IS about hash tables. If you want a hash table of any content, it's impossible to get faster. Unless you check all possible keys at once - only this will get you faster.