-
Notifications
You must be signed in to change notification settings - Fork 37
Description
This ticket is based on GaloisInc/saw-script#674.
We recently switched one of our projects from from Data.HashTable.ST.Cuckoo to Data.HashTable.ST.Basic, to avoid a bug in Cuckoo (related to #55, I believe). As a result, we noticed a surprisingly large performance hit. Here are some numbers from profiling a part of our test suite that makes heavy use of hash table operations, where I compared runs with Cuckoo and Basic hash table libraries:
- Each version did 4.8 million hash table insertions:
- cuckoo: 4.3% of 285.39s runtime = 12.3s
- basic: 32.1% of 1011.48 runtime = 325s
- 26x slowdown
- Each version did 8.5 million hash table lookups:
- cuckoo: 1.4% of 285.39s runtime = 4.0s
- basic: 40.4% of 1011.48s runtime = 408s
- 102x slowdown
This was especially surprising considering that the hashtables documentation claims Basic should have the fastest lookups. Profiling showed that resizing of the Basic hash table happened 42 times, taking up about 10% of total runtime, so resizing only accounts for a small portion of the slowdown.
To see whether profiling itself had anything to do with the numbers, I recorded a trace of the keys for all insert and lookup operations we were doing, and then ran a testbench that only performed the hash table operations. Timing (without profiling) shows that after parsing, Cuckoo takes an additional ten seconds or so, while Basic takes approximately an additional 8 minutes. So the slowdown ratio indicated by profiling is not far off.
These tests were done on MacOS with ghc-8.8.3; I haven't done thorough testing with other compiler versions.