Use 32 per-type container lists instead of one #29
Conversation
* Reduce lock contention
* Make lookup of the container for a type faster

Ideally there would be something like 1024 lists, but such a large copy-paste wouldn't look good. (A rough sketch of the idea is below.)
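For readers skimming the thread, here is a minimal, self-contained sketch of the idea, not the PR's actual diff: the single lock-protected map from `TypeId` to per-type container is split into 32 independently locked shards, and a type's `TypeId` picks its shard, so threads interning values of different types usually take different locks. All names here (`ShardedContainers`, `with_container`, the `Vec<T>` standing in for the real per-type container) are invented for illustration.

```rust
use std::any::{Any, TypeId};
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

const NUM_SHARDS: usize = 32;

/// 32 independently locked maps instead of one global map.
/// Each map goes from a TypeId to that type's (type-erased) container.
struct ShardedContainers {
    shards: Vec<Mutex<HashMap<TypeId, Box<dyn Any + Send>>>>,
}

impl ShardedContainers {
    fn new() -> Self {
        ShardedContainers {
            shards: (0..NUM_SHARDS).map(|_| Mutex::new(HashMap::new())).collect(),
        }
    }

    /// Pick a shard from the type's id, so different types tend to hit
    /// different locks and only ~1/32 of the types share any one lock.
    fn shard_index(type_id: TypeId) -> usize {
        let mut hasher = DefaultHasher::new();
        type_id.hash(&mut hasher);
        hasher.finish() as usize % NUM_SHARDS
    }

    /// Run `f` with the container for `T`, creating it on first use.
    /// Only the chosen shard stays locked while `f` runs.
    fn with_container<T: Send + 'static, R>(&self, f: impl FnOnce(&mut Vec<T>) -> R) -> R {
        let type_id = TypeId::of::<T>();
        let mut shard = self.shards[Self::shard_index(type_id)].lock().unwrap();
        let entry = shard
            .entry(type_id)
            .or_insert_with(|| Box::new(Vec::<T>::new()));
        // Every entry stored under this TypeId is a Vec<T>, so the downcast succeeds.
        f(entry.downcast_mut::<Vec<T>>().expect("container type matches"))
    }
}

fn main() {
    let containers = ShardedContainers::new();
    containers.with_container(|v: &mut Vec<u32>| v.push(7));
    containers.with_container(|v: &mut Vec<String>| v.push("interned?".to_owned()));
    let len = containers.with_container(|v: &mut Vec<u32>| v.len());
    println!("u32 container length: {len}");
}
```

The PR itself apparently reaches the same effect with 32 copy-pasted static lists rather than a runtime array, which is why going to 1024 of them would be unwieldy.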
Same question as before: do you have any reason to believe that contention here is a performance problem, or that this speeds things up? I'm all for internment being highly optimized, but I'm doubtful that this contention is an issue. It seems this will only help if multiple types are interned at high rates.
A general reply is in that issue: #28 (comment). I didn't benchmark this or that PR specifically; I applied all the optimizations I could think of, and together they gave a measurable speedup. Some of the optimizations are probably useless.
Any chance you could describe (or better, produce) a reasonable benchmark that might trigger this? We've got benchmarks, but none that use threads to trigger lock contention. I'd like to ensure we're improving some performance and not hurting the unthreaded case too much before applying.
50 threads continuously interning objects of 100 different types in random order. Preferably objects that take longer to intern (e.g. preallocated, but not pre-hashed, long strings). A rough sketch of such a benchmark follows below.
I may do it one day, but since we've forked the library, this is no longer a big issue for us. Sorry.
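Since no such benchmark landed in the thread, here is one possible shape of it as a hedged sketch, assuming the crate in question is `internment` and its public `Intern::new` constructor. The newtype wrappers, the iteration count, the payload size, and the xorshift helper are all invented for illustration; the description above calls for roughly 100 types, while this sketch uses four to stay short.

```rust
use internment::Intern;
use std::thread;

#[derive(PartialEq, Eq, Hash)]
struct A(String);
#[derive(PartialEq, Eq, Hash)]
struct B(String);
#[derive(PartialEq, Eq, Hash)]
struct C(String);
#[derive(PartialEq, Eq, Hash)]
struct D(String);

// Intern a long, not-prehashed string as one of the types, chosen by `which`.
fn intern_one(which: u64, payload: &str) {
    match which % 4 {
        0 => { Intern::new(A(payload.to_owned())); }
        1 => { Intern::new(B(payload.to_owned())); }
        2 => { Intern::new(C(payload.to_owned())); }
        _ => { Intern::new(D(payload.to_owned())); }
    }
}

// Tiny xorshift so the type order looks random without an extra dependency.
fn xorshift(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

fn main() {
    let payload = "x".repeat(1024); // longer values make hashing dominate
    let threads: Vec<_> = (0..50)
        .map(|t| {
            let payload = payload.clone();
            thread::spawn(move || {
                let mut rng = (t as u64) * 2 + 1; // any nonzero seed works
                for _ in 0..100_000 {
                    intern_one(xorshift(&mut rng), &payload);
                }
            })
        })
        .collect();
    for t in threads {
        t.join().unwrap();
    }
}
```

With one global lock, all 50 threads serialize on every intern; with per-type (or sharded) containers they mostly proceed in parallel, which is the contention this PR targets.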
This PR creates very little overhead, much smaller than acquiring the lock or doing the hashing. The other PR should have no negative effects.
I've confirmed with benchmarks that this doesn't measurably hurt the single-threaded case, and that it gives a dramatic improvement when allocating Interns for several types on multiple threads.