[NCLUG] Re: [Python-Dev] Judy for replacing internal dictionaries?
dougbaskins at frii.com
Mon Aug 5 14:41:52 MDT 2002
> [Tim Peters]
> I can only say that we've measured 25% speedups on many platforms after
> switching to Vladimir's small-block allocator, which he designed after
> months of studying memory traces of Python's memory use (it's not a "general
> purpose" allocator, it's highly tuned to Python's specific needs).
> This includes platforms with a modern glibc, where it seems all too easy to
> provoke the system *free()* into seemingly quadratic-time behavior when
> freeing gazillions of small blocks. Speedups in those cases are basically
> unmeasurable, because few people are willing to wait an hour for their
> program to shut down <0.1 wink>. Note that Python programs have an
> extremely high rate of dynamic memory throughput, and I doubt Judy provokes
> that behavior to nearly the extent that Python programs do.
I would encourage you to try dlmalloc (VERSION 2.7.0 by Doug Lea) first.
BTW, with a poor malloc, it may take 10X longer to free a Judy array
then it took to build it.
Judy can be very tough on malloc() with very large arrays and specific
types of data (random - flat spectrum). After about 4 months of
measurements with about every malloc I could get my hands on, I
concluded that the modern machine solved the small block problem.
However, the free() is another story. I found all mallocs (that
coalesce memory) have very serious problems in this area. I found the
dlmalloc was the best of the bunch, but had a flaw. I investigated why.
The problem occurs when a huge flurry of small blocks (3..65 words) are
freed to produce large (coalesced) internal blocks of several Kbytes.
Dlmalloc may put thousands of these "several Kbytes" blocks of slightly
different sizes on the SAME free list. I have seen profiles of a 3
instruction loop dominate the whole profile (100X). I believe the list
in in size order and the 3 instruction loop is just looking for the
right place to insert the newly coalesced block. Solutions may include
making the list a skiplist, or make many more lists in the Kbyte sizes.
Haven't tried anything yet, but I would think it is very solvable
Doug Baskins <doug at sourcejudy.com>
More information about the NCLUG