Background
One of the focus points for the 2015.03 RDKit release is improving performance. To this end we've made changes that mitigate or remove some of the performance bottlenecks. These include, among others, modifications to the way SMILES are generated, rearranging the way the molecular GetProp/SetProp interface is used internally, and making the RDKit molecule smaller so that less memory is required. There are a couple of other changes coming; I think there should be a nice increase in the speed of common operations when the new version is released.
Getting something for nothing
Brian Kelley pointed out that using tcmalloc instead of the system-provided malloc implementation can lead to big speedups. It's super-easy to test (just add
LD_PRELOAD=/usr/local/lib/libtcmalloc.so
to the command line), so I gave it a try with the RDKit. Wow, did it make a difference!
Many of the tests in the RDKit's basic Python performance suite run too quickly to really be able to say much about performance, so I created a second performance suite that runs larger tests where it makes sense. This isn't yet complete - I need to add some reasonably sized tests of the conformation generation and force fields - but it's a decent start.
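When trying the LD_PRELOAD approach, it can be useful to confirm that the allocator really was loaded into the process being measured. The following is a minimal sketch of one way to check this from inside a Python process, assuming a Linux system where /proc/self/maps lists the files mapped into the current process. The function names are made up for illustration and are not part of the RDKit.

```python
from pathlib import Path


def mapped_libraries(substring, maps_text=None):
    """Return sorted paths of mapped files whose path contains `substring`.

    If `maps_text` is None, the text is read from /proc/self/maps
    (Linux only), which lists every file mapped into this process.
    """
    if maps_text is None:
        maps_text = Path("/proc/self/maps").read_text()
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [path]
        fields = line.split(None, 5)
        if len(fields) == 6 and substring in fields[5]:
            found.add(fields[5].strip())
    return sorted(found)


def allocator_is_loaded(substring="tcmalloc", maps_text=None):
    """True if a mapped library path contains `substring`."""
    return bool(mapped_libraries(substring, maps_text))
```

Calling `allocator_is_loaded()` at the top of a benchmark script started with the LD_PRELOAD setting above should return True if the preload worked, and False if the library path was wrong or the variable was not set.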
Here's a performance comparison for the current state of the trunk. The tests were run on my Linux box (a three-year-old Dell Studio XPS) running Ubuntu 14.04.
test | default | with tcmalloc | fraction (tcmalloc / default) |
---|---|---|---|
50K mols from SMILES | 21.7 | 12.7 | 0.59 |
generate SMILES | 12.7 | 6.5 | 0.51 |
10x1K mols from SDF | 8.2 | 5.6 | 0.68 |
823 queries from SMILES | 0.1 | 0.1 | 1.00 |
HasSubstructMatch | 102.0 | 80.9 | 0.79 |
GetSubstructMatches | 115.3 | 91.9 | 0.80 |
428 queries from SMARTS | 0.0 | 0.0 | 0.0 |
HasSubstructMatch | 287.0 | 239.6 | 0.83 |
GetSubstructMatches | 288.2 | 240.8 | 0.84 |
generate Mol blocks | 37.8 | 24.5 | 0.65 |
BRICS decomposition | 79.4 | 53.6 | 0.68 |
generate 2D coords | 27.1 | 23.9 | 0.88 |
generate RDKit fingerprints | 148.4 | 80.8 | 0.54 |
generate Morgan fingerprints | 7.5 | 3.8 | 0.51 |
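The last column of the table appears to be the time with tcmalloc divided by the default time, so smaller values mean a bigger gain. A short sketch that reproduces the column for a few rows copied from the table (the helper name is hypothetical):

```python
def fraction(default_time, tcmalloc_time):
    """Time with tcmalloc divided by the default time, rounded to 2 places.

    Returns None when the default time is zero, since the ratio is then
    undefined (as for rows whose times round to 0.0).
    """
    if default_time == 0:
        return None
    return round(tcmalloc_time / default_time, 2)


# A few rows copied from the table: (default, with tcmalloc)
rows = {
    "50K mols from SMILES": (21.7, 12.7),
    "generate 2D coords": (27.1, 23.9),
    "generate RDKit fingerprints": (148.4, 80.8),
}

for name, (default_time, tcmalloc_time) in rows.items():
    print(f"{name}: {fraction(default_time, tcmalloc_time)}")
```

Read this way, the biggest gains are in the tests that create and destroy many small objects (parsing and generating SMILES, fingerprinting), while 2D coordinate generation benefits least.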
It is, unfortunately, not possible to make tcmalloc the default at RDKit build time: this would require that other programs using the RDKit shared libraries (Python, PostgreSQL, etc.) also be recompiled to use tcmalloc. It's probably also not safe to use the LD_PRELOAD trick in your .bashrc, but setting it before starting a long-running process could be a real win.
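One way to keep the preload scoped to a single long-running job, rather than to every program started from the shell, is to set the variable only in the environment of that child process. A minimal sketch, assuming the library path used earlier in this post; the helper names are hypothetical:

```python
import os
import subprocess


def preload_env(library, base_env=None):
    """Return a copy of the environment with `library` added to LD_PRELOAD.

    The calling process's own environment is left unchanged.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Entries in LD_PRELOAD may be separated by colons (or spaces).
    env["LD_PRELOAD"] = f"{library}:{existing}" if existing else library
    return env


def run_with_preload(command, library):
    """Run `command` with `library` preloaded for that child process only."""
    return subprocess.run(command, env=preload_env(library), check=False)
```

For example, `run_with_preload(["python", "long_job.py"], "/usr/local/lib/libtcmalloc.so")` would start a (hypothetical) `long_job.py` with tcmalloc preloaded, while every other program keeps using the system malloc.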
1 comment:
A bit late, but for posterity: under OS X, Facebook's jemalloc is a better replacement than tcmalloc. Homebrew has this, so
> brew install jemalloc
> DYLD_INSERT_LIBRARIES=/usr/local/lib/libjemalloc.dylib
and you are off to the races.