RDKit: August 2016

Updated 6 August 2016 to fix an incomplete sentence.

This one is a request for advice/expertise on performance tuning/compiler flag tweaking on Windows. The short story is that when the RDKit is built using Visual Studio on Windows it ends up being substantially slower than when it's built with g++ and run using the Windows Subsystem for Linux. This doesn't seem like it should be true, but I'm not an expert with either Visual Studio or Windows, so I'm asking for help.

Some more details:

When I've used the RDKit on Windows machines it has always seemed slower than it should. I've never really quantified that and so I've always just kind of shrugged and moved on. Now I've measured a real difference and I'd like to try and do something about it.

Some experiments that I did with Docker on Windows convinced me that the effect was real, but with the advent of Bash on Windows 10 (https://msdn.microsoft.com/en-us/commandline/wsl/install_guide) - an awesome thing, by the way - I have some real numbers.

The RDKit includes some code that I've used over the years to track the performance of some basic tasks. This script - https://github.com/rdkit/rdkit/blob/master/Regress/Scripts/timings.py - looks at a broad subset of RDKit functionality.

The tests are:

1. Construct 1000 molecules from SDF
2. Construct 1000 molecules from SMILES
3. Construct 823 fragment molecules from SMARTS (SMILES really)
4. 1000 x 100 HasSubstructMatch (100 from t3)
5. 1000 x 100 GetSubstructMatches (100 from t3)
6. Construct 428 queries from RLewis_smarts.txt
7. 1000 x 428 HasSubstructMatch
8. 1000 x 428 GetSubstructMatches
9. Generate canonical SMILES for 1000 molecules
10. Generate mol blocks for 1000 molecules
11. RECAP decomposition of the 1000 molecules
12. Generate 2D coordinates for the 1000 molecules
13. Generate 3D coordinates for the 1000 molecules
14. Optimize those conformations using UFF
15. Generate unique subgraphs of length 6 for the 1000 molecules
16. Generate RDK fingerprints for the 1000 molecules
17. Optimize the conformations above (test 13) using MMFF
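The linked script is the real thing; the general pattern it follows (run a task, record the elapsed time, print one row of numbers) can be sketched in a few lines. This is only an illustration, not the code from timings.py: the function names `time_tasks` and `format_row` and the two stand-in workloads are made up for the example, and a real run would call into the RDKit instead.

```python
import time


def time_tasks(tasks):
    """Run each (label, callable) pair once and return (label, seconds) pairs."""
    results = []
    for label, func in tasks:
        start = time.perf_counter()
        func()
        results.append((label, time.perf_counter() - start))
    return results


def format_row(results):
    """Format timings as one ' || '-separated row with one decimal place each."""
    return " || ".join("%.1f" % elapsed for _, elapsed in results)


# Stand-in workloads (hypothetical). In a real benchmark these would be RDKit
# tasks such as building molecules or running substructure matches.
def build_strings():
    return [str(i) for i in range(100000)]


def sum_squares():
    return sum(i * i for i in range(100000))


if __name__ == "__main__":
    tasks = [("build strings", build_strings), ("sum squares", sum_squares)]
    print(format_row(time_tasks(tasks)))
```

Running the same list of tasks under each build and lining up the rows gives a comparison in the same shape as the results below.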

Here are the results using the 2016.03 release conda builds (available from the rdkit channel in conda). The first line is the Windows build and the second is the Linux build. The tests were run directly after each other on the same laptop (a Dell XPS13 running Win10 Anniversary Edition):

0.8 || 0.4 || 0.1 || 1.2 || 1.3 || 0.0 || 4.2 || 4.2 || 0.2 || 0.3 || 7.4 || 0.3 || 7.5 || 18.5 || 2.2 || 1.4 || 41.7
0.6 || 0.3 || 0.1 || 0.9 || 1.0 || 0.0 || 3.1 || 3.2 || 0.1 || 0.2 || 6.3 || 0.3 || 6.2 || 15.2 || 2.1 || 1.0 || 29.8

That's a real difference.
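To put a number on that difference, here is a small sketch that computes the Windows/Linux ratio per test from the two rows above. The `min_time` cutoff is my own assumption: the timings are rounded to one decimal place, so ratios for very short tests (0.2 vs 0.1, say) are mostly rounding noise and are skipped.

```python
# Timings copied from the two rows above, in test order.
windows = [0.8, 0.4, 0.1, 1.2, 1.3, 0.0, 4.2, 4.2, 0.2, 0.3, 7.4, 0.3, 7.5,
           18.5, 2.2, 1.4, 41.7]
linux = [0.6, 0.3, 0.1, 0.9, 1.0, 0.0, 3.1, 3.2, 0.1, 0.2, 6.3, 0.3, 6.2,
         15.2, 2.1, 1.0, 29.8]


def slowdowns(slow, fast, min_time=1.0):
    """Map test number (1-based) to slow/fast for tests taking at least min_time.

    min_time is an arbitrary cutoff (an assumption, not from the original
    benchmark) that drops tests too short for the rounded values to be reliable.
    """
    ratios = {}
    for number, (s, f) in enumerate(zip(slow, fast), start=1):
        if s >= min_time and f > 0:
            ratios[number] = s / f
    return ratios


ratios = slowdowns(windows, linux)
for number, ratio in sorted(ratios.items()):
    print("test %2d: %.2fx" % (number, ratio))
print("total: %.2fx" % (sum(windows) / sum(linux)))
```

Summed over all 17 tests, the Windows build takes roughly 1.3 times as long as the Linux build on these numbers.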

The Windows build is done using build files generated by cmake. It's a release mode build with Visual Studio using the flags: "/MD /O2 /Ob2 /D NDEBUG" (those are the defaults that cmake creates).

It doesn't seem right to me that the code generated by Visual Studio and running under Windows should be so much slower than the code generated by g++ and running under the Windows Linux subsystem. I'm hoping, for the good of all of the users of the RDKit on Windows, to find a tweak for the Visual C++ command-line options that produces faster compiled code.

For what it's worth, here's a different set of benchmarks, run on a larger set of molecules. The script is here (https://github.com/rdkit/rdkit/blob/master/Regress/Scripts/new_timings.py):

1. Construct 50K molecules from SMILES
2. Generate canonical SMILES for those
3. Construct 10K molecules from SDF
4. Construct 823 fragment molecules from SMARTS (SMILES really)
5. 60K x 100 HasSubstructMatch
6. 60K x 100 GetSubstructMatches
7. Construct 428 queries from RLewis_smarts.txt
8. 60K x 428 HasSubstructMatch
9. 60K x 428 GetSubstructMatches
10. Generate 60K mol blocks
11. BRICS decomposition of the 60K molecules
12. Generate 2D coords for the 60K molecules
13. Generate RDKit fingerprints for the 60K molecules
14. Generate Morgan (radius=2) fingerprints for the 60K molecules

The timings show the same, at times dramatic, performance differences:

18.8 || 8.5 || 6.8 || 0.1 || 85.8 || 106.2 || 0.0 || 264.2 || 268.6 || 14.0 || 77.2 || 20.9 || 104.7 || 13.0
17.5 || 9.8 || 6.7 || 0.1 || 68.0 ||  74.2 || 0.0 || 204.6 || 208.2 ||  9.6 || 56.5 || 20.6 ||  89.0 ||  6.6
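To see which tasks are hit hardest, the two rows can be parsed as they appear above and ranked by Windows/Linux ratio. This is a rough sketch: the short labels are my own abbreviations of the task list for this benchmark, and tasks with a 0.0 timing are skipped because no ratio can be formed.

```python
WINDOWS_ROW = ("18.8 || 8.5 || 6.8 || 0.1 || 85.8 || 106.2 || 0.0 || 264.2 "
               "|| 268.6 || 14.0 || 77.2 || 20.9 || 104.7 || 13.0")
LINUX_ROW = ("17.5 || 9.8 || 6.7 || 0.1 || 68.0 ||  74.2 || 0.0 || 204.6 "
             "|| 208.2 ||  9.6 || 56.5 || 20.6 ||  89.0 ||  6.6")

# Abbreviated labels (my own shorthand), in the same order as the task list.
LABELS = [
    "mols from SMILES", "canonical SMILES", "mols from SDF",
    "fragment mols", "HasSubstructMatch (100)", "GetSubstructMatches (100)",
    "428 queries", "HasSubstructMatch (428)", "GetSubstructMatches (428)",
    "mol blocks", "BRICS decomposition", "2D coords",
    "RDKit fingerprints", "Morgan fingerprints",
]


def parse_row(row):
    """Split a ' || '-separated row of timings into floats."""
    return [float(cell) for cell in row.split("||")]


def rank_by_slowdown(labels, slow, fast):
    """Return (label, slow/fast) pairs, largest ratio first; skip zero timings."""
    pairs = [(label, s / f) for label, s, f in zip(labels, slow, fast) if f > 0]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


ranked = rank_by_slowdown(LABELS, parse_row(WINDOWS_ROW), parse_row(LINUX_ROW))
for label, ratio in ranked:
    print("%-28s %.2fx" % (label, ratio))
```

On these numbers the Morgan fingerprint task shows the largest gap, while canonical SMILES generation is the one task where the Windows build comes out ahead.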

If you have thoughts about what's going on here, please comment here, reach out on Twitter, Google+, or LinkedIn, or post to the mailing list.
Thanks!

Thursday, August 4, 2016

A question: RDKit performance on Windows
