Sunday, November 23, 2014

Switching to Python 3


Now that the RDKit supports Python 3, I am planning to make Python 3 my standard environment. Here are some notes I collected while doing so.

Rather than mess around on my own with trying to get a working Python 3 install that co-exists with Python 2, I'm going to use anaconda, which provides most every package I'd be interested in as well as a very capable package manager. Riccardo's conda-rdkit helps a lot. For those wanting a pre-built version of the RDKit, we've made binaries available via the RDKit binstar channel.

I want to set up a development environment where I can do my usual RDKit work, so the standard conda stuff doesn't help me. The rest of this post is about making that work.


Here's what I did on my linux box (Ubuntu 14.04).

After installing anaconda and ensuring that it's in the PATH, set up a python 3.4 environment:
conda create -n py34 python=3.4 anaconda
source activate py34 

Then grab a copy of conda-rdkit from github and, from within the conda-rdkit directory, build and install boost:
CONDA_PY=34 conda build boost
CONDA_PY=34 conda install --use-local boost

Note: if I'd been thinking about it, I probably could have skipped this very time-consuming step by installing the boost binaries from binstar.

Now go to the RDKit source directory, create a build dir, configure the build using cmake from inside that build dir, and build normally (note: I'm building with the optional InChI and avalontools support enabled):
BOOST_ROOT=~/anaconda/envs/py34 cmake -D RDK_BUILD_SWIG_WRAPPERS=OFF \
 -D RDK_BUILD_AVALON_SUPPORT=ON -D RDK_BUILD_INCHI_SUPPORT=ON \
 -D AVALONTOOLS_DIR=$RDBASE/External/AvalonTools/distrib/SourceDistribution \
 -D CMAKE_SYSTEM_PREFIX_PATH=~/anaconda/envs/py34 \
 ..
make -j4 install

To test the build, you need to have $RDBASE set and ensure your environment is properly configured:
declare -x LD_LIBRARY_PATH="$RDBASE/lib"
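
A quick way to confirm those settings before running the tests is a small shell check. This is only a sketch: check_rdkit_env is a hypothetical helper name, not something that ships with the RDKit, and it only looks at the two variables mentioned above.

```shell
# Hypothetical helper (not part of the RDKit): verify that RDBASE is set
# and that LD_LIBRARY_PATH includes $RDBASE/lib.
check_rdkit_env() {
  if [ -z "${RDBASE:-}" ]; then
    echo "RDBASE is not set" >&2
    return 1
  fi
  # Wrap the path list in colons so the match works for first, middle,
  # and last entries alike.
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$RDBASE/lib:"*) ;;
    *)
      echo "LD_LIBRARY_PATH does not contain $RDBASE/lib" >&2
      return 1
      ;;
  esac
  echo "environment looks OK"
}
```

Calling check_rdkit_env in the shell where you plan to run ctest prints "environment looks OK" or tells you which of the two settings is missing.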

At this point running ctest should show all tests passing and you should have a working RDKit build.
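
Beyond ctest, it can be reassuring to check that the new build actually imports from the Python 3 interpreter. The sketch below assumes that $RDBASE is also on your PYTHONPATH (the usual requirement for a source build, though not covered above); rdkit_smoke_test is a hypothetical helper name.

```shell
# Hypothetical helper: try to import the RDKit with the given interpreter
# (default: whatever "python" is on the PATH) and print its version.
rdkit_smoke_test() {
  py="${1:-python}"
  if "$py" -c "from rdkit import rdBase; print(rdBase.rdkitVersion)"; then
    return 0
  fi
  echo "could not import rdkit with $py; check PYTHONPATH and LD_LIBRARY_PATH" >&2
  return 1
}
```

With the py34 environment active, running rdkit_smoke_test with no arguments should print the RDKit version string; you can also pass an explicit interpreter path as the first argument.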


[Update 20 December, 2014. Thanks to Pat for reminding me that I hadn't done this.]

The conda-rdkit recipes also support building on the Mac. I have not yet managed to get a free-standing build like the one described above working, but the master version of the conda-rdkit recipe does now allow you to build a working RDKit version of the most recent release for either python 2.7 or 3.4. Pre-built MacOS binaries are also available via the RDKit binstar channel. We'll work on getting the development branch (which does a build based on the current github master) working, and I hope to document in a future post how to get a "standard" free-standing build working with anaconda on the Mac.


Dave said...

Hi Greg, is the above method still the best way to set up Rdkit for a Python 3 environment?

Greg Landrum said...

These days I would use the conda-rdkit package that Riccardo Vianello built to do the install.
There's a good writeup of this at the beginning of the new install documentation:
(will be in the next release)

That approach allows you to get a binary install without too much work and anaconda is a fantastic scientific python distribution.