Friday, November 1, 2019

Introducing new RDKit JavaScript wrappers

A few years ago Guillaume Godin introduced RDKit.js (I think that was the first presentation) at an RDKit UGM. This was an ambitious project that provided JavaScript wrappers around most of the RDKit. It was a great indication of what types of things are possible, but was unfortunately really difficult to maintain and work with. So the project kind of died. Guillaume and I started talking about reviving it sometime last year, but after spending some time looking at updating the existing wrapper to a new version of the RDKit I, again, reached the conclusion that it wouldn't be supportable. 

Since we both really wanted to get the RDKit working in a browser, I came up with an alternate approach that would produce something that could be supported: rather than wrapping the entire toolkit we would expose a minimal, but useful, subset of RDKit functionality to JavaScript. Stuff like canonical SMILES generation, descriptors, fingerprints, substructure searching, molecule drawing, etc. This minimal wrapper is part of the 2019.09.1 RDKit release and you can try it out here: https://rdkit.org/temp/demo/demo.html

Some caveats about that demo:

  1. This is, obviously, just a temporary URL. I will come up with something more permanent and update this post, but in the meantime if you want to use the wrappers yourself, please download a copy from the URL instead of linking to it.
  2. I am most certainly not a web developer, so the page is pretty ugly and the underlying code will probably make an actual web developer's eyes burn. Pull requests are very welcome! Here's the source for that page: https://github.com/rdkit/rdkit/tree/Release_2019_09_1/Code/MinimalLib/demo
  3. Because we're using WebAssembly, this only works in modern browsers. I can use it on my phone though... that's awesome!

Here are some highlights of what you can do with the demo.

Type in SMILES and see a drawing of the corresponding molecule.
Note that the canonical SMILES (under the drawing) and computed values update as I type.
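
As a rough sketch of what happens on each keystroke, the function below parses a SMILES string and returns the canonical SMILES and an SVG drawing. The function name `summarizeSmiles` is hypothetical, and the wrapper method names (`get_mol`, `is_valid`, `get_smiles`, `get_svg`, `delete`) are assumptions based on the demo source linked above, so check them against the build you are using. The `rdkit` argument stands for the loaded wrapper module.

```javascript
// Sketch only: `rdkit` is the loaded wrapper module. The method names used
// here (get_mol, is_valid, get_smiles, get_svg, delete) are assumed from the
// demo source and should be verified against your build.
function summarizeSmiles(rdkit, smiles) {
  const mol = rdkit.get_mol(smiles);
  // Some builds may return null for unparseable input, others an invalid object.
  if (!mol) {
    return { valid: false };
  }
  try {
    if (!mol.is_valid()) {
      return { valid: false };
    }
    return {
      valid: true,
      canonicalSmiles: mol.get_smiles(),
      svg: mol.get_svg(),
    };
  } finally {
    // Objects created by the wrapper live on the WebAssembly heap and are not
    // garbage-collected by JavaScript, so release them explicitly.
    mol.delete();
  }
}
```

In a page, something like this could be called from the text box's `input` event handler, writing `canonicalSmiles` and `svg` into the document whenever `valid` is true.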



Do SMARTS-based substructure searches and see the matching atoms and bonds highlighted on the molecule.
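
A minimal sketch of the substructure search step might look like the following. The function name `highlightSubstructure` is hypothetical. The wrapper calls (`get_qmol`, `get_substruct_match`, `get_svg_with_highlights`) and the assumption that the match comes back as a JSON string with `atoms` and `bonds` arrays are based on the demo source, so treat them as things to verify rather than as a definitive description of the API.

```javascript
// Sketch only: method names and the JSON shape of the match are assumptions
// taken from the demo source; verify them against your build.
function highlightSubstructure(rdkit, smiles, smarts) {
  let mol = null;
  let query = null;
  try {
    mol = rdkit.get_mol(smiles);
    query = rdkit.get_qmol(smarts);
    if (!mol || !query || !mol.is_valid() || !query.is_valid()) {
      return null;
    }
    // Assumed to be a JSON string such as {"atoms":[...],"bonds":[...]},
    // or an empty object when nothing matches.
    const matchJson = mol.get_substruct_match(query);
    const match = JSON.parse(matchJson);
    if (!match.atoms || match.atoms.length === 0) {
      return null;
    }
    return {
      atoms: match.atoms,
      bonds: match.bonds || [],
      // The same JSON string is passed back to get the highlighted drawing.
      svg: mol.get_svg_with_highlights(matchJson),
    };
  } finally {
    // Release both wrapper objects, whichever path we took.
    if (mol) mol.delete();
    if (query) query.delete();
  }
}
```

Returning `null` for both "could not parse" and "no match" keeps the sketch short; a real page would probably want to tell the two apart.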

If you want to see everything available in the JS wrappers, you can open the JS console ("Developer tools" in Chrome), list the exposed functions, and try them out.
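
If console autocompletion is not enough, a small helper like the one below can list the function names on any object, including methods inherited through its prototype chain. This is generic JavaScript rather than part of the wrappers, and `listFunctionNames` is a hypothetical name. When applied to the wrapper module itself it will probably also show emscripten's internal functions, so expect some noise.

```javascript
// Generic helper (not part of the RDKit wrappers): collect the names of all
// function-valued properties on an object and on its prototype chain.
function listFunctionNames(obj) {
  const names = new Set();
  for (let o = obj; o && o !== Object.prototype; o = Object.getPrototypeOf(o)) {
    for (const name of Object.getOwnPropertyNames(o)) {
      if (name === 'constructor') continue;
      // Read the descriptor instead of the property so getters are not triggered.
      const desc = Object.getOwnPropertyDescriptor(o, name);
      if (desc && typeof desc.value === 'function') {
        names.add(name);
      }
    }
  }
  return [...names].sort();
}
```

Pasted into the console, it could be called on the module object or on a molecule object returned by the wrappers to see which methods each one offers.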

As I mentioned above, there's not a huge amount of functionality exposed, but I think that what's there is already pretty useful. Since we are trying to keep the interface reasonably minimal (and supportable), we're not going to put everything in there, but if you have suggestions for improvements or additions, please file an issue or just let me know.

I just love that I can now use the RDKit on my phone. :-)

About the technology

For people who care about such things, here's a very short description of how this works.
We use emscripten and LLVM to compile the RDKit's C++ code to WebAssembly (instead of the native machine code that C++ is normally compiled to), together with a thin JavaScript interface layer. To keep the wrapper code as simple as possible, we produce the wrappers themselves with embind, which has an interface that's quite similar to Boost.Python. In the end we need a surprisingly small amount of code to make this all work.

If you want to build the wrappers yourself, the easiest way to do so is to use the Dockerfile that is part of the RDKit distribution: https://github.com/rdkit/rdkit/blob/Release_2019_09_1/Code/MinimalLib/docker/Dockerfile
Getting emscripten and all the other dependencies installed and configured is not 100% trivial, and this takes care of all that for you.
