Friday, December 13, 2019

Using the R-Group Decomposition Code

The RDKit's code for doing R-group decomposition (RGD) is quite flexible but also largely undocumented. As a result, you may not be aware of some of the cool stuff that's there. This post is an attempt to at least begin to remedy that.
We'll look at a number of difficult/interesting problems that arise all the time when doing RGD on real-world datasets:
  • Handling symmetric cores
  • Handling stereochemistry
  • Handling sidechains that attach to the core at more than one point
  • Handling multiple scaffolds or variable scaffolds
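Before getting to those edge cases, here is a minimal sketch of what a plain decomposition call looks like. It is not taken from the notebook: the benzene core and the three input SMILES are made up for illustration, and it only uses `RGroupDecompose` from `rdkit.Chem.rdRGroupDecomposition` with default options.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd

# Illustrative inputs (assumptions, not from the original notebook):
# a benzene core and three small molecules, one of which lacks the core.
core = Chem.MolFromSmiles('c1ccccc1')
smiles = ['Fc1ccccc1', 'Clc1ccccc1', 'CCO']
mols = [Chem.MolFromSmiles(smi) for smi in smiles]

# RGroupDecompose returns one dict per matched molecule plus the indices
# of the input molecules that did not match any core.
# asSmiles=True returns the core and R groups as SMILES strings
# instead of molecule objects.
rows, unmatched = rgd.RGroupDecompose([core], mols, asSmiles=True)

for row in rows:
    print(row)
print('unmatched indices:', list(unmatched))
```

Each row is a dictionary holding the labelled core under `'Core'` and one entry per R group. Ethanol does not contain the core, so its index shows up in the unmatched list rather than in the rows.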
Unfortunately, this is one of those posts that completely chokes Blogger. Rather than doing a bunch of editing to find the magic "just right" amount of content that I can include and still post, I will just include the nbviewer version as an iframe and point you to the original notebook for this post on GitHub: https://github.com/greglandrum/rdkit_blog/blob/master/notebooks/RGroupEdgeCases.ipynb (or here's the nbviewer link).

Here's the nbviewer iframe: