Saturday, February 22, 2014

RDKit Knime Workflows IV: Rooted fingerprints

Another workflow from the RDKit workshop at the Knime UGM in Zürich

This workflow shows how to use the RDKit's rooted fingerprints. The idea of a rooted fingerprint is to allow calculation of a molecular fingerprint that only includes bits that are rooted at particular atoms. In effect, we're only including information about specific pieces of the molecule in the fingerprint.

The specific application in this workflow is to use rooted torsion fingerprints to pick sets of molecules with diverse F environments and diverse CF3 environments. We published the idea for this a few years ago here: Vulpetti, A., Hommel, U., Landrum, G., Lewis, R. & Dalvit, C. "Design and NMR-Based Screening of LEF, a Library of Chemical Fragments with Different Local Environment of Fluorine." J. Am. Chem. Soc. 131, 12949–12959 (2009) http://pubs.acs.org/doi/abs/10.1021/ja905207t.

Here's the full workflow:

The metanode in the middle identifies and labels molecules that have either a single -F or a single -CF3. For the molecules with a single -F (top branch from the metanode) we identify the F atoms using a Substructure Filter node:

And use those labels for the rooted fingerprints:
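Outside of Knime, the same idea can be sketched with the RDKit's Python API. This is a minimal sketch, not the workflow itself. The SMARTS query, the example SMILES, and the helper name `rooted_torsion_fp` are assumptions made for illustration. It uses the `fromAtoms` argument of `rdMolDescriptors.GetTopologicalTorsionFingerprint`; newer RDKit releases also offer fingerprint generator objects and may print a deprecation warning for this call.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors

# Assumed query: any fluorine atom. The query used in the workflow is not shown in the post.
F_QUERY = Chem.MolFromSmarts("[F]")


def rooted_torsion_fp(mol, query=F_QUERY):
    """Sparse topological-torsion fingerprint rooted at the atoms matched by `query`.

    Hypothetical helper: returns None when the query does not match.
    """
    # Use the first atom of each substructure match as a root atom.
    roots = [match[0] for match in mol.GetSubstructMatches(query)]
    if not roots:
        return None
    # fromAtoms limits the fingerprint to torsions rooted at the listed atoms.
    return rdMolDescriptors.GetTopologicalTorsionFingerprint(mol, fromAtoms=roots)


# Assumed example molecule with a single aromatic F.
mol = Chem.MolFromSmiles("Fc1ccc(CCN)cc1")
rooted = rooted_torsion_fp(mol)
full = rdMolDescriptors.GetTopologicalTorsionFingerprint(mol)

n_rooted = len(rooted.GetNonzeroElements())
n_full = len(full.GetNonzeroElements())
print(n_rooted, "rooted torsion types out of", n_full, "in the full fingerprint")
```

The rooted fingerprint only describes the neighbourhood of the F atom, so two molecules that differ far away from the fluorine can still get identical rooted fingerprints.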

These are provided as input to the standard RDKit diversity picker node. Looking at the molecules picked shows that the F environments are, indeed, diverse (though the overall molecule diversity isn't that high):
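To make the picking step concrete, here is a small pure-Python sketch of MaxMin-style diversity picking, the kind of algorithm the RDKit uses for diversity selection. Everything in it is an assumption for illustration: fingerprints are modelled as plain sets of feature ids, the distance is 1 minus Tanimoto similarity, and the names `tanimoto_distance` and `maxmin_pick` are hypothetical. The node's actual settings are not described in the post.

```python
def tanimoto_distance(a, b):
    """1 - Tanimoto similarity between two sets of feature ids."""
    union = len(a | b)
    similarity = len(a & b) / union if union else 1.0
    return 1.0 - similarity


def maxmin_pick(fps, n_pick, first=0):
    """Greedy MaxMin: repeatedly pick the item farthest from everything picked so far."""
    picked = [first]
    # min_dist[i] is the distance from item i to its nearest already-picked item.
    min_dist = [tanimoto_distance(fp, fps[first]) for fp in fps]
    while len(picked) < min(n_pick, len(fps)):
        candidates = (i for i in range(len(fps)) if i not in picked)
        nxt = max(candidates, key=min_dist.__getitem__)
        picked.append(nxt)
        # Update nearest-picked distances to account for the new pick.
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fps[nxt]))
    return picked


# Toy "rooted fingerprints": items 0, 1, 3 share one environment; 2 and 4 share another.
toy_fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}, {7, 8, 10}]
picks = maxmin_pick(toy_fps, 3)
print(picks)
```

Because the distances are computed on rooted fingerprints only, the picker spreads its choices over different F environments rather than over whole-molecule diversity, which matches the observation above.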


Until I figure out how to host these in a sensible way on a public Knime server, I'll just provide things via Dropbox links.
Here's the data directory that is used in this and all other workflows: https://db.tt/qSdJn0St
And here's the link to this workflow: https://db.tt/Y7MCj2BK
