Derek Lowe blogged today about a really wild natural-product derivative that is currently a clincal candidate. I decided that I wanted to see what it looked like in 3D and, along the way, see how the coordinate generation code handles something this complex.
Here's the original paper.
Here's the original paper.
In [1]:
from rdkit import Chem
from rdkit.Chem import rdDepictor
from rdkit.Chem import rdDistGeom
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
IPythonConsole.ipython_3d = True
import time
import rdkit
print(rdkit.__version__)
print(time.asctime())
Start by reading the molecule in. I redrew this from the paper, so it's easily possible I got something wrong... please let me know if you see anything off.
In [2]:
# we will read the molecule in and keep the Hs since they were a pain to get right:
m = Chem.MolFromMolFile('../data/E7130.mol',removeHs=False)
m
Out[2]:
Make the drawing size a bit bigger:
In [3]:
IPythonConsole.molSize = (600,450)
m
Out[3]:
Start by seeing whether we perceive absolute stereo correctly or not:
In [5]:
Chem.FindMolChiralCenters(m)
Out[5]:
Five of those - 26, 38, 41, 42, and 49 - disagree with Marvin. As far as I can do the CIP rules on something this complicated I believe Marvin is correct. It's not that surprising that we'd get something wrong here (the RDKit's CIP assignments are definitely only approximate), but it would be nice if it were correct.
Let's see how the RDKit's coordinate generation does with this beast:
In [6]:
nm = Chem.Mol(m)
rdDepictor.Compute2DCoords(nm)
nm
Out[6]:
That's actually better than I expected! The fused ring system isn't pretty, but you can mostly tell what the structure is.
How does CoordGen do?
How does CoordGen do?
In [7]:
rdDepictor.SetPreferCoordGen(True)
rdDepictor.Compute2DCoords(nm)
nm
Out[7]:
There's room for improvement here. The big fused ring system at the left is hard, but it would be nice if there were less collisions, particularly of Hs.
Finally, let's take a look at it in 3D... which is what I wanted to do from the beginning:
In [10]:
IPythonConsole.molSize_3d = (600,600)
mh = Chem.AddHs(m)
rdDistGeom.EmbedMolecule(mh,randomSeed=0xf00d)
mh
Out[10]:
It is not exactly trivial to do the check, but I believe the stereochemistry matches the drawing, which is what we'd hope.
We can look and see what the RDKit's stereochemistry assignment does:
In [12]:
Chem.AssignStereochemistryFrom3D(mh)
for i,(c1,c2) in enumerate(zip(Chem.FindMolChiralCenters(mh),Chem.FindMolChiralCenters(m))):
if c1!=c2:
print(i,c1,c2)
Interestingly, the centers that differ here are the ones that differed from the assignment in Marvin Sketch. After assigning the stereo from 3D we match what Marvin provides. That's something to look into.
Finally, make sure the canonical SMILES for those two molecules are the same:
In [17]:
Chem.MolToSmiles(Chem.RemoveHs(m)) == Chem.MolToSmiles(Chem.RemoveHs(mh))
Out[17]: