
SMILES from graph

Is there a method or package that converts a graph (or adjacency matrix) into a SMILES string?

For instance, I know the atomic numbers are [6 6 7 6 6 6 6 8] (that is, [C C N C C C C O]), and the adjacency matrix is:

[[ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],

 [ 1.,  0.,  2.,  0.,  0.,  0.,  0.,  1.],

 [ 0.,  2.,  0.,  1.,  0.,  0.,  0.,  0.],

 [ 0.,  0.,  1.,  0.,  1.,  0.,  0.,  0.],

 [ 0.,  0.,  0.,  1.,  0.,  1.,  0.,  0.],

 [ 0.,  0.,  0.,  0.,  1.,  0.,  1.,  1.],

 [ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],

 [ 0.,  1.,  0.,  0.,  0.,  1.,  0.,  0.]]

I need a function that outputs 'CC1=NCCC(C)O1'.

A function that returns the corresponding "mol" object would also work. RDKit has a 'MolFromSmiles' function; I wonder if there is something like 'MolFromGraphs'.

Joe asked Feb 03 '23 23:02


1 Answer

Here is a simple solution. To my knowledge, there is no built-in function for this in RDKit.

from rdkit import Chem

def MolFromGraphs(node_list, adjacency_matrix):

    # create empty editable mol object
    mol = Chem.RWMol()

    # add atoms to mol and keep track of index
    node_to_idx = {}
    for i in range(len(node_list)):
        # cast to a plain int in case node_list holds NumPy integers
        a = Chem.Atom(int(node_list[i]))
        molIdx = mol.AddAtom(a)
        node_to_idx[i] = molIdx

    # add bonds between adjacent atoms
    for ix, row in enumerate(adjacency_matrix):
        for iy, bond in enumerate(row):

            # only traverse half the matrix
            if iy <= ix:
                continue

            # add relevant bond type (there are many more of these)
            if bond == 0:
                continue
            elif bond == 1:
                bond_type = Chem.rdchem.BondType.SINGLE
                mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
            elif bond == 2:
                bond_type = Chem.rdchem.BondType.DOUBLE
                mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)

    # Convert RWMol to Mol object
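    mol = mol.GetMol()

    return mol

# atoms and adjacency matrix from the question
nodes = [6, 6, 7, 6, 6, 6, 6, 8]
a = [[0, 1, 0, 0, 0, 0, 0, 0],
     [1, 0, 2, 0, 0, 0, 0, 1],
     [0, 2, 0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 1, 0, 0, 0],
     [0, 0, 0, 1, 0, 1, 0, 0],
     [0, 0, 0, 0, 1, 0, 1, 1],
     [0, 0, 0, 0, 0, 1, 0, 0],
     [0, 1, 0, 0, 0, 1, 0, 0]]

Chem.MolToSmiles(MolFromGraphs(nodes, a))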

Out:
'CC1=NCCC(C)O1'
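
Because the loop skips every entry with iy <= ix, the function reads only the upper triangle of the matrix and silently assumes it is symmetric with an empty diagonal. A minimal sketch of a pre-check that could be run first is shown here; `check_adjacency` is a hypothetical helper written for illustration, not part of RDKit:

```python
def check_adjacency(node_list, adjacency_matrix):
    """Raise ValueError if the matrix cannot describe an undirected molecular graph."""
    n = len(node_list)

    # one row and one column per atom
    if len(adjacency_matrix) != n or any(len(row) != n for row in adjacency_matrix):
        raise ValueError("adjacency matrix must be n x n for n atoms")

    for i in range(n):
        # an atom cannot be bonded to itself
        if adjacency_matrix[i][i] != 0:
            raise ValueError(f"atom {i} is bonded to itself")
        # bond orders must agree in both directions
        for j in range(i + 1, n):
            if adjacency_matrix[i][j] != adjacency_matrix[j][i]:
                raise ValueError(f"entries ({i}, {j}) and ({j}, {i}) disagree")
```

Calling `check_adjacency(nodes, a)` before `MolFromGraphs(nodes, a)` turns a silently ignored lower-triangle entry into an explicit error.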

This solution is a simplified version of https://github.com/dakoner/keras-molecules/blob/dbbb790e74e406faa70b13e8be8104d9e938eba2/convert_rdkit_to_networkx.py

There are many other atom properties (such as chirality or protonation state) and bond types (triple, dative, ...) that may need to be set. It is better to keep track of these explicitly in your graph if possible (as in the link above), but this function can also be extended to handle them if required.
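
As a rough illustration of such an extension, the sketch below adds triple bonds and optional per-atom formal charges. The names `BOND_TYPES`, `mol_from_graph_extended`, and the `formal_charges` parameter are hypothetical and not part of RDKit. The call to `Chem.SanitizeMol` is also an addition to the function above; it makes RDKit check valences and reject chemically invalid graphs. Aromatic and dative bonds, as well as chirality, are left out.

```python
from rdkit import Chem

# Hypothetical lookup table: adjacency-matrix entry -> RDKit bond type.
BOND_TYPES = {
    1: Chem.BondType.SINGLE,
    2: Chem.BondType.DOUBLE,
    3: Chem.BondType.TRIPLE,
}

def mol_from_graph_extended(atomic_numbers, adjacency_matrix, formal_charges=None):
    mol = Chem.RWMol()

    # add atoms; atom i in the graph becomes atom i in the mol
    for i, z in enumerate(atomic_numbers):
        atom = Chem.Atom(int(z))
        if formal_charges is not None:
            atom.SetFormalCharge(int(formal_charges[i]))
        mol.AddAtom(atom)

    # add bonds from the upper triangle of the matrix
    n = len(atomic_numbers)
    for i in range(n):
        for j in range(i + 1, n):
            order = adjacency_matrix[i][j]
            if order == 0:
                continue
            if order not in BOND_TYPES:
                raise ValueError(f"unsupported bond order {order} between atoms {i} and {j}")
            mol.AddBond(i, j, BOND_TYPES[order])

    # assumption: we want RDKit to validate the result (raises on bad valences)
    Chem.SanitizeMol(mol)
    return mol.GetMol()

# acetonitrile: C-C#N
smiles = Chem.MolToSmiles(mol_from_graph_extended([6, 6, 7], [[0, 1, 0], [1, 0, 3], [0, 3, 0]]))
```

Keeping the bond orders in a dictionary means a new bond type only needs a new table entry, and unknown values fail loudly instead of being skipped.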

JoshuaBox answered Feb 06 '23 14:02