
Converting SMILES to chemical name or IUPAC name using rdkit or other python module

Is there a way to convert SMILES to either a chemical name or an IUPAC name using RDKit or other Python modules?

I couldn't find anything very helpful in other posts.

Thank you very much!

Alex asked Oct 13 '20 05:10


1 Answer

As far as I am aware, this is not possible using RDKit, and I do not know of any Python modules with this ability. If you are OK with using a web service, you could use the NCI resolver (CACTUS).

Here is a naive implementation of a function that retrieves an IUPAC name from a SMILES string:

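from urllib.parse import quote

import requests


# URL template for the NCI resolver: structure identifier, then representation
CACTUS = "https://cactus.nci.nih.gov/chemical/structure/{0}/{1}"


def smiles_to_iupac(smiles):
    rep = "iupac_name"
    # Percent-encode the SMILES so characters such as '#' (triple bond)
    # are not misread as URL syntax
    url = CACTUS.format(quote(smiles), rep)
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise on an HTTP error status
    return response.text


print(smiles_to_iupac('c1ccccc1'))
print(smiles_to_iupac('CC(=O)OC1=CC=CC=C1C(=O)O'))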

[Out]:
BENZENE
2-acetyloxybenzoic acid

You could easily extend it to return other representations, although the function is not fast, since every call makes a network request.
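As a rough illustration of that extension, here is a minimal sketch that takes the output representation as a parameter. The helper names `build_cactus_url` and `smiles_to` are made up for this example, it uses the standard library's `urllib` rather than `requests`, and it assumes the resolver accepts representation names such as `stdinchikey` and `formula` in the same URL position as `iupac_name`.

```python
from urllib.parse import quote
from urllib.request import urlopen

CACTUS = "https://cactus.nci.nih.gov/chemical/structure/{0}/{1}"


def build_cactus_url(smiles, representation):
    # Hypothetical helper: percent-encode the SMILES so that characters
    # such as '#' are not interpreted as URL syntax.
    return CACTUS.format(quote(smiles), representation)


def smiles_to(smiles, representation="iupac_name", timeout=10):
    # Hypothetical helper: one HTTP round trip per call.
    # urlopen raises urllib.error.HTTPError on an HTTP error status.
    url = build_cactus_url(smiles, representation)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling `smiles_to('c1ccccc1', 'formula')` would then ask the resolver for a different representation of the same structure. Keeping the URL construction in its own function makes it possible to check the encoding without touching the network.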

Another solution is to use PubChem, whose API you can call through the Python package pubchempy. Bear in mind that a query may return multiple compounds.

import pubchempy


# Example SMILES string to look up
smiles = 'O=C(NCc1ccc(C(F)(F)F)cc1)[C@@H]1Cc2[nH]cnc2CN1Cc1ccc([N+](=O)[O-])cc1'
compounds = pubchempy.get_compounds(smiles, namespace='smiles')
match = compounds[0]
print(match.iupac_name)

[Out]:
(6S)-5-[(4-nitrophenyl)methyl]-N-[[4-(trifluoromethyl)phenyl]methyl]-3,4,6,7-tetrahydroimidazo[4,5-c]pyridine-6-carboxamide
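
Because the lookup may return several compounds, or possibly none, indexing `compounds[0]` directly can fail or pick an entry without a name. A minimal sketch of a more defensive selection follows. `first_iupac_name` and `smiles_to_iupac_pubchem` are hypothetical helper names, and the sketch assumes only that each result exposes an `iupac_name` attribute, as in the example above.

```python
def first_iupac_name(compounds):
    # Return the first non-empty IUPAC name among the results, or None
    # if the list is empty or no entry carries a name.
    for compound in compounds:
        name = getattr(compound, "iupac_name", None)
        if name:
            return name
    return None


def smiles_to_iupac_pubchem(smiles):
    # Imported here so first_iupac_name can be used without pubchempy installed.
    import pubchempy

    compounds = pubchempy.get_compounds(smiles, namespace="smiles")
    return first_iupac_name(compounds)
```

Returning `None` instead of raising leaves it to the caller to decide how to treat a SMILES string that PubChem cannot name.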
Oliver Scott answered Sep 30 '22 18:09