
Converting molecule name to SMILES?

I was just wondering, is there any way to convert IUPAC or common molecular names to SMILES? I want to do this without having to manually convert every single one utilizing online systems. Any input would be much appreciated!

For background, I am currently working with Python and RDKit, so I wasn't sure whether RDKit could do this and I was simply unaware of it. My current data is in CSV format.

Thank you!

asked Feb 28 '19 16:02

A. Y



2 Answers

RDKit can't convert names to SMILES. The Chemical Identifier Resolver (CIR) can convert names and other identifiers (such as CAS numbers), and it has an API, so you can do the conversion with a script.

from urllib.request import urlopen
from urllib.parse import quote

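def CIRconvert(ids):
    # Ask the Chemical Identifier Resolver for the SMILES of one identifier.
    # quote() URL-encodes spaces and other special characters in the name.
    try:
        url = 'http://cactus.nci.nih.gov/chemical/structure/' + quote(ids) + '/smiles'
        ans = urlopen(url).read().decode('utf8')
        return ans
    except Exception:
        # Unresolved identifiers (HTTP errors) and network failures end up here.
        return 'Did not work'

identifiers = ['3-Methylheptane', 'Aspirin', 'Diethylsulfate', 'Diethyl sulfate', '50-78-2', 'Adamant']

for ids in identifiers:
    print(ids, CIRconvert(ids))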

Output

3-Methylheptane CCCCC(C)CC
Aspirin CC(=O)Oc1ccccc1C(O)=O
Diethylsulfate CCO[S](=O)(=O)OCC
Diethyl sulfate CCO[S](=O)(=O)OCC
50-78-2 CC(=O)Oc1ccccc1C(O)=O
Adamant Did not work

answered Nov 13 '22 00:11

rapelpy
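
Since the data in the question is a CSV file, a resolver like the one in this answer could be applied row by row. The following is only a minimal sketch under assumptions that are not in the original posts: the column name `name`, the new column `smiles`, and the helper `add_smiles_column` are all hypothetical. The resolver is passed in as a function, so the demo uses a small stand-in dictionary rather than calling the web service.

```python
import csv
import io


def add_smiles_column(csv_text, resolver, name_column="name"):
    """Return CSV text with an extra 'smiles' column filled by resolver(name).

    `name_column` and the 'smiles' header are illustrative assumptions.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ["smiles"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    cache = {}  # resolve each distinct name only once
    for row in reader:
        name = row[name_column]
        if name not in cache:
            cache[name] = resolver(name)
        row["smiles"] = cache[name]
        writer.writerow(row)
    return out.getvalue()


# Stand-in resolver for the demo; swap in a real lookup function in practice.
known = {"Aspirin": "CC(=O)Oc1ccccc1C(O)=O"}
demo = add_smiles_column(
    "name\nAspirin\nAdamant\n",
    lambda n: known.get(n, "Did not work"),
)
print(demo)
```

Caching repeated names keeps the number of web requests down when the same compound appears on many rows.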


OPSIN (https://opsin.ch.cam.ac.uk/) is another option for name-to-structure conversion.

You can use it by installing the CLI, or via https://github.com/gorgitko/molminer

(OPSIN is also used by the RDKit KNIME nodes.)

answered Nov 13 '22 00:11

JoshuaBox
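
For the CLI route, a rough sketch of driving OPSIN from Python might look like the following. It assumes Java is installed and that the OPSIN command-line jar has been downloaded; the jar path is a placeholder, the helper names `opsin_command` and `names_to_smiles` are hypothetical, and the code assumes OPSIN writes one output line per input name. The `-osmi` option requests SMILES output.

```python
import subprocess
import tempfile
from pathlib import Path


def opsin_command(jar_path, input_path, output_path):
    """Build the OPSIN CLI call; jar_path is a placeholder for the downloaded jar."""
    return ["java", "-jar", str(jar_path), "-osmi", str(input_path), str(output_path)]


def names_to_smiles(names, jar_path):
    """Write names one per line, run OPSIN, and pair each name with its output line.

    Assumes the output file has one line per input name.
    """
    with tempfile.TemporaryDirectory() as tmp:
        input_path = Path(tmp) / "names.txt"
        output_path = Path(tmp) / "smiles.txt"
        input_path.write_text("\n".join(names) + "\n", encoding="utf8")
        subprocess.run(opsin_command(jar_path, input_path, output_path), check=True)
        lines = output_path.read_text(encoding="utf8").splitlines()
    return dict(zip(names, lines))
```

Because OPSIN runs locally, this avoids one web request per name, which may matter for large CSV files.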