Cut string within a specific pattern in python

Question

I have a string of some length consisting of only four characters: 'A', 'T', 'G' and 'C'. The pattern 'GAATTC' occurs multiple times in the string, and I have to cut the string wherever this pattern occurs. For example, for the string 'ATCGAATTCATA', I should get the output:

string one - ATCGA
string two - ATTCATA

I am a newbie at Python, but I have come up with the following (incomplete) code:

seq = seq.upper()
str1 = "GAATTC"
seqlen = len(seq)
seq = list(seq)

for i in range(0,seqlen-1):
    site = seq.find(str1)
    print(site[0:(i+2)])

Any help would be really appreciated.

Serge · Accepted Answer

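First, let's develop your idea of using find, so you can see where your code went wrong. Two things break it: a list has no find method, so seq must stay a string, and find returns an integer index, which cannot be sliced.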

seq = 'ATCGAATTCATAATCGAATTCATAATCGAATTCATA'
seq = seq.upper()
pattern = "GAATTC"
split_at = 2
seqlen = len(seq)
i = 0

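while i < seqlen:
    # Look for the next site, starting from the end of the previous fragment.
    site = seq.find(pattern, i)
    if site != -1:
        print(seq[i:site + split_at])
        i = site + split_at
    else:
        # No more sites: print whatever is left and stop.
        print(seq[i:])
        break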
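
The same loop can be wrapped in a function that returns the fragments instead of printing them. This is only a sketch: the name cut_at_sites and its parameters are made up for illustration, and it assumes that 0 <= offset <= len(pattern) and that sites do not overlap (true for 'GAATTC').

```python
def cut_at_sites(seq, pattern="GAATTC", offset=2):
    """Cut seq `offset` characters into every occurrence of pattern.

    Hypothetical helper; assumes 0 <= offset <= len(pattern) and
    non-overlapping sites.
    """
    seq = seq.upper()
    fragments = []
    start = 0        # where the current fragment begins
    search_from = 0  # where the next search for the pattern begins
    while True:
        site = seq.find(pattern, search_from)
        if site == -1:
            # No more sites: the rest of the string is the last fragment.
            fragments.append(seq[start:])
            return fragments
        cut = site + offset
        fragments.append(seq[start:cut])
        start = cut
        # Resume searching after the whole match so the loop always advances.
        search_from = site + len(pattern)


print(cut_at_sites("ATCGAATTCATA"))  # ['ATCGA', 'ATTCATA']
```

Joining the fragments back together should reproduce the upper-cased input, which makes a handy sanity check.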

Yet Python strings have a replace method that substitutes every occurrence of a substring directly. The snippet below uses replace to insert a separator at each cut site and then splits on that separator:

seq = 'ATCGAATTCATAATCGAATTCATAATCGAATTCATA'
seq = seq.upper()
pattern = "GA","ATTC"
pattern1 = ''.join(pattern) # 'GAATTC'
pattern2 = ' '.join(pattern) # 'GA ATTC'
splited_seq = seq.replace(pattern1, pattern2) # 'ATCGA ATTCATAATCGA ATTCATAATCGA ATTCATA'
print (splited_seq.split())

I believe this is more intuitive, and it should be faster than a regular expression (whose performance depends on the library and on how it is used).
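
For comparison, the regular expression route mentioned above might look like the sketch below. It is not part of the original answer: the names CUT_SITE and cut_with_regex are made up, and it relies on re.split accepting a pattern that matches an empty string, which needs Python 3.7 or later. No timing claim is made here; measure on your own data if speed matters.

```python
import re

# Zero-width split point: preceded by 'GA' and followed by 'ATTC',
# i.e. the position inside 'GAATTC' where the cut happens.
CUT_SITE = re.compile(r"(?<=GA)(?=ATTC)")


def cut_with_regex(seq):
    """Return the fragments of seq, cut inside every 'GAATTC' site."""
    return CUT_SITE.split(seq.upper())


print(cut_with_regex("ATCGAATTCATA"))  # ['ATCGA', 'ATTCATA']
```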

t.m.adam · Answer

Here is a simple solution:

seq = 'ATCGAATTCATA'
seq_split = seq.upper().split('GAATTC')
last = len(seq_split) - 1
# Every piece after a cut site starts with 'ATTC';
# every piece before a cut site ends with 'GA'.
result = [
    ('ATTC' if i > 0 else '') + piece + ('GA' if i < last else '')
    for i, piece in enumerate(seq_split)
]

Result:

print(result)
['ATCGA', 'ATTCATA']

Tags: python, string, bioinformatics

Srk
