
Find shortest substring

I have written code to find substrings of a string. It prints all matching substrings, but I only want substrings of length 2 to 6, and of those I want to print the shortest one. Please help.

Program:

import re
p = re.compile(r'S(.+?)N')
s = 'ASDFANSAAAAAFGNDASMPRKYN'
s1 = p.findall(s)
print(s1)

Output:

['DFA', 'AAAAAFG', 'MPRKY']  

Desired output:

'DFA'  length=3

2 Answers

If you already have the list, you can use the min function with len as the key argument.

>>> s1 = ['DFA', 'AAAAAFG', 'MPRKY']
>>> min(s1, key=len)
'DFA'
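
Applied to the data from the question, this could look like the sketch below. It assumes Python 3 print syntax, and the name `shortest` is made up for illustration. Note that it does not yet enforce the 2 to 6 length limit; it only picks the shortest of whatever the regex found.

```python
import re

# Pattern and input string copied from the question.
p = re.compile(r'S(.+?)N')
s = 'ASDFANSAAAAAFGNDASMPRKYN'

s1 = p.findall(s)            # all captured substrings
shortest = min(s1, key=len)  # first element with the smallest length

# prints: 'DFA' length=3
print(repr(shortest), 'length=%d' % len(shortest))
```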

EDIT:
If several elements share the minimum length, you can extend this to produce a list of all the shortest elements:

>>> s2 = ['foo', 'bar', 'baz', 'spam', 'eggs', 'knight']
>>> s2_min_len = len(min(s2, key=len))
>>> [e for e in s2 if len(e) == s2_min_len]
['foo', 'bar', 'baz']

The above should work when there is only 1 'shortest' element too.

EDIT 2: For completeness: according to my simple tests, it is faster to compute the length of the shortest element once and reuse it in the list comprehension than to recompute it for every element. Updated above.

Nick Presta answered Mar 08 '26 21:03



The regex 'S(.{2,6}?)N' will only give you matches whose captured part is 2 to 6 characters long.

To return the shortest matching substring, use sorted(s1, key=len)[0].

Full example:

import re
p = re.compile(r'S(.{2,6}?)N')
s = 'ASDFANSAAAAAFGNDASMPRKYNSAAN'
s1 = p.findall(s)
if s1:
    print(sorted(s1, key=len)[0])
    print(min(s1, key=len))  # as suggested by Nick Presta

This works by sorting the list returned by findall by length, then returning the first item in the sorted list.
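
The `if s1:` guard matters because `sorted(s1, key=len)[0]` raises IndexError and `min(s1, key=len)` raises ValueError when findall returns an empty list. A minimal sketch of wrapping this in a helper, assuming Python 3.4 or later (where min accepts a `default` argument); the function name `shortest_match` is hypothetical and not part of the re module:

```python
import re

def shortest_match(pattern, text):
    """Return the shortest captured group, or None if nothing matches."""
    matches = re.findall(pattern, text)
    # default=None avoids the ValueError that min() raises on an empty list.
    return min(matches, key=len, default=None)

# Same pattern and string as the full example above.
print(shortest_match(r'S(.{2,6}?)N', 'ASDFANSAAAAAFGNDASMPRKYNSAAN'))  # prints: AA
print(shortest_match(r'S(.{2,6}?)N', 'no match here'))                 # prints: None
```

Using min here also avoids sorting the whole list when only the shortest item is needed.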

Edit: Nick Presta's answer is more elegant; I was not aware that min could also take a key argument.

codeape answered Mar 08 '26 23:03



