I want to efficiently look up and compare the string elements in a list, and then remove those that are prefixes of other string elements in the list (that is, parts of other elements that share the same starting point):
list1 = [ 'a boy ran' , 'green apples are worse' , 'a boy ran towards the mill' , ' this is another sentence ' , 'a boy ran towards the mill and fell',.....]
I intend to get a list which looks like this:
list2 = [ 'green apples are worse' , ' this is another sentence ' , 'a boy ran towards the mill and fell',.....]
In other words, I want to keep the longest string element from those elements which start with the same first characters.
As suggested by John Coleman in the comments, you can first sort the sentences and then compare consecutive sentences. If one sentence is a prefix of another, it will appear right before a sentence that starts with it in the sorted list, so we only have to compare consecutive sentences. To preserve the original order, you can use a set for quickly looking up the filtered elements.
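list1 = ['a boy ran', 'green apples are worse',
         'a boy ran towards the mill', ' this is another sentence ',
         'a boy ran towards the mill and fell']
srtd = sorted(list1)
filtered = set(list1)
for a, b in zip(srtd, srtd[1:]):
    # a is a proper prefix of its successor in sorted order, so drop it;
    # equal neighbours (duplicates) are skipped rather than removed
    if a != b and b.startswith(a):
        filtered.remove(a)
# rebuild in the original order, keeping only the surviving sentences
list2 = [x for x in list1 if x in filtered]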
Afterwards, list2 is the following:
['green apples are worse',
' this is another sentence ',
'a boy ran towards the mill and fell']
At O(n log n) comparisons, this is considerably faster than comparing all pairs of sentences in O(n²). If the list is not too long, though, the much simpler solution by Vicrobot will work just as well.
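The sort-and-compare-neighbours idea can also be wrapped in a small function. This is only a sketch: the name drop_prefixes is made up for illustration, and it assumes that exact duplicates should be kept and only proper prefixes dropped.

```python
def drop_prefixes(items):
    """Return the items that are not a proper prefix of another item.

    The original order is preserved. `drop_prefixes` is a hypothetical
    helper name, not part of any library.
    """
    # Sorting the distinct strings puts every prefix directly before
    # some string that starts with it.
    unique_sorted = sorted(set(items))
    prefixes = {
        a
        for a, b in zip(unique_sorted, unique_sorted[1:])
        if b.startswith(a)
    }
    # Filter the original list so the input order is kept.
    return [s for s in items if s not in prefixes]


list1 = ['a boy ran', 'green apples are worse',
         'a boy ran towards the mill', ' this is another sentence ',
         'a boy ran towards the mill and fell']
list2 = drop_prefixes(list1)
```

Collecting the prefixes in a set, instead of removing from a set of survivors, means the input list is never mutated and duplicates need no special handling.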
This is one way to achieve that:
list1 = ['a boy ran', 'green apples are worse', 'a boy ran towards the mill', ' this is another sentence ', 'a boy ran towards the mill and fell']
list2 = []
for i in list1:
    keep = True
    for j in list1:
        # drop i if some different string j starts with it
        if j != i and j.startswith(i):
            keep = False
            break
    if keep:
        list2.append(i)
>>> list2
['green apples are worse', ' this is another sentence ', 'a boy ran towards the mill and fell']
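The same pairwise check can be written more compactly with any(). Again, this is just a sketch under the assumption that only proper prefixes should be dropped; the function name drop_prefixes_quadratic is hypothetical.

```python
def drop_prefixes_quadratic(items):
    """Keep every string that no *different* string in items starts with.

    Compares all pairs, so it is O(n²) in the number of strings, which is
    fine for short lists. The function name is made up for illustration.
    """
    return [
        s
        for s in items
        if not any(other != s and other.startswith(s) for other in items)
    ]


sentences = ['a boy ran', 'green apples are worse',
             'a boy ran towards the mill', ' this is another sentence ',
             'a boy ran towards the mill and fell']
result = drop_prefixes_quadratic(sentences)
```

Because any() stops at the first match, a sentence that is a prefix of an early element is rejected without scanning the rest of the list.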