I want to remove every element from a list that contains 'X' or 'N'. I need to apply this to a large genome. Here is an example:
input:
codon=['AAT','XAC','ANT','TTA']
expected output:
codon=['AAT','TTA']  
Remove an item by value: remove(). You can remove the first item in a list whose value equals the specified value with remove(). If the list contains more than one matching item, only the first is deleted.
To remove elements from a Java ArrayList based on a condition (a predicate or filter), use the removeIf() method. Call removeIf() on the ArrayList with the predicate passed as the argument; every element that satisfies the predicate is removed. Python lists have no removeIf() method.
The pop() method removes the element at a given index and returns it. You can also use the del keyword in Python to remove an element or a slice from a list.
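As a small illustration of the Python list operations described above, here is a sketch using the codons from the question plus one duplicate (the sample data and variable names are assumptions). Note that all three work on exact values or positions, so on their own they do not handle the "contains 'X' or 'N'" condition.

```python
# Assumed sample data: the codons from the question, with 'XAC' repeated.
codons = ['AAT', 'XAC', 'ANT', 'TTA', 'XAC']

codons.remove('XAC')        # deletes only the first 'XAC'
after_remove = list(codons)

popped = codons.pop(1)      # removes and returns the item at index 1
after_pop = list(codons)

del codons[-1]              # deletes the last item (the remaining 'XAC')

print(after_remove)  # ['AAT', 'ANT', 'TTA', 'XAC']
print(popped)        # ANT
print(after_pop)     # ['AAT', 'TTA', 'XAC']
print(codons)        # ['AAT', 'TTA']
```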
For basic purposes:
>>> [x for x in ['AAT','XAC','ANT','TTA'] if "X" not in x and "N" not in x]
['AAT', 'TTA']
But if you have a huge amount of data, I suggest using a dict or set.
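One possible reading of the set suggestion, as a rough sketch: keep the unwanted characters in a set, decide once per distinct codon whether it is clean, and then filter the full list by set membership. The names bad_chars, valid and clean are made up for this example.

```python
codon = ['AAT', 'XAC', 'ANT', 'TTA', 'AAT', 'XAC']  # assumed sample with repeats

bad_chars = set("XN")  # characters that disqualify a codon

# Check each distinct codon only once; isdisjoint() is True when the
# codon shares no character with bad_chars.
valid = {c for c in set(codon) if bad_chars.isdisjoint(c)}

# Filtering the full list is then a cheap set lookup per element.
clean = [c for c in codon if c in valid]
print(clean)  # ['AAT', 'TTA', 'AAT']
```

Whether this beats the plain list comprehension depends on how much repetition the data has, so it is worth timing on the real input.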
And if you have many characters other than X and N, you can do it like this:
>>> [x for x in ['AAT','XAC','ANT','TTA'] if not any(ch for ch in list(x) if ch in ["X","N","Y","Z","K","J"])]
['AAT', 'TTA']
NOTE: list(x) can be just x, and ["X","N","Y","Z","K","J"] can be just "XNYZKJ". Also refer to gnibbler's answer; he did the best one.
Another way, not the fastest, but I think it reads nicely:
>>> [x for x in ['AAT','XAC','ANT','TTA'] if not any(y in x for y in "XN")]
['AAT', 'TTA']
>>> [x for x in ['AAT','XAC','ANT','TTA'] if not set("XN")&set(x)]
['AAT', 'TTA']
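This way will be faster for long codons (assuming there is some repetition):
codon = ['AAT','XAC','ANT','TTA']
def pred(s, memo={}):
    # memo is a deliberately shared default dict that caches the result per codon
    if s not in memo:
        memo[s] = not any(y in s for y in "XN")
    return memo[s]
# list(...) is needed on Python 3, where filter returns an iterator
print(list(filter(pred, codon)))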
Here is the method suggested by James Brooks. You'd have to test to see which is faster for your data:
codon = ['AAT','XAC','ANT','TTA']
def pred(s, memo={}):
    if s not in memo:
        # True when the codon shares no character with "XN"
        memo[s] = not set("XN") & set(s)
    return memo[s]
print(list(filter(pred, codon)))
For this sample codon, the version using sets is about 10% slower.
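To compare the two memoized predicates on your own data, something like the following timeit sketch could be used. The repeated sample list, the names pred_any and pred_set, and the repetition count are assumptions for illustration; timings will vary by machine and input, so no figures are shown.

```python
import timeit

# Assumed test data: the sample codons repeated; substitute real data here.
codon = ['AAT', 'XAC', 'ANT', 'TTA'] * 1000

def pred_any(s, memo={}):
    # Memoized check using any() over the unwanted characters.
    if s not in memo:
        memo[s] = not any(y in s for y in "XN")
    return memo[s]

def pred_set(s, memo={}):
    # Memoized check using set intersection.
    if s not in memo:
        memo[s] = not set("XN") & set(s)
    return memo[s]

# Both predicates should select the same codons.
result_any = list(filter(pred_any, codon))
result_set = list(filter(pred_set, codon))

# The caches are already warm after the calls above, so with this highly
# repetitive sample the timings mostly measure dictionary lookups.
t_any = timeit.timeit(lambda: list(filter(pred_any, codon)), number=100)
t_set = timeit.timeit(lambda: list(filter(pred_set, codon)), number=100)
print("any():", t_any, "set:", t_set)
```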
There is also the option of using filter (wrapped in list() because filter returns an iterator on Python 3):
    lst = list(filter(lambda x: 'X' not in x and 'N' not in x, codon))