I needed to remove the characters in string1
which are present in string2
. Here string1
and string2
have only the lower case characters a-z with given condition that the length of string1
will be greater every time.
I was using the in
operator:
def removeChars (string1, string2):
for char in string2:
if char in string1:
string1 = string1.replace(char, '')
return string1
But I read one answer on Stack Overflow which says:
For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression
x in y
is equivalent toany(x is e or x == e for e in y)
.
Which means the in
operator is using a for
loop behind the scenes.
So my question is, in the for
loop in my code, should I consider that nested for
loops are being used, as the in
operator is using a for
loop in the background? If yes, what would be the time complexity of this program?
in
does not necessarily use loops behind the scenes. For example:
r = range(100000000000)
print(333 in r) # prints True immediately without looping
If you were to loop r
it will take quite a long time, so clearly that doesn't happen.
in
basically calls (behind the scenes) the object's __contains__
method. For some iterators it will in fact "loop" through everything but that's not always the case.
This example is basically the same as calling:
r.__contains__(333)
As pointed out in comments - the str
objects specifically have a smarter algorithm than just plain loops, as you can see here
Also see example answer here
And see the documentation here
Because the real world scenario would probably mean that string1
can be arbitrarily long, but the characters to be removed would be a finite and small set, it will probably be much more efficient to just add up all the characters that aren't in string2
. Something like this:
def removeChars (string1, string2):
result = ''
for char in string1:
if char not in string2:
result += char
return result
This will involve looping over string1
just once, but multiple checks against string2
using in
. This can be further simplified (to avoid +=
loop over the result):
def removeChars (string1, string2):
return ''.join(char for char in string1 if char not in string2)
When in
is equivalent to a loop, it has to 'walk' the iterator and compare each item in turn.
This means each loop is O(n)
, so for 2 levels this is O(n²)
https://wiki.python.org/moin/TimeComplexity
Note that you actually have 3 loops here - since your replace
will also walk through the string.
Since replace doens't raise any errors if char
isn't found, it is simpler to just do this without first testing char in string1
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With