I have two very large lists, and looping through one of them once takes at least a second; I need to do this 200,000 times. What's the fastest way to remove duplicates across the two lists and combine them into one?
This is the fastest way I can think of:
import itertools
# chain both lists together, then let set() discard the duplicates
output_list = list(set(itertools.chain(first_list, second_list)))
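As an aside (not part of the original answer, just an equivalent sketch using the same placeholder names), a plain set union performs the same deduplication without itertools, since set.union accepts any iterable:
# set.union takes any iterable, so only the first list
# needs to be built into a set up front
output_list = list(set(first_list).union(second_list))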
Slight update: As jcd points out, depending on your application, you probably don't need to convert the result back to a list. Since a set is itself iterable, you may be able to use it directly:
output_set = set(itertools.chain(first_list, second_list))
for item in output_set:
    # do something with item
Beware, though, that any solution involving set() will discard the original ordering: sets are unordered, so there is no guarantee that the elements will come out in any particular order. That said, since you're combining two lists, it's hard to come up with a good reason why you would need a particular ordering over them anyway, so this is probably not something you need to worry about.
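If you do ever need to keep elements in the order they first appear, one common alternative (a sketch, not something the original answer uses) is to deduplicate with dict.fromkeys, which preserves insertion order in Python 3.7+:
import itertools
# dict keys are unique and remember insertion order, so this drops
# duplicates while keeping the order in which items first appear
output_list = list(dict.fromkeys(itertools.chain(first_list, second_list)))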