Problem statement:
I am working on a problem where I have a database holding a huge list of files from the filesystem. If a bunch of files are deleted from the filesystem, the database should be updated to match.
Approach:
Query the list of files from the db and the list of files from the filesystem. Then check whether each file from the db is in the filesystem list, and delete it from the db if it is not found. To avoid repeatedly searching the list for each file, I am planning to use sets in Python and the difference_update() method.
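The approach above can be sketched roughly as follows. This is only an illustration: the function name find_stale_paths and the sample paths are made up, and the actual db query and filesystem walk are left out.

```python
def find_stale_paths(db_paths, fs_paths):
    """Return the paths recorded in the db that are no longer on disk."""
    stale = set(db_paths)              # copy the db paths into a set
    stale.difference_update(fs_paths)  # drop every path that still exists
    return stale


# Hypothetical sample data standing in for the two real listings.
db_paths = ["/data/a.txt", "/data/b.txt", "/data/c.txt"]
fs_paths = ["/data/a.txt", "/data/c.txt"]

stale = find_stale_paths(db_paths, fs_paths)
print(stale)
```

Whatever is left in the returned set is what would need to be deleted from the db.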
Question:
Internally, will this again have a complexity of O(m × n), like the other approach of repeated searching, or is it optimized to reduce the complexity?
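It's going to be O(len(t)) for s.difference_update(t), where t is the collection being subtracted,
as stated in the comment, because set lookups take constant time on average. So it does not degrade to O(m × n).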
Also confirmed in http://python-reference.readthedocs.org/en/latest/docs/sets/difference_update.html
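To see where the O(len(t)) comes from, here is a rough hand-written equivalent. It is a conceptual sketch, not the actual CPython source, and the helper name difference_update_by_hand is made up for illustration: one pass over t, with an average constant-time removal from s for each item.

```python
def difference_update_by_hand(s, t):
    """Remove from set s every element found in iterable t."""
    for item in t:       # len(t) iterations
        s.discard(item)  # average O(1) hash lookup; no error if absent


a = {1, 2, 3, 4}
b = {1, 2, 3, 4}
t = [2, 4, 5]

a.difference_update(t)           # built-in method
difference_update_by_hand(b, t)  # hand-written loop

print(a == b)
```

Note that building the sets in the first place is still O(m + n), so the whole comparison is linear in the two listings rather than quadratic.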