If I have a table of, say, blog posts, with columns such as post_id and author_id, and I used the SQL "SELECT * FROM post_table where author_id = 34", what would be the computational complexity of that query? Would it simply look through each row and check if it has the correct author id, O(n), or does it do something more efficient?
I ask because I'm in a situation where I could either search an SQL database with this data, or load an XML file with a list of posts and search through those, and I was wondering which would be faster.
The easiest way to make queries less complex is to eliminate OR clauses whenever possible. Because OR is inclusive, SQL Server has to evaluate each branch of the clause separately, which can prevent a single index seek and really slows down operations.
Data complexity is the complexity of evaluating a fixed query on a database instance, expressed as a function of the size of the database. The other measure, called combined complexity, treats both the query and the database as inputs.
As your queries are complex, SQL is the way to go. MongoDB is what's known as a NoSQL database: it is very fast, but it sacrifices robustness and offers weak relational guarantees. By contrast, SQL databases are resilient to outages and guarantee consistency across queries.
There are two basic ways that such a simple query would be executed.
The first is to do a full table scan. This would have O(n) performance.
The second is to look up the value in an index, then load the matching page and return the results. The index lookup should be O(log n); loading the page should be O(1).
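You can see the difference between the two paths empirically. Here is a sketch using Python's built-in sqlite3 module, with the table and query from the question (the index name idx_author and the row counts are made up for illustration):

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post_table (post_id INTEGER PRIMARY KEY, author_id INTEGER)")
# 100,000 posts spread across 1,000 authors, so author 34 owns 100 rows.
conn.executemany("INSERT INTO post_table (author_id) VALUES (?)",
                 [(i % 1000,) for i in range(100_000)])

query = "SELECT * FROM post_table WHERE author_id = 34"

# Without an index the engine must check every row: a full table scan, O(n).
scan_time = timeit.timeit(lambda: conn.execute(query).fetchall(), number=20)

# A B-tree index on author_id turns the same query into an O(log n) seek.
conn.execute("CREATE INDEX idx_author ON post_table (author_id)")
seek_time = timeit.timeit(lambda: conn.execute(query).fetchall(), number=20)

print(f"full scan: {scan_time:.4f}s   indexed: {seek_time:.4f}s")
```

On any reasonable table size the indexed run is dramatically faster, which is exactly the O(n) versus O(log n) gap described above.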
With a more complicated query, it would be hard to make such a general statement. But any SQL engine is generally going to take one of these two paths. Oh, there is a third option if the table is partitioned on author_id, but you are probably not interested in that.
That said, the power of a database is not in these details. It is in the management of memory. The database will cache the data and index in memory, so you do not have to re-read data pages. The database will take advantage of multiple processors and multiple disks, so you do not have to code this. The database keeps everything consistent, in the face of updates and deletes.
As for your specific question: if the data is already in the database, search it there. Exporting everything to an XML file and then searching it in memory adds a lot of overhead. You would only want to do that if the connection to your database is slow and you are running many such queries.
Have a look at the EXPLAIN command. It shows you what the database actually does when executing a given SELECT query.