Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Computational Complexity of SQL Query

If I have a table of, say, blog posts, with columns such as post_id and author_id, and I used the SQL "SELECT * FROM post_table where author_id = 34", what would be the computational complexity of that query? Would it simply look through each row and check if it has the correct author id, O(n), or does it do something more efficient?

I was just wondering because I'm in a situation where I could either search an SQL database with this data, or load an xml file with a list of posts, and search through those, I was wondering which would be faster.

like image 870
GoogieK Avatar asked Dec 30 '12 19:12

GoogieK


People also ask

Which SQL clause is used to reduce the complexity of the query?

The easiest way to make queries less complex is by eliminating OR clauses whenever possible. Because OR is inclusive, SQL Server has to process each component of the clause separately, which really slows down operations.

What is complexity in database?

Data complexity is the complexity of evaluating a query on a database instance, when the query is fixed, and we express the complexity as a function of the size of. the database. The other, called combined complexity, considers both the query and.

Why is SQL better for complex queries?

As your queries are complex, SQL is the way to go. MongoDB is what's known as a NoSQL database. It is very fast, however it sacrifices robustness and has weak relational guarantees. Contrasting, SQL is resistant to outages and guarantees consistency between queries.


2 Answers

There are two basic ways that such a simple query would be executed.

The first is to do a full table scan. This would have O(n) performance.

The second is a to look up the value in an index, then load the page, and return the results. The index scan should be O(log(n)). Loading the page should be O(1).

With a more complicated query, it would be hard to make such a general statement. But any SQL engine is generally going to take one of these two paths. Oh, there is a third option if the table is partitioned on author_id, but you are probably not interested in that.

That said, the power of a database is not in these details. It is in the management of memory. The database will cache the data and index in memory, so you do not have to re-read data pages. The database will take advantage of multiple processors and multiple disks, so you do not have to code this. The database keeps everything consistent, in the face of updates and deletes.

As for your specific question. If the data is in the database, search it there. Loading all the data into an xml file and then doing the search in memory requires a lot of overhead. You would only want to do that if the connection to your database is slow and you are doing many such queries.

like image 82
Gordon Linoff Avatar answered Oct 17 '22 20:10

Gordon Linoff


Have a look at the EXPLAIN command. It shows you what the database actually does when executing a given SELECT query.

like image 42
Tomas Andrle Avatar answered Oct 17 '22 19:10

Tomas Andrle