Let's suppose I have a table in my database with <code>1.000.000</code> records. If I execute: <pre class="prettyprint"><code>SELECT * FROM [Table] LIMIT 1000 </code></pre> Will this query take the same time as if I have that table with <code>1000</code> records and just do: <pre class="prettyprint"><code>SELECT * FROM [Table] </code></pre> ? I'm not looking for if it will take exactly the same time. I just want to know if the first one will take much more time to execute than the second one. I said <code>1.000.000</code> records, but it could be <code>20.000.000</code>. That was just an example. Edit: Of course that when using LIMIT and without using it in the same table, the query built using LIMIT should be executed faster, but I'm not asking that... To make it generic: <code>Table1</code>: <code>X</code> records <code>Table2</code>: <code>Y</code> records <code>(X << Y)</code> What I want to compare is: <code>SELECT * FROM Table1</code> and <code>SELECT * FROM Table2 LIMIT X</code> Edit 2: Here is why I'm asking this: I have a database, with 5 tables and relationships between some of them. One of those tables will (I'm 100% sure) contain about <code>5.000.000</code> records. I'm using SQL Server CE 3.5, Entity Framework as the ORM and LINQ to SQL to make the queries. I need to perform basically three kind of non-simple queries, and I was thinking about showing to the user a limit of records (just like lot of websites do). If the user wants to see more records, the option he/she has is to restrict more the search. So, the question came up because I was thinking about doing this (limiting to <code>X</code> records per query) or if storing in the database only <code>X</code> results (the recent ones), which will require to do some deletions in the database, but I was just thinking... So, that table could contain <code>5.000.000</code> records or more, and what I don't want is to show the user <code>1000</code> or so, and even like this, the query still be as slow as if it would be returning the <code>5.000.000</code> rows.

<code>TAKE 1000</code> from a table of 1000000 records - will be 1000000/1000 (<code>= 1000</code>) times faster because it only needs to look at (and return) 1000/1000000 records. Since it does less, it is naturally faster. The result will be pretty (pseudo-)random, since you haven't specified any order in which to TAKE. However, if you do introduce an order, then one of two below becomes true: <ol> <li>The ORDER BY clause follows an index - the above statement is still true.</li> <li>The ORDER BY clause cannot use any index - it will be only marginally faster than without the TAKE, because <ul> <li>it has to inspect ALL records, and sort by <code>ORDER BY</code> </li> <li>deliver only a subset (TAKE count)</li> <li>so it is not faster in the first step, but the 2nd step involves less IO/network than ALL records</li> </ul> </li> </ol> If you TAKE 1000 records from a table of 1000 records, it will be equivalent (with little significant differences) to TAKE 1000 records from 1 billion, as long as you are following the case of (1) no order by, or (2) order by against an index

Assuming both tables are equivalent in terms of index, row-sizing and other structures. Also assuming that you are running that simple SELECT statement. If you have an <code>ORDER BY</code> clause in your SQL statements, then obviously the larger table will be slower. I suppose you're not asking that. If X = Y, then obviously they should run in similar speed, since the query engine will be going through the records in exactly the same order -- basically a table scan -- for this simple SELECT statement. There will be no difference in query plan. If Y > X only by a little bit, then also similar speed. However, if Y >> X (meaning Y has many many more rows than X), then the LIMIT version MAY be slower. Not because of query plan -- again should be the same -- but simply because that the internal structure of data layout may have several more levels. For example, if data is stored as leafs on a tree, there may be more tree levels, so it may take slightly more time to access the same number of pages. In other words, 1000 rows may be stored in 1 tree level in 10 pages, say. 1000000 rows may be stored in 3-4 tree levels in 10000 pages. Even when taking only 10 pages from those 10000 pages, the storage engine still has to go through 3-4 tree levels, which may take slightly longer. Now, if the storage engine stores data pages sequentially or as a linked list, say, then there will be no difference in execution speed.

How much is performance improved when using LIMIT in a SQL sentence?

Q: Does limit in SQL improve performance?

Yes, you will notice a performance difference when dealing with the data. One record takes up less space than multiple records.

Q: What does the limit clause do in SQL?

The LIMIT clause is used to specify the number of records to return. The LIMIT clause is useful on large tables with thousands of records. Returning a large number of records can impact performance.

Tags:

sql

database

take

Let's suppose I have a table in my database with 1.000.000 records.

If I execute:

SELECT * FROM [Table] LIMIT 1000

Will this query take the same time as if I have that table with 1000 records and just do:

SELECT * FROM [Table]

I'm not looking for if it will take exactly the same time. I just want to know if the first one will take much more time to execute than the second one.

I said 1.000.000 records, but it could be 20.000.000. That was just an example.

Edit:
Of course that when using LIMIT and without using it in the same table, the query built using LIMIT should be executed faster, but I'm not asking that...

To make it generic:

Table1: X records
Table2: Y records

(X << Y)

What I want to compare is:

SELECT * FROM Table1

and

SELECT * FROM Table2 LIMIT X

Edit 2:
Here is why I'm asking this:

I have a database, with 5 tables and relationships between some of them. One of those tables will (I'm 100% sure) contain about 5.000.000 records. I'm using SQL Server CE 3.5, Entity Framework as the ORM and LINQ to SQL to make the queries.

I need to perform basically three kind of non-simple queries, and I was thinking about showing to the user a limit of records (just like lot of websites do). If the user wants to see more records, the option he/she has is to restrict more the search.

So, the question came up because I was thinking about doing this (limiting to X records per query) or if storing in the database only X results (the recent ones), which will require to do some deletions in the database, but I was just thinking...

So, that table could contain 5.000.000 records or more, and what I don't want is to show the user 1000 or so, and even like this, the query still be as slow as if it would be returning the 5.000.000 rows.

301

asked Apr 19 '11 04:04

Oscar Mederos

2 Answers

TAKE 1000 from a table of 1000000 records - will be 1000000/1000 (= 1000) times faster because it only needs to look at (and return) 1000/1000000 records. Since it does less, it is naturally faster.

The result will be pretty (pseudo-)random, since you haven't specified any order in which to TAKE. However, if you do introduce an order, then one of two below becomes true:

The ORDER BY clause follows an index - the above statement is still true.
The ORDER BY clause cannot use any index - it will be only marginally faster than without the TAKE, because
- it has to inspect ALL records, and sort by ORDER BY
- deliver only a subset (TAKE count)
- so it is not faster in the first step, but the 2nd step involves less IO/network than ALL records

If you TAKE 1000 records from a table of 1000 records, it will be equivalent (with little significant differences) to TAKE 1000 records from 1 billion, as long as you are following the case of (1) no order by, or (2) order by against an index

110

answered Sep 29 '22 12:09

RichardTheKiwi

Assuming both tables are equivalent in terms of index, row-sizing and other structures. Also assuming that you are running that simple SELECT statement. If you have an ORDER BY clause in your SQL statements, then obviously the larger table will be slower. I suppose you're not asking that.

If X = Y, then obviously they should run in similar speed, since the query engine will be going through the records in exactly the same order -- basically a table scan -- for this simple SELECT statement. There will be no difference in query plan.

If Y > X only by a little bit, then also similar speed.

However, if Y >> X (meaning Y has many many more rows than X), then the LIMIT version MAY be slower. Not because of query plan -- again should be the same -- but simply because that the internal structure of data layout may have several more levels. For example, if data is stored as leafs on a tree, there may be more tree levels, so it may take slightly more time to access the same number of pages.

In other words, 1000 rows may be stored in 1 tree level in 10 pages, say. 1000000 rows may be stored in 3-4 tree levels in 10000 pages. Even when taking only 10 pages from those 10000 pages, the storage engine still has to go through 3-4 tree levels, which may take slightly longer.

Now, if the storage engine stores data pages sequentially or as a linked list, say, then there will be no difference in execution speed.

answered Sep 29 '22 11:09

Stephen Chung

Related questions
                            
                                Which language to use for scripting PostgreSQL? [closed]
                            
                                Best practice for archiving a huge table of over 1,000,000,000 rows
                            
                                Help with SQL statement (JOIN)
                            
                                What is so bad about using SQL INNER JOIN
                            
                                Excluding records based upon a one to many SQL join
                            
                                How do you escape double quotes inside a SQL fulltext 'contains' function?
                            
                                How to select the first continuous group of rows using Oracle SQL
                            
                                Restore SQL Server DB direct from another DB
                            
                                Apply a Mask to Format a String in SQL Server Query/View
                            
                                SQL Caching and Entity Framework
                            
                                SqlDataAdapter Output Variable Question C#
                            
                                Does adding a unique constraint slow down things?
                            
                                stored procedure returns nothing
                            
                                Microsoft LogParser: How to use parameters in a file
                            
                                Write INSERT statements with values next to column names?
                            
                                Calling table-valued-function for each result in query
                            
                                Where temp tables are located?
                            
                                MySQL: Smallest datatype for one bit
                            
                                MySQL - Save list order with single update
                            
                                Is there an alternative to the MULTISET operator that avoids sub-queries?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With