Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

select * vs select column

People also ask

What is difference between select * and select column name?

The reason I use SELECT COLUMN_NAMES is when using Stored Procedure, adding columns to the table will not screw your application. select * will give additional column (which you've just added to the table) and application will get additional column and may raise error.

Is Select * faster than select column?

Selecting distinct and less than all columns will always be faster than selecting *.

What is the difference between select and select *?

4 Answers. Show activity on this post. select * will return values for all the columns in the table that have rows that match your predicate, while select column_name will only return the values in the column_name column for the rows that match your predicate.

Is Select * Same as select?

SELECT ALL means ALL rows, i.e including duplicate rows. (The opposite is SELECT DISTINCT , where duplicate rows are removed.) ALL is the default, and most people write just SELECT instead of SELECT ALL . SELECT * means all columns.


There are several reasons you should never (never ever) use SELECT * in production code:

  • since you're not giving your database any hints as to what you want, it will first need to check the table's definition in order to determine the columns on that table. That lookup will cost some time - not much in a single query - but it adds up over time

  • if you need only 2/3 of the columns, you're selecting 1/3 too much data which needs to be retrieving from disk and sent across the network

  • if you start to rely on certain aspects of the data, e.g. the order of the columns returned, you could get a nasty surprise once the table is reorganized and new columns are added (or existing ones removed)

  • in SQL Server (not sure about other databases), if you need a subset of columns, there's always a chance a non-clustered index might be covering that request (contain all columns needed). With a SELECT *, you're giving up on that possibility right from the get-go. In this particular case, the data would be retrieved from the index pages (if those contain all the necessary columns) and thus disk I/O and memory overhead would be much less compared to doing a SELECT *.... query.

Yes, it takes a bit more typing initially (tools like SQL Prompt for SQL Server will even help you there) - but this is really one case where there's a rule without any exception: do not ever use SELECT * in your production code. EVER.


It always pulls a tuple (except in cases where the table has been vertically segmented - broken up into columns pieces), so, to answer the question you asked, it doesn't matter from a performance perspective. However, for many other reasons, (below) you should always select specifically those columns you want, by name.

It always pulls a tuple, because (in every vendors RDBMS I am familiar with), the underlying on-disk storage structure for everything (including table data) is based on defined I/O Pages (in SQL Server for e.g., each Page is 8 kilobytes). And every I/O read or write is by Page.. I.e., every write or read is a complete Page of data.

Because of this underlying structural constraint, a consequence is that Each row of data in a database must always be on one and only one page. It cannot span multiple Pages of data (except for special things like blobs, where the actual blob data is stored in separate Page-chunks, and the actual table row column then only gets a pointer...). But these exceptions are just that, exceptions, and generally do not apply except in special cases ( for special types of data, or certain optimizations for special circumstances)
Even in these special cases, generally, the actual table row of data itself (which contains the pointer to the actual data for the Blob, or whatever), it must be stored on a single IO Page...

EXCEPTION. The only place where Select * is OK, is in the sub-query after an Exists or Not Exists predicate clause, as in:

   Select colA, colB
   From table1 t1
   Where Exists (Select * From Table2
                 Where column = t1.colA)

EDIT: To address @Mike Sherer comment, Yes it is true, both technically, with a bit of definition for your special case, and aesthetically. First, even when the set of columns requested are a subset of those stored in some index, the query processor must fetch every column stored in that index, not just the ones requested, for the same reasons - ALL I/O must be done in pages, and index data is stored in IO Pages just like table data. So if you define "tuple" for an index page as the set of columns stored in the index, the statement is still true.
and the statement is true aesthetically because the point is that it fetches data based on what is stored in the I/O page, not on what you ask for, and this true whether you are accessing the base table I/O Page or an index I/O Page.

For other reasons not to use Select *, see Why is SELECT * considered harmful? :


You should always only select the columns that you actually need. It is never less efficient to select less instead of more, and you also run into fewer unexpected side effects - like accessing your result columns on client side by index, then having those indexes become incorrect by adding a new column to the table.

[edit]: Meant accessing. Stupid brain still waking up.


Unless you're storing large blobs, performance isn't a concern. The big reason not to use SELECT * is that if you're using returned rows as tuples, the columns come back in whatever order the schema happens to specify, and if that changes you will have to fix all your code.

On the other hand, if you use dictionary-style access then it doesn't matter what order the columns come back in because you are always accessing them by name.


This immediately makes me think of a table I was using which contained a column of type blob; it usually contained a JPEG image, a few Mbs in size.

Needless to say I didn't SELECT that column unless I really needed it. Having that data floating around - especially when I selected mulitple rows - was just a hassle.

However, I will admit that I otherwise usually query for all the columns in a table.