Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle row count of table by count(*) vs NUM_ROWS from DBA_TABLES

Tags:

sql

count

oracle

Looks like count(*) is slower than NUM_ROWS. Can experts in this area throw some light on this.

like image 986
Srujan Kumar Gulla Avatar asked Jan 02 '13 16:01

Srujan Kumar Gulla


People also ask

Can DBA's accidentally change the number of rows in a table?

Help Wanted! One of the most common problems (prior to Oracle 10g) was that Oracle DBA's would accidentally change the execution behavior of their SQL by changing the num_rows for a table. Sometimes a DBA wants to know the row count, and the data dictionary view dba_tables has a column named num_rows.

How to count rows in a table in SQL?

Row count at SQL execution time: The "real" current row count, which requires that you actually issue SQL to count the rows in all of the tables (time consuming). To count all of the rows in real time, a simple SQL*Plus script will suffice:

Does DBMS_STATS contain the number of rows in the table?

So it does not contain the current number of rows in the table but an approximation calculated the last time DBMS_STATS ran.

How do I Count rows in a schema?

Row count at SQL execution time: The "real" current row count, which requires that you actually issue SQL to count the rows in all of the tables (time consuming). To count all of the rows in real time, a simple SQL*Plus script will suffice: Here is a PL/SQL approach for counting all tables in a schema. Here, you first vonnect as the schema owner:


1 Answers

According to the documentation NUM_ROWS is the "Number of rows in the table", so I can see how this might be confusing. There, however, is a major difference between these two methods.

This query selects the number of rows in MY_TABLE from a system view. This is data that Oracle has previously collected and stored.

select num_rows from all_tables where table_name = 'MY_TABLE' 

This query counts the current number of rows in MY_TABLE

select count(*) from my_table 

By definition they are difference pieces of data. There are two additional pieces of information you need about NUM_ROWS.

  1. In the documentation there's an asterisk by the column name, which leads to this note:

    Columns marked with an asterisk (*) are populated only if you collect statistics on the table with the ANALYZE statement or the DBMS_STATS package.

    This means that unless you have gathered statistics on the table then this column will not have any data.

  2. Statistics gathered in 11g+ with the default estimate_percent, or with a 100% estimate, will return an accurate number for that point in time. But statistics gathered before 11g, or with a custom estimate_percent less than 100%, uses dynamic sampling and may be incorrect. If you gather 99.999% a single row may be missed, which in turn means that the answer you get is incorrect.

If your table is never updated then it is certainly possible to use ALL_TABLES.NUM_ROWS to find out the number of rows in a table. However, and it's a big however, if any process inserts or deletes rows from your table it will be at best a good approximation and depending on whether your database gathers statistics automatically could be horribly wrong.

Generally speaking, it is always better to actually count the number of rows in the table rather then relying on the system tables.

like image 197
Ben Avatar answered Sep 21 '22 03:09

Ben