I'm new to Oracle and I have to fight this problem: I have a table with approximately 520 million rows, and I have to fetch them all and import them (denormalizing) into a NoSQL db.
The table has two integer fields, C_ID and A_ID, and 3 indexes: one on C_ID, one on A_ID and one on both fields.
At the beginning I tried this:
SELECT C_ID, A_ID FROM M_TABLE;
and it never gave me any result in a reasonable time (I couldn't measure the time because it never seemed to complete).
So I changed the query to paginate:
SELECT /*+ ALL_ROWS */ C_ID, A_ID
FROM (SELECT rownum rn, C_ID, A_ID
      FROM M_TABLE
      WHERE rownum < ((:1 * :2) + 1))
WHERE rn >= (((:1 - 1) * :2) + 1);
I run this query in parallel from 3 threads, paginating with a page size of 1000.
I tried to introduce three optimizations:
1) I created statistics on the table (see the DBMS_STATS note after this list):
ANALYZE TABLE M_TABLE ESTIMATE STATISTICS SAMPLE 5 PERCENT;
2) I partitioned the table into 8 partitions.
3) I created the table with the parallel option.
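For reference, ANALYZE ... ESTIMATE STATISTICS is deprecated for gathering optimizer statistics; a DBMS_STATS equivalent would be something like this (a sketch, assuming M_TABLE lives in the current schema):

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,       -- assumes M_TABLE is in the current schema
    tabname          => 'M_TABLE',
    estimate_percent => 5);         -- same 5 percent sample as the ANALYZE above
END;
/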
Now I am able to fetch 10,000 rows per second, so the whole process takes about 15 hours to complete (the DB is running on a 4-core, 8 GB machine).
The problem is that I need to complete it all in 5 hours at most.
I am out of ideas, so before I ask for a new machine: do you know of any way to improve performance in such a situation?
Steps to take to improve performance of queries:
- Create all primary and foreign keys and relationships among tables.
- Avoid SELECT *; list only the needed columns and narrow the result set as much as possible.
- Implement queries as stored procedures.
- Have a WHERE clause in all SELECT queries.
What do you do with your result? Is it fetched directly to a file with PL/SQL, or do you use another application to process the data? Is it sent across the network? (This might be the low-hanging fruit.)
The reason I ask is that usually a FULL SCAN (without an ORDER BY) will return the first rows instantly. If you're outputting the result to a file you should see it start to fill up immediately. If you do not, this means that there is a lot of empty space at the beginning of the segment, which could explain why the query never returns (in a reasonable time at least).
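For instance, dumping straight to a file from SQL*Plus should start writing right away (a sketch; the output path is hypothetical, and ARRAYSIZE only reduces fetch round trips):

SET ARRAYSIZE 5000
SET TERMOUT OFF
SET FEEDBACK OFF
SPOOL /tmp/m_table.txt
SELECT C_ID, A_ID FROM M_TABLE;
SPOOL OFF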
So when you say that your query doesn't return, I'm a bit concerned: how can you tell? Does the following block return?
DECLARE
  l NUMBER := 0;
BEGIN
  FOR cc IN (SELECT C_ID, A_ID FROM M_TABLE) LOOP
    l := l + 1;
    EXIT WHEN l >= 100000;
  END LOOP;
END;
/
If it does, it means that your FULL SCAN is being processed. By timing the above block you should be able to estimate how much time a complete single scan would need, assuming the segment is uniformly dense.
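For example, a rough timing of the first 100,000 rows (a sketch; DBMS_UTILITY.GET_TIME returns hundredths of a second):

SET SERVEROUTPUT ON

DECLARE
  l  NUMBER := 0;
  t0 NUMBER := DBMS_UTILITY.GET_TIME;  -- hundredths of a second
BEGIN
  FOR cc IN (SELECT C_ID, A_ID FROM M_TABLE) LOOP
    l := l + 1;
    EXIT WHEN l >= 100000;
  END LOOP;
  -- Scale the elapsed time by 520,000,000 / 100,000 to estimate a full scan.
  DBMS_OUTPUT.PUT_LINE('Elapsed: ' || (DBMS_UTILITY.GET_TIME - t0) / 100 || ' s');
END;
/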
Reading 500M rows is a lot of work, but the rows are tiny, so if the table segment is well compacted Oracle should return all rows in a reasonable time. Table segments can end up with inefficient space usage if, for example, they are repeatedly deleted from and then reloaded with INSERT /*+ APPEND */. Rebuilding the table (ALTER TABLE ... MOVE) will remove all the useless empty space in the segment. By the way, when you partitioned the table you did rebuild it, so this may be the reason why your query now returns!
In any case I would advise you to retry the FULL TABLE SCAN, possibly after having rebuilt the table to reset any empty space and the high water mark. A single FULL TABLE SCAN is by far the most reliable method (and one of the most efficient) to access lots of data.
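The rebuild could look like this (a sketch; MOVE leaves the table's indexes UNUSABLE, so they must be rebuilt afterwards, and a partitioned table is moved partition by partition):

-- Compact the segment and reset the high water mark
ALTER TABLE M_TABLE MOVE;

-- MOVE marks the indexes UNUSABLE; find and rebuild them
SELECT index_name, status FROM user_indexes WHERE table_name = 'M_TABLE';

ALTER INDEX M_TABLE_IDX1 REBUILD;  -- hypothetical index name, repeat for each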
If you need to further improve performance, I suggest you take a look at ROWID partitioning (a DIY parallel-processing scheme) or the built-in package DBMS_PARALLEL_EXECUTE.
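With DBMS_PARALLEL_EXECUTE (available from 11g), chunking the table by ROWID might look like this (a minimal sketch; the task name is arbitrary):

BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'export_m_table');
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'export_m_table',
    table_owner => USER,        -- assumes M_TABLE is in the current schema
    table_name  => 'M_TABLE',
    by_row      => TRUE,        -- chunk_size counts rows rather than blocks
    chunk_size  => 100000);
END;
/

Each worker thread then picks a chunk (the boundaries are visible in USER_PARALLEL_EXECUTE_CHUNKS) and fetches its range with:

SELECT C_ID, A_ID FROM M_TABLE WHERE rowid BETWEEN :start_rowid AND :end_rowid;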
Oracle is pretty good at telling us where it spent its time. You can find out by tracing your session using Oracle's extended SQL trace (in other words, a 10046 trace). Your query is extracting data from one table that has a lot of data, so check your I/O rate: 'db file scattered read' is probably one of the top wait events of your query.
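Enabling the trace for your session looks like this (level 8 includes wait events; on 11g the trace file location can be read from V$DIAG_INFO):

ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';

-- ... run the query ...

ALTER SESSION SET EVENTS '10046 trace name context off';

SELECT value FROM v$diag_info WHERE name = 'Default Trace File';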
Hope it helps.
It might be a bit of a drastic solution to try, but you could look at table compression. In Oracle 10g this is only really useful for read-only tables, since the blocks are uncompressed when write operations are done. I've found compression to be useful for large tables in a data warehousing environment.
It is also possible to compress only certain partitions, so if you are adding data to the end of a table that is partitioned by date, you could compress the historical partitions while leaving the most recent one uncompressed.
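For example (a sketch with a hypothetical partition name; local index partitions on the moved partition become unusable and need a rebuild):

ALTER TABLE M_TABLE MOVE PARTITION P_2010 COMPRESS;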
The advantage of table compression is that it reduces the amount of I/O required, which can help on an I/O-constrained system. I was often getting 10:1 compression out of tables, although it depends on what is stored in the table and on the sort order used when inserting the data.
For an existing table I think you can use the following command (note that it is MOVE COMPRESS, not COMPRESS MOVE):
ALTER TABLE M_TABLE MOVE COMPRESS;
Note that this may help to solve your problem, but changing the underlying structure of the table might be a little drastic. Also, rebuilding the table as compressed invalidates its indexes, which then have to be rebuilt.
Under Oracle 11g you can also use Advanced Compression, which allows updates to the data, but this involves expensive licensing costs.
There is more information in the Oracle documentation on table compression.
Yes, as said by user2033072, you should use SQL Trace and TKPROF to learn a bit more about the query; see the official documentation. Also, more simply, you could use EXPLAIN PLAN; that way Oracle will show what it is planning to do.
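A minimal sketch of that:

EXPLAIN PLAN FOR
  SELECT C_ID, A_ID FROM M_TABLE;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);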