Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

FORALL vs FOR bulk update

This is asked in an interview. The question is, which one is faster, FORALL on 5000 or using FOR on 500 records? Well, I think that indeed FORALL is faster as far as processing speed is concerned but still it depends on number of rows to be processed, like in the case of above question. Please share your thoughts.

like image 874
Dpanshu Avatar asked Dec 11 '22 22:12

Dpanshu


1 Answers

It depends.

First, how is the test set up? In normal code, your for loop runs a query and does something so if you're measuring the performance of the loop, you're combining the time required to run the query, fetch the results, and do something with the results.

for x in (<<some select statement>>)
loop
  <<do something with x>>
end loop;

A forall on the other hand, implies that you've already done the work of fetching the data you need to process into a local collection.

forall i in 1..l_collection.count
  <<do something>>

A performance comparison that includes the time required to run the query and fetch the results in one case and excludes that time from the other case is not a fair comparison. You'd either need to include the cost of populating the collection in the forall case or you would need the for loop to iterate over an already populated collection for the comparison to be a fair one. Which approach you use may have a dramatic affect on the result.

A forall eliminates context shifts between the SQL and PL/SQL engines, that's it. That can be very useful if a substantial fraction of the program's runtime is spent on context shifts. The more expensive the thing you're doing with the data is (and the more expensive fetching the data is if you're including that time), the smaller the fraction of time spent on context shifts must be. If you're excluding the time required to fetch the data and the guts of your loop is a trivial operation, the for loop solution may well be spending more than 90% of its time doing context shifts so the forall solution may be 10 times faster. If you're including the time required to fetch the data, that time is a large fraction of the overall runtime, and the guts of your loop is an expensive operation, the for loop solution is probably spending a small fraction of its time doing context shifts so going to a forall won't get you nearly a 10x improvement.

Things get even more complicated if you are including the time required to populate the collection since there are many ways to potentially write that that will have different performance characteristics depending upon whether you're using implicit or explicit cursors and the Oracle version. That's something that is discussed in more detail in this question on the performance of a bulk collect.

like image 60
Justin Cave Avatar answered Dec 19 '22 04:12

Justin Cave