Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ABAP select performance hints?

Are there general ABAP-specific tips related to performance of big SELECT queries?

In particular, is it possible to close once and for all the question of FOR ALL ENTRIES IN vs JOIN?

like image 400
ilya n. Avatar asked Dec 28 '09 00:12

ilya n.


4 Answers

A few (more or less) ABAP-specific hints:

Avoid SELECT * where it's not needed, try to select only the fields that are required. Reason: Every value might be mapped several times during the process (DB Disk --> DB Memory --> Network --> DB Driver --> ABAP internal). It's easy to save the CPU cycles if you don't need the fields anyway. Be very careful if you SELECT * a table that contains BLOB fields like STRING, this can totally kill your DB performance because the blob contents are usually stored on different pages.

Don't SELECT ... ENDSELECT for small to medium result sets, use SELECT ... INTO TABLE instead. Reason: SELECT ... INTO TABLE performs a single fetch and doesn't keep the cursor open while SELECT ... ENDSELECT will typically fetch a single row for every loop iteration.

This was a kind of urban myth - there is no performance degradation for using SELECT as a loop statement. However, this will keep an open cursor during the loop which can lead to unwanted (but not strictly performance-related) effects.

For large result sets, use a cursor and an internal table. Reason: Same as above, and you'll avoid eating up too much heap space.

Don't ORDER BY, use SORT instead. Reason: Better scalability of the application server.

Be careful with nested SELECT statements. While they can be very handy for small 'inner result sets', they are a huge performance hog if the nested query returns a large result set.

Measure, Measure, Measure Never assume anything if you're worried about performance. Create a representative set of test data and run tests for different implementations. Learn how to use ST05 and SAT.

There won't be a way to close your second question "once and for all". First of all, FOR ALL ENTRIES IN 'joins' a database table and an internal (memory) table while JOIN only operates on database tables. Since the database knows nothing about the internal ABAP memory, the FOR ALL ENTRIES IN statement will be transformed to a set of WHERE statements - just try and use the ST05 to trace this. Second, you can't add values from the second table when using FOR ALL ENTRIES IN. Third, be aware that FOR ALL ENTRIES IN always implies DISTINCT. There are a few other pitfalls - be sure to consult the on-line ABAP reference, they are all listed there.

If the number of records in the second table is small, both statements should be more or less equal in performance - the database optimizer should just preselect all values from the second table and use a smart joining algorithm to filter through the first table. My recommendation: Use whatever feels good, don't try to tweak your code to illegibility.

If the number of records in the second table exceeds a certain value, Bad Things [TM] happen with FOR ALL ENTRIES IN - the contents of the table are split into multiple sets, then the query is transformed (see above) and re-run for each set.

like image 131
vwegert Avatar answered Oct 25 '22 20:10

vwegert


Another note: The "Avoid SELECT *" statement is true in general, but I can tell you where it is false.
When you are going to take most of the fields anyway, and where you have several queries (in the same program, or different programs that are likely to be run around the same time) which take most of the fields, especially if they are different fields that are missing.

This is because the App Server Data buffers are based on the select query signature. If you make sure to use the same query, then you can ensure that the buffer can be used instead of hitting the database again. In this case, SELECT * is better than selecting 90% of the fields, because you make it much more likely that the buffer will be used.

Also note that as of the last version I tested, the ABAP DB layer wasn't smart enough to recognize SELECT A, B as being the same as SELECT B, A, which means you should always put the fields you take in the same order (preferable the table order) in order to make sure again that the data buffer on the application is being well used.

like image 36
Noah Avatar answered Oct 25 '22 20:10

Noah


I usually follow the rules stated in this pdf from SAP: "Efficient Database Programming with ABAP" It shows a lot of tips in optimizing queries.

like image 3
franblay Avatar answered Oct 25 '22 19:10

franblay


This question will never be completely answered.

ABAP statement for accessing database is interpreted several times by different components of whole system (SAP and DB). Behavior of each component depends from component itself, its version and settings. Main part of interpretation is done in DB adapter on SAP side.

The only viable approach for reaching maximum performance is measurement on particular system (SAP version and DB vendor and version).

like image 2
user109758 Avatar answered Oct 25 '22 20:10

user109758