Jumping from article to article, I keep seeing the expression "bulk loading".
What does it really (technically) mean?
What does it imply?
Explanations based on use cases are welcome.
Bulk loading rows instead of inserting them one at a time can dramatically increase the throughput of an ETL program. Bulk loading works by loading data from a temporary file into the database. The actual process of bulk loading is unfortunately different for each RDBMS.
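As a rough illustration of the file-based approach described above, here is a minimal Python sketch. It is only a stand-in: Python's sqlite3 module has no dedicated bulk-load statement, so the sketch writes the rows to a temporary staging file and then loads the whole file with one batched call inside a single transaction. Real systems have their own commands for this step (for example COPY in PostgreSQL, LOAD DATA INFILE in MySQL, BULK INSERT in SQL Server). The table name `items` and the helper names are made up for the example.

```python
import csv
import os
import sqlite3
import tempfile


def write_staging_file(rows):
    """Write rows to a temporary CSV file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


def load_from_file(conn, path):
    """Load the whole staging file with one batched call in one transaction.

    Assumption: executemany stands in for a real bulk-load command,
    which SQLite does not offer through this module.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        with conn:  # commits once at the end, rolls back on error
            conn.executemany(
                "INSERT INTO items (id, name) VALUES (?, ?)",
                ((int(row_id), name) for row_id, name in reader),
            )


# Hypothetical data and table, for illustration only.
rows = [(i, f"item-{i}") for i in range(10_000)]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

path = write_staging_file(rows)
try:
    load_from_file(conn, path)
finally:
    os.remove(path)

count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(count)
```

The point of the sketch is the shape of the work: the data is staged once and handed to the database in one operation, instead of paying per-statement and per-commit overhead for every row.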
Bulk load bypasses the data parsing that the database usually performs, providing an additional performance gain over batch operations. Where a driver exposes bulk load as a connection property, existing applications that use batch inserts can take advantage of it without changes to the application code.
The bulk loader, bulkload, is a bulk management tool. It takes input data in LDIF or SQL*Loader format and loads the data directly into Oracle Internet Directory's schema in the metadata repository. It has three main phases: check, generate, and load.
Bulk data transfer is a software application feature that uses data compression, data blocking and buffering to optimize transfer rates when moving large data files.
Indexes are usually optimized for inserting rows one at a time. When you are adding a great deal of data at once, inserting rows one at a time may be inefficient. For instance, with a B-Tree, the optimal way to insert a single key is a very poor way of adding a large batch of data to an empty index.
Instead you pursue a different strategy with B-Trees. You presort all of the data and group it into blocks. You can then build a new B-Tree by transforming the blocks into tree nodes. Although both techniques have the same asymptotic performance, O(n log(n)), the bulk-load operation has a much smaller constant factor.
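The presort-and-group strategy can be sketched in Python as follows. This is a simplified, B+-tree-like illustration rather than a production B-Tree: the names `Node`, `bulk_load`, `contains` and `height` are made up for the example, every key lives in a leaf, each internal node stores the smallest key of each child, and the last node on a level may be underfull (a real implementation would rebalance it).

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    # Leaf: the stored keys. Internal node: smallest key of each child subtree.
    keys: list
    children: list = field(default_factory=list)  # empty for leaves


def bulk_load(keys, block_size=4):
    """Build the tree bottom-up: sort once, cut into blocks, stack levels."""
    ordered = sorted(keys)
    if not ordered:
        return Node(keys=[])
    # Each block of sorted keys becomes one leaf.
    level = [Node(keys=ordered[i:i + block_size])
             for i in range(0, len(ordered), block_size)]
    # Group nodes into parents until a single root remains.
    while len(level) > 1:
        groups = [level[i:i + block_size]
                  for i in range(0, len(level), block_size)]
        level = [Node(keys=[child.keys[0] for child in group], children=group)
                 for group in groups]
    return level[0]


def contains(node, key):
    """Descend to the leaf that could hold key, then check it."""
    while node.children:
        # Pick the last child whose smallest key is <= key (or the first child).
        idx = max(bisect_right(node.keys, key) - 1, 0)
        node = node.children[idx]
    return key in node.keys


def height(node):
    """Number of levels from the root down to the leaves."""
    levels = 1
    while node.children:
        node = node.children[0]
        levels += 1
    return levels


tree = bulk_load(range(0, 200, 2))  # 100 even keys
print(contains(tree, 42), contains(tree, 43), height(tree))
```

After the single sort, every node is written exactly once and no node is ever split, which is where the smaller constant factor comes from compared with inserting the keys one by one.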