I regularly see the expression 'incremental loading' when reading articles.
What does it really (technically) mean? What does it imply?
Explanations using use-cases are welcome.
Incremental load is the process of loading only new and changed data to the destination. Data that didn't change is left alone. Data integrity can still be ensured, but the ETL logic gets more complicated, because the process has to detect which records changed.
Incremental load is fast and handles large datasets well, because each run moves only the changes. A full load, on the other hand, is easy to set up, suits relatively small datasets, and guarantees a complete sync with fairly simple logic.
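To make the difference concrete, here is a minimal sketch in Python. It assumes each source row carries an `updated_at` timestamp and that the loader remembers a "watermark" (the latest timestamp it has already copied). The row layout, the function names and the watermark approach are illustrative assumptions, not the only way to detect changes.

```python
from datetime import datetime

def full_load(source_rows, destination):
    """Wipe the destination and copy every source row."""
    destination.clear()
    for row in source_rows:
        destination[row["id"]] = dict(row)
    return len(source_rows)

def incremental_load(source_rows, destination, last_watermark):
    """Copy only rows modified after last_watermark.

    Returns (rows_copied, new_watermark).
    """
    changed = [r for r in source_rows if r["updated_at"] > last_watermark]
    for row in changed:
        destination[row["id"]] = dict(row)  # insert new or overwrite existing
    new_watermark = max((r["updated_at"] for r in changed), default=last_watermark)
    return len(changed), new_watermark

# Hypothetical sample data
source = [
    {"id": 1, "name": "Alice", "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "name": "Bob", "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "name": "Carol", "updated_at": datetime(2024, 1, 9)},
]
destination = {}

# First run: nothing has been copied yet, so load everything.
full_count = full_load(source, destination)
watermark = max(r["updated_at"] for r in source)

# Later, one row is amended and one is added in the source.
source[1] = {"id": 2, "name": "Robert", "updated_at": datetime(2024, 1, 12)}
source.append({"id": 4, "name": "Dave", "updated_at": datetime(2024, 1, 13)})

# Second run: only the two changed rows are moved.
inc_count, watermark = incremental_load(source, destination, watermark)
```

Note that a timestamp watermark like this one does not notice rows deleted from the source. That is one of the reasons the incremental logic tends to be more complicated than a full load.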
Incremental loading is used when moving data from one repository (Database) to another.
Non-incremental loading would be when the entire data set from the source is pushed to the destination.
Incremental would be only passing across the new and amended data.
A concrete example:
A company may have two platforms, one that processes orders, and a separate accounting system. The accounts department enters new customer details into the accounting system but has to ensure these customers appear in the order processing system.
To do this it runs a nightly batch job that sends data from the accounting system to the order system.
If they were deleting all customer details in the order system and refilling with all the customers in the accounting system then they would be performing a non-incremental load.
If they only sent across the new customers and the customers that had been changed they would be performing an incremental load.
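The two variants of that nightly job could be sketched as follows, using Python's built-in `sqlite3` module with two in-memory databases standing in for the accounting and order systems. The `customers` table, its `modified_at` column and the function names are hypothetical. A real job would depend on how the accounting system actually records changes.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT)"
    )
    return conn

def non_incremental_sync(accounting, orders):
    """Delete every customer in the order system and refill from accounting."""
    rows = accounting.execute(
        "SELECT id, name, modified_at FROM customers"
    ).fetchall()
    orders.execute("DELETE FROM customers")
    orders.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    orders.commit()
    return len(rows)

def incremental_sync(accounting, orders, last_run):
    """Send across only customers created or changed since last_run."""
    rows = accounting.execute(
        "SELECT id, name, modified_at FROM customers WHERE modified_at > ?",
        (last_run,),
    ).fetchall()
    # INSERT OR REPLACE adds new customers and overwrites changed ones.
    orders.executemany("INSERT OR REPLACE INTO customers VALUES (?, ?, ?)", rows)
    orders.commit()
    return len(rows)

accounting, orders = make_db(), make_db()
accounting.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme Ltd", "2024-03-01T09:00:00"), (2, "Globex", "2024-03-01T10:00:00")],
)
accounting.commit()

# Night 1: the order system is empty, so everything is pushed.
first_night = non_incremental_sync(accounting, orders)

# Day 2: one customer is amended and one is added in accounting.
accounting.execute(
    "UPDATE customers SET name = ?, modified_at = ? WHERE id = ?",
    ("Globex Corp", "2024-03-02T11:00:00", 2),
)
accounting.execute(
    "INSERT INTO customers VALUES (?, ?, ?)", (3, "Initech", "2024-03-02T12:00:00")
)
accounting.commit()

# Night 2: only the two new or changed customers are sent.
second_night = incremental_sync(accounting, orders, "2024-03-01T23:59:59")
synced = orders.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
```

The timestamps are stored as ISO 8601 strings so that plain string comparison in the `WHERE` clause orders them correctly.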