 

CQRS/ES: Bulk operations/imports

I'm trying to wrap my head around the whole CQRS/ES idea, and contemplating writing a proof of concept and technical specification of how to implement it in our current application.

The problematic operations (in terms of how to map them to CQRS/ES) are:

  • bulk-updating of complex article data through a file import, where single rows in the data files expand into article groups, articles, headers, units and properties;
  • bulk-loading of files linking buyer assortments to supplier assortments;
  • exporting parts of, or entire, assortments.

I've read somewhere (may have been the DDDCQRS Google Group) that the best way to model the article import BC (which reads Excel files or other grid files) would be to have a single line of imported data be an aggregate, and an entire import to be the aggregate root. That way, after parsing the file, all I would have to do is create an import aggregate, and for each line, add that line to the import. That would store events in the BC's event store, and publish events that the article management BC would subscribe to. Does this make sense?
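
To make that concrete for myself, this is roughly what I picture (a minimal TypeScript sketch; the names Import, ImportStarted and LineImported are just placeholders I made up, not anything prescribed by that advice):

```typescript
// Hypothetical names and shapes -- a sketch of "import as aggregate root,
// one event recorded per imported line", not a reference implementation.

interface DomainEvent {
  readonly type: string;
}

class ImportStarted implements DomainEvent {
  readonly type = 'ImportStarted';
  constructor(readonly importId: string, readonly fileName: string) {}
}

class LineImported implements DomainEvent {
  readonly type = 'LineImported';
  constructor(
    readonly importId: string,
    readonly lineNumber: number,
    readonly articleGroup: string,
    readonly articleCode: string,
    readonly properties: Record<string, string>
  ) {}
}

// The import is the aggregate root; each parsed line is recorded by
// appending an event. The uncommitted events would later be persisted to
// the import BC's event store and published for other BCs to consume.
class Import {
  private uncommittedEvents: DomainEvent[] = [];

  constructor(readonly importId: string, fileName: string) {
    this.record(new ImportStarted(importId, fileName));
  }

  addLine(lineNumber: number, articleGroup: string, articleCode: string,
          properties: Record<string, string>): void {
    this.record(new LineImported(this.importId, lineNumber, articleGroup,
                                 articleCode, properties));
  }

  private record(event: DomainEvent): void {
    this.uncommittedEvents.push(event);
  }

  // Handed to the event store / publisher after the file has been parsed.
  pullUncommittedEvents(): DomainEvent[] {
    const events = this.uncommittedEvents;
    this.uncommittedEvents = [];
    return events;
  }
}
```

The article management BC would then subscribe to the published LineImported events and update its own model from them.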

In the current system, an import runs in a single, long-running transaction. "Long-running" should be read as between 5 and 40 minutes, depending on the amount of data imported and on the amount of data already present for a given user (because data is compared with previously imported files and with current data). Currently, when the operation fails halfway through, the whole operation is rolled back. How does that work in CQRS/ES?

asked Jan 31 '13 by ErikHeemskerk


1 Answer

This has little to do with CQRS/ES. A very naive approach follows (a rough code sketch is at the end of this answer):

  • Find your units of work,
  • Devise an ascending identification scheme for these units,
  • Transform the original input into these units of work (this step is fast and less likely to fail) and assign identities along the way,
  • Now process each unit of work as a transaction, updating a "last processed unit of work" identity as part of each transaction (or multiple such identities, if you intend to process in parallel),
  • Upon failure, resume from the last processed unit of work onwards, either automatically or after ops has given the green light.

Whether there is an event-sourced or state-based model behind all of that is a secondary concern, IMO.
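
To illustrate, here is a rough TypeScript sketch of the checkpointing idea; UnitOfWork, CheckpointStore, processUnit and runImport are hypothetical names, and the persistence and transaction plumbing are deliberately left abstract:

```typescript
// Hypothetical types -- a naive sketch of "process each unit of work as a
// transaction and record the last processed identity", not production code.

interface UnitOfWork {
  id: number;        // ascending identity assigned during transformation
  payload: unknown;  // e.g. one parsed line/group from the import file
}

interface CheckpointStore {
  lastProcessedId(importId: string): Promise<number>;   // 0 if none yet
  // Should be updated in the same transaction as the unit's own changes.
  markProcessed(importId: string, id: number): Promise<void>;
}

async function runImport(
  importId: string,
  units: UnitOfWork[],                            // sorted by ascending id
  checkpoints: CheckpointStore,
  processUnit: (u: UnitOfWork) => Promise<void>   // one transaction per unit
): Promise<void> {
  const resumeAfter = await checkpoints.lastProcessedId(importId);

  for (const unit of units) {
    if (unit.id <= resumeAfter) continue;  // already done in a previous run

    // In a real implementation processUnit and markProcessed would share a
    // single transaction, so the checkpoint never runs ahead of the work.
    await processUnit(unit);
    await checkpoints.markProcessed(importId, unit.id);
  }
}
```

The essential point is that the checkpoint advances together with each unit's own changes, so a failure can never leave the checkpoint ahead of the actual work, and resuming is just a matter of skipping everything up to and including the last recorded identity.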

answered Oct 29 '22 by Yves Reynhout