I have an application that loads time series data from various files. The application opens one thread per file to load the data in parallel. The records in the files are ordered but I need to deliver one feed to the rest of the application maintaining the order of events overall.

Can this be implemented with the Disruptor, using a multiple-producers, single-consumer design that maintains the overall order of events?

I am currently using blocking collections plus a sorted list over the head element of each blocking collection, but this consumes a lot of memory. I'd be interested to hear from anyone who has implemented a similar design with a different architecture.

Thanks

If you redesign around something like object streams (focus on the stream), then loading from a file keeps only a minimum in memory (whatever buffer size you need). Each stream prefetches one head item.
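The "stream with one prefetched head" idea could be sketched like this (the answer names no language; Python is assumed here, and the `PrefetchStream` class name is my own invention, not from the original post):

```python
from typing import Iterator, Optional


class PrefetchStream:
    """Wraps an iterator of ordered records, holding only one head item
    in memory at a time (the underlying iterator supplies the rest lazily)."""

    def __init__(self, source: Iterator):
        self._source = source
        self._head = next(source, None)  # prefetch exactly one head item

    @property
    def head(self) -> Optional[object]:
        """Current head record, or None once the stream is exhausted."""
        return self._head

    def pop(self):
        """Return the current head and prefetch the next record."""
        item = self._head
        self._head = next(self._source, None)
        return item

    @property
    def exhausted(self) -> bool:
        return self._head is None
```

In a real application the `source` iterator would be a lazy file reader (e.g. a generator over lines), so only the file's read buffer plus one record sits in memory per stream.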

Then you implement a k-way merge to pick the lowest of the N heads. Place the streams in a binary tree; when you pop the lowest value, its stream is relocated within the tree (swaps and rotations), so a pop costs O(log n). When a stream runs dry, remove it from the tree.
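A common way to get the same O(log n) pop without hand-rolling tree rotations is a min-heap keyed on each stream's head; this is a sketch of the k-way merge under that substitution, in Python (my assumption, since the answer specifies neither language nor data type):

```python
import heapq
from typing import Iterator, List


def k_way_merge(streams: List[Iterator]) -> Iterator:
    """Merge N individually sorted streams into one ordered feed.

    The heap holds one (head, index, stream) entry per live stream;
    popping and re-inserting a head is O(log n), and exhausted
    streams simply drop out of the heap.
    """
    heap = []
    for i, s in enumerate(streams):
        head = next(s, None)
        if head is not None:
            # The index i breaks ties so equal heads never compare streams.
            heapq.heappush(heap, (head, i, s))
    while heap:
        head, i, s = heapq.heappop(heap)
        yield head
        nxt = next(s, None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, i, s))  # re-insert stream by its new head
```

(The standard library's `heapq.merge` does essentially this already; the explicit version above just makes the mechanism visible.)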

It's a generalization of merging two sorted arrays. You have to re-sort the streams by their heads, which is quite different from sorting a random set: the set is nearly ordered, with only one stream out of place after each pop. You could find the new position with a binary search, but re-inserting into an array would be expensive in memory copies; tree rotations are simpler.

(And the Disruptor has nothing to do with this... lol)
