A stream of stock updates are coming in the form of ticker, buy/sell pair (example YHOO,300) from the securities exchange(at a very high rate). We have to always display the top 5 most traded stock by volume at any given point of time. How would you maintain these values in memory?
My idea is to use a hashmap(Ticker->Volume) to update our datastore from the feed O(1).And then create a Treemap based on the volume to display the top5 stocks O(1). But what I'm not able to come up is an efficient way to sync the Treemap data as the values in the hashmap change (very frequently) . Any suggestions?
If the number of trades is an always increasing cumulative value (which it seems to be), you can just maintain:
When a new update arrives, you update your hashmap in O(1) and, if the new number of trades is big enough to enter the top 5, you update also your array. Checking the array and updating it can be done in O(1). You can also use binary search if you don't want to compare the new volume with all 5 values.
You can't do much better than checking an average of log(5) values.
If the number of trades in not increasingly monotone (which seems unlikely, but can happen if you want to consider the number of trades done in a bounded interval of time), you can also maintain:
When an update arrives you will:
The total complexity is O(log n) where "n" is the number of tickers.
Even if you need the top 5 tickers, when the volume of one ticker goes down, you need to find the 6th element that will become 5th, and a 7th element .. and so on. I don't think you can go much lower than O(log n) in the complexity if the number of trades is not monotone.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With