I have an external system that publishes real-time financial data (e.g. stock quotes and prices from exchanges all over the world).
This external system limits the number of stocks per account connection. Many of our applications need to consume this real-time streaming data, and we don't want each application to connect to the external system and manage capacity by itself. Hence we want to design a single system that consumes all stocks and publishes them to a message queue (e.g. Kafka or Pulsar), so that the downstream applications can consume from the Kafka topics.
The problem is how to design the topics. There are around 10 million stocks, but each application is only interested in a subset of them. A subset can be small or large, and different subsets can share the same stocks.
What I can think of is to dynamically create a streaming job for each consumer (e.g. Kafka Streams or a separate Flink job) that does a pre-aggregation: it collects the stocks that consumer is interested in from all topics and publishes them to another topic dedicated to that consumer. This way each consumer has its own topic containing only the stocks it is interested in. However, this will definitely add overhead: extra message transport time, duplicated messages, and higher latency. Capacity might also become a problem as more and more consumers are added.
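To make the duplication cost of that approach concrete, here is a minimal in-memory sketch. It is not Kafka Streams or Flink code; the names `fan_out`, `interests`, and the `quotes-<consumer>` topic naming are hypothetical and only illustrate the per-consumer copy step.

```python
from collections import defaultdict


def fan_out(quotes, interests):
    """Copy each quote into the topic of every consumer interested in its symbol.

    quotes:    iterable of (symbol, price) tuples from the combined feed
    interests: dict mapping consumer name -> set of symbols it wants
    Returns a dict mapping per-consumer topic name -> list of (symbol, price).
    """
    topics = defaultdict(list)
    for symbol, price in quotes:
        for consumer, symbols in interests.items():
            if symbol in symbols:
                # Hypothetical naming scheme: one output topic per consumer.
                topics[f"quotes-{consumer}"].append((symbol, price))
    return topics


# Two consumers whose subsets overlap on MSFT.
interests = {"app-a": {"IBM", "MSFT"}, "app-b": {"MSFT"}}
quotes = [("IBM", 1.0), ("MSFT", 2.0), ("AAPL", 3.0)]

topics = fan_out(quotes, interests)
# Total messages written across all per-consumer topics.
published = sum(len(msgs) for msgs in topics.values())
```

Every quote for a shared symbol is written once per interested consumer, so the write volume grows with the number of consumers and the overlap between their subsets, which is the capacity concern raised above.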
I don't know if there is a better way to meet my requirements. Please advise, thank you.
If I understand your requirements correctly, you have a real-time feed of stock prices that includes quotes for ALL securities on that exchange, i.e. AAPL, IBM, and MSFT quotes are all included in that single feed. Also, you don't want consumers attaching directly to this feed, so you need to store that information in an intermediate messaging system.
In that case you may want to consider using Pulsar's Key_Shared subscription to pre-sort the data by ticker symbol. Each consumer in that subscription could then publish the quotes it receives to a ticker-symbol-specific topic. Clients would then only need to subscribe to the topics they are interested in, and so consume only a subset of the data.
All ticker symbols ----> (500 symbol-specific topics) <---- Client subscribes to a subset of these.
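The flow in the diagram can be sketched roughly as follows. This is a plain in-memory illustration, not Pulsar client code: the function names and the `quotes.<SYMBOL>` topic naming are assumptions made for the example, and in a real deployment the routing step would be done by the Key_Shared consumers publishing to actual topics.

```python
from collections import defaultdict


def topic_for(symbol):
    """Hypothetical naming convention: one topic per ticker symbol."""
    return f"quotes.{symbol.upper()}"


def route(feed):
    """Split the combined feed into symbol-specific topics.

    feed: iterable of (symbol, price) tuples.
    Each quote is stored exactly once, in the topic for its symbol.
    """
    topics = defaultdict(list)
    for symbol, price in feed:
        topics[topic_for(symbol)].append(price)
    return topics


def subscribe(topics, symbols):
    """Return only the quotes for the symbols a client cares about."""
    return {s: list(topics.get(topic_for(s), [])) for s in symbols}


feed = [("IBM", 1.0), ("MSFT", 2.0), ("IBM", 1.5)]
topics = route(feed)

# Two clients with overlapping subsets read from the same topics.
client_a = subscribe(topics, ["IBM", "MSFT"])
client_b = subscribe(topics, ["IBM"])
```

Unlike a per-consumer topic design, overlapping clients here read the same symbol topic, so a quote is written once no matter how many clients want it.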