Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What data structure will optimzied to represent stock market?

Data for various stocks is coming from various stock exchange continuously. Which data structure is suitable to store these data?

things to consider are :

a) effective retrieval and update of data is required as stock data changes per second or microsecond during trading time.

I thought of using Heap as the number of stocks would be more or less constant and the most frequent used operations are retrieval and update so heap should perform well for this scenario.

b) need to show stocks which are currently trending (as in volume of shares being sold most active and least active, high profit and loss on a particular day)

I am nt sure about how to got about this.

c) as storing to database using any programming language has some latency considering the amount of stocks that will be traded during a particular time, how can u store all the transactional data persistently??

Ps: This is a interview question from Morgan Stanley.

like image 677
user1631651 Avatar asked Apr 18 '13 09:04

user1631651


2 Answers

A heap doesn't support efficient random access (i.e. look-up by index) nor getting the top k elements without removing elements (which is not desired).

My answer would be something like:

A database would be the preferred choice for this, as, with a proper table structure and indexing, all of the required operations can be done efficiently.

So I suppose this is more a theoretical question about understanding of data structures (related to in-memory storage, rather than persistent).

It seems multiple data structures is the way to go:

a) Effective retrieval and update of data is required as stock data changes per second or microsecond during trading time.

A map would make sense for this one. Hash-map or tree-map allows for fast look-up.

b) How to show stocks which are currently trending (as in volume of shares being sold most active and least active, high profit and loss on a particular day)?

Just about any sorted data structure seems to make sense here (with the above map having pointers to the correct node, or pointing to the same node). One for activity and one for profit.

I'd probably go with a sorted (double) linked-list. It takes minimal time to get the first or last n items. Since you have a pointer to the element through the map, updating takes as long as the map lookup plus the number of moves of that item required to get it sorted again (if any). If an item often moves many indices at once, a linked-list would not be a good option (in which case I'd probably go for a Binary Search Tree).

c) How can you store all the transactional data persistently?

I understand this question as - if the connection to the database is lost or the database goes down at any point, how do you ensure there is no data corruption? If this is not it, I would've asked for a rephrase.

Just about any database course should cover this.

As far as I remember - it has to do with creating another record, updating this record, and only setting the real pointer to this record once it has been fully updated. Before this you might also have to set a pointer to the old record so you can check if it's been deleted if something happens after setting the pointer away, but before deletion.

Another option is having a active transaction table which you add to when starting a transaction and remove from when a transaction completes (which also stores all required details to roll back or resume the transaction). Thus, whenever everything is okay again, you check this table and roll back or resume any transactions that have not yet completed.

like image 93
Bernhard Barker Avatar answered Nov 03 '22 01:11

Bernhard Barker


If I have to choose , I would go for Hash Table:

Reason : It is synchronized and thread safe , BigO(1) as average case complexity.

Provided : 1.Good hash function to avoid the collision. 2. High performance cache.

like image 34
Rahul Srivastava Avatar answered Nov 03 '22 01:11

Rahul Srivastava