MongoDB: Schema design for stock tick database

I need to store daily stock closing prices as well as tick data in MongoDB. How would you design such a schema? For daily prices I would be tempted to have one document for each stock symbol, e.g.

{
    symbol: "AAPL",
    quotes: [
        {
           date: '2014-01-01',
           values: { open: 1, high: 1, low: 1, close: 1, volume: 100 }
        },
        {
           date: '2014-01-02',
           values: { open: 1, high: 1, low: 1, close: 1, volume: 100 }
        }, ...
    ]
}

For tick data I could do something like the above with one subdocument per hour with an array of ticks.
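
A minimal sketch of that hourly grouping, written in Python with plain dicts so it runs without a database. The field names (`hours`, `hour`, `ticks`, `time`, `price`, `volume`) and the helper `bucket_ticks_by_hour` are hypothetical, chosen only for illustration:

```python
from collections import defaultdict
from datetime import datetime


def bucket_ticks_by_hour(symbol, ticks):
    """Group (timestamp, price, volume) tuples into one subdocument per hour."""
    buckets = defaultdict(list)
    for ts, price, volume in ticks:
        # Truncate the timestamp to the start of its hour to get the bucket key.
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append({"time": ts, "price": price, "volume": volume})
    return {
        "symbol": symbol,
        # One subdocument per hour, each holding an array of ticks.
        "hours": [{"hour": h, "ticks": buckets[h]} for h in sorted(buckets)],
    }


# Made-up sample ticks, not real market data.
sample_ticks = [
    (datetime(2014, 1, 2, 9, 30, 1), 1.00, 100),
    (datetime(2014, 1, 2, 9, 45, 7), 1.01, 200),
    (datetime(2014, 1, 2, 10, 0, 3), 1.02, 150),
]
doc = bucket_ticks_by_hour("AAPL", sample_ticks)
```

The resulting `doc` has the same shape as the daily-price document above, with the per-day entries swapped for per-hour entries that each carry their own tick array.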

However, since the maximum document size is only 16MB, I believe the limit would be reached very quickly, especially for tick data.
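
To get a feel for how quickly that could happen, here is a back-of-the-envelope sketch in Python. The tick layout (`t`, `p`, `v`) and the figure of 50,000 ticks per day are assumptions made up for illustration, not numbers from a real feed. The per-element byte counts follow the BSON encoding rules; the small overhead of the enclosing document and array is ignored.

```python
MAX_BSON_BYTES = 16 * 1024 * 1024  # MongoDB's 16MB document size limit


def tick_bson_bytes(index_digits=6):
    """Approximate BSON size of one tick {t: <date>, p: <double>, v: <int32>} inside an array."""
    # Each BSON element costs: 1 type byte + field name + 1 NUL terminator + value bytes.
    t = 1 + len("t") + 1 + 8  # UTC datetime is stored as an int64
    p = 1 + len("p") + 1 + 8  # double
    v = 1 + len("v") + 1 + 4  # int32
    subdoc = 4 + t + p + v + 1  # int32 length prefix + elements + trailing NUL
    # Array members are keyed by their index written as a string ("0", "1", ...).
    return 1 + index_digits + 1 + subdoc


def ticks_until_limit(bytes_per_tick):
    """How many ticks of the given size fit under the document size limit."""
    return MAX_BSON_BYTES // bytes_per_tick


ASSUMED_TICKS_PER_DAY = 50_000  # hypothetical volume for a liquid symbol

per_tick = tick_bson_bytes()
max_ticks = ticks_until_limit(per_tick)
days_until_full = max_ticks // ASSUMED_TICKS_PER_DAY
print(per_tick, max_ticks, days_until_full)
```

Under these assumptions a tick costs about 42 bytes, so a single document holds roughly 400,000 ticks, which is about a week of data for one symbol. Longer field names or more fields per tick shrink that further.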

I am aware of this approach: http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb. Would that be a good approach, i.e. one document per symbol per day?

So, how would you design the schema for daily prices and tick data, respectively?

asked Apr 21 '14 13:04 by Morten


1 Answer

I think you are on the right track.

  • Having one document for each stock symbol will give you a good overview of all the symbols in the collection, and each document will stay at a fairly manageable size.
  • In my opinion, if you are even close to 16MB on a single document, the schema design is far from good enough. Such a document is not easily readable or maintainable, and you have to fetch a whole lot of data each time you want anything from it.
  • You mention "one document per symbol per day". To me that sounds like a sensible way to structure the data. Although I'm not familiar with the details of stock tick data, I suppose this will give you a good foundation for the schema design. You split the data by day, and can easily get all ticks for a given day or hour.
  • Remember, there is no single correct schema design, as long as you think it through thoroughly (though there are definitely right and wrong ways to do it) ;)
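
As a rough sketch of the "one document per symbol per day" idea, the Python helper below builds the filter and update documents for appending a single tick to its day bucket. The `_id` format, the field names and the helper `day_bucket_update` are assumptions made for illustration; the code only builds plain dicts and does not talk to a database.

```python
from datetime import datetime


def day_bucket_update(symbol, ts, price, volume):
    """Return (filter, update) for appending one tick to its symbol/day document."""
    day = ts.strftime("%Y-%m-%d")
    # Hypothetical deterministic _id: one document per symbol per day.
    filter_ = {"_id": f"{symbol}:{day}"}
    update = {
        # Set only when the day document is first created by an upsert.
        "$setOnInsert": {"symbol": symbol, "date": day},
        # Append the tick to the array for its hour, e.g. "hours.9".
        "$push": {f"hours.{ts.hour}": {"time": ts, "price": price, "volume": volume}},
    }
    return filter_, update


# Made-up sample tick, not real market data.
filter_, update = day_bucket_update("AAPL", datetime(2014, 1, 2, 9, 30, 5), 1.0, 100)

# With PyMongo, these would typically be passed to something like:
#   collection.update_one(filter_, update, upsert=True)
```

Fetching all ticks for a day is then a single lookup by `_id`, and the ticks for one hour sit under one key of `hours` in that document. Whether hourly sub-buckets are fine-grained enough depends on how many ticks a symbol produces.
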
answered Dec 31 '22 18:12 by aludvigsen