Database schema for organizing historical stock data

I'm creating a database schema for storing historical stock data. I currently have the schema shown below.

My requirements are to store "bar data" (date, open, high, low, close, volume) for multiple stock symbols. Each symbol might also have multiple timeframes (e.g. Google Weekly bars and Google Daily bars).

My current schema puts the bulk of the data in the OHLCV table. I'm far from a database expert and am curious whether this is too naive. Constructive input is very welcome.

CREATE TABLE Exchange (exchange TEXT UNIQUE NOT NULL);
CREATE TABLE Symbol (symbol TEXT UNIQUE NOT NULL, exchangeID INTEGER NOT NULL);
CREATE TABLE Timeframe (timeframe TEXT NOT NULL, symbolID INTEGER NOT NULL);
CREATE TABLE OHLCV (date TEXT NOT NULL CHECK (date LIKE '____-__-__ __:__:__'),
    open REAL NOT NULL,
    high REAL NOT NULL,
    low REAL NOT NULL,
    close REAL NOT NULL,
    volume INTEGER NOT NULL,
    timeframeID INTEGER NOT NULL);

This means my queries currently go something like: Find the timeframeID for a given symbol/timeframe, then do a select on the OHLCV table where the timeframeID matches.
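For readers who want to see that two-step lookup concretely, here is a minimal sketch using Python's built-in sqlite3 module against the schema above. It assumes the IDs are SQLite's implicit rowid values (the schema declares no explicit ID columns); the load_bars helper and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Exchange (exchange TEXT UNIQUE NOT NULL);
CREATE TABLE Symbol (symbol TEXT UNIQUE NOT NULL, exchangeID INTEGER NOT NULL);
CREATE TABLE Timeframe (timeframe TEXT NOT NULL, symbolID INTEGER NOT NULL);
CREATE TABLE OHLCV (date TEXT NOT NULL CHECK (date LIKE '____-__-__ __:__:__'),
    open REAL NOT NULL, high REAL NOT NULL, low REAL NOT NULL, close REAL NOT NULL,
    volume INTEGER NOT NULL, timeframeID INTEGER NOT NULL);
""")

# Made-up sample rows; the IDs are the implicit rowids (an assumption).
exchange_id = conn.execute("INSERT INTO Exchange VALUES ('NASDAQ')").lastrowid
symbol_id = conn.execute("INSERT INTO Symbol VALUES ('GOOG', ?)", (exchange_id,)).lastrowid
daily_id = conn.execute("INSERT INTO Timeframe VALUES ('daily', ?)", (symbol_id,)).lastrowid
weekly_id = conn.execute("INSERT INTO Timeframe VALUES ('weekly', ?)", (symbol_id,)).lastrowid
conn.executemany(
    "INSERT INTO OHLCV VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("2009-10-05 00:00:00", 100.0, 102.0, 99.0, 101.0, 1000, daily_id),
        ("2009-10-06 00:00:00", 101.0, 103.0, 100.0, 102.5, 1200, daily_id),
        ("2009-10-05 00:00:00", 100.0, 105.0, 99.0, 104.0, 6000, weekly_id),
    ],
)

def load_bars(conn, symbol, timeframe):
    """Hypothetical helper: resolve the timeframeID, then fetch its bars."""
    # Step 1: find the timeframeID for the given symbol/timeframe pair.
    row = conn.execute(
        "SELECT t.rowid FROM Timeframe t "
        "JOIN Symbol s ON s.rowid = t.symbolID "
        "WHERE s.symbol = ? AND t.timeframe = ?",
        (symbol, timeframe),
    ).fetchone()
    if row is None:
        return []
    # Step 2: select the matching rows from OHLCV, oldest first.
    return conn.execute(
        "SELECT date, open, high, low, close, volume FROM OHLCV "
        "WHERE timeframeID = ? ORDER BY date",
        (row[0],),
    ).fetchall()

daily_bars = load_bars(conn, "GOOG", "daily")
```

Without an index on OHLCV(timeframeID), step 2 has to scan the whole table, which is probably the first thing to address as the data grows.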


asked Oct 06 '09 04:10

nall

2 Answers

We spent a long time looking for a proper database structure for storing large amounts of data. The solution below is the result of more than 6 years of experience. It is now working flawlessly for our quantitative analysis.

We have been able to store hundreds of gigabytes of intraday and daily data using this scheme in SQL Server:

Symbol - char(6)
Date - date
Time - time
Open - decimal(18, 4)
High - decimal(18, 4)
Low - decimal(18, 4)
Close - decimal(18, 4)
Volume - int

All trading instruments are stored in a single table. We also have a clustered index on symbol, date and time columns.

For daily data, we have a separate table and do not use the Time column. Volume datatype is also bigint instead of int.
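Since the question is tagged sqlite, here is a rough sketch of how the intraday table might look there. This is not the SQL Server setup described above: SQLite has no clustered-index keyword, so the sketch swaps in a WITHOUT ROWID table with a composite primary key, which stores rows ordered by that key. The table name IntradayBar, the bars_between helper, and the sample rows are made up, and prices are stored as REAL because SQLite has no fixed-point decimal type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE IntradayBar (
    symbol TEXT NOT NULL,
    date   TEXT NOT NULL,      -- 'YYYY-MM-DD'
    time   TEXT NOT NULL,      -- 'HH:MM:SS'
    open   REAL NOT NULL,
    high   REAL NOT NULL,
    low    REAL NOT NULL,
    close  REAL NOT NULL,
    volume INTEGER NOT NULL,
    PRIMARY KEY (symbol, date, time)
) WITHOUT ROWID
""")

# Made-up sample rows for two symbols.
conn.executemany(
    "INSERT INTO IntradayBar VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("GOOG", "2009-10-05", "09:30:00", 100.0, 100.5, 99.8, 100.2, 500),
        ("GOOG", "2009-10-05", "09:31:00", 100.2, 100.9, 100.1, 100.7, 450),
        ("GOOG", "2009-10-06", "09:30:00", 101.0, 101.4, 100.6, 101.1, 700),
        ("MSFT", "2009-10-05", "09:30:00", 25.0, 25.1, 24.9, 25.05, 900),
    ],
)

def bars_between(conn, symbol, start_date, end_date):
    """Hypothetical helper: bars for one symbol over an inclusive date range."""
    return conn.execute(
        "SELECT date, time, open, high, low, close, volume FROM IntradayBar "
        "WHERE symbol = ? AND date BETWEEN ? AND ? "
        "ORDER BY date, time",
        (symbol, start_date, end_date),
    ).fetchall()

rows = bars_between(conn, "GOOG", "2009-10-05", "2009-10-05")
```

Because the key leads with symbol and then date, a query for one symbol over a date range reads one contiguous slice of the table, which is the same idea as the clustered index on symbol, date and time.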

The performance? We can get data out of the server in a matter of milliseconds. Remember, the database size is almost 1 terabyte.

We purchased all of our historical market data from the Kibot web site: http://www.kibot.com/

answered Oct 09 '22 01:10

boe100

Well, on the positive side, you have the good sense to ask for input first. That puts you ahead of 90% of people unfamiliar with database design.

There are no clear foreign key relationships. I take it timeframeID relates to symbolID?
It's unclear how you'd be able to find anything this way. Reading up on the above-mentioned foreign keys should improve your understanding tremendously with little effort.
You're storing timeframe data as TEXT. From a performance as well as a usability perspective, that's a no-no.
Your current scheme can't accommodate stock splits, which will happen eventually. It's better to add one further layer of indirection between the price data table and the Symbol table.
Open, high, low, and close prices are better stored as decimal or currency types or, preferably, as an INTEGER field with a separate INTEGER field storing the divisor, since the smallest allowed price fraction (cents, eighths of a dollar, etc.) varies per exchange.
Since you support multiple exchanges, you should support multiple currencies.
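To make a few of these points concrete, here is one possible sketch in SQLite via Python's sqlite3 module. It covers explicit foreign keys, a non-TEXT timeframe, integer prices with a divisor, and a currency column; it does not attempt the stock-split indirection. All table and column names are illustrative, and keeping the divisor and currency on the Exchange table and encoding the timeframe as a bar length in seconds are assumptions rather than the only reasonable layout.

```python
import sqlite3
from decimal import Decimal

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Exchange (
    exchangeID   INTEGER PRIMARY KEY,
    name         TEXT UNIQUE NOT NULL,
    currency     TEXT NOT NULL,       -- e.g. 'USD'
    priceDivisor INTEGER NOT NULL     -- 100 means prices are stored in cents
);
CREATE TABLE Symbol (
    symbolID   INTEGER PRIMARY KEY,
    symbol     TEXT NOT NULL,
    exchangeID INTEGER NOT NULL REFERENCES Exchange(exchangeID),
    UNIQUE (symbol, exchangeID)
);
CREATE TABLE Bar (
    symbolID  INTEGER NOT NULL REFERENCES Symbol(symbolID),
    timeframe INTEGER NOT NULL,       -- bar length in seconds (assumed encoding)
    date      TEXT NOT NULL,
    open      INTEGER NOT NULL,       -- price multiplied by priceDivisor
    high      INTEGER NOT NULL,
    low       INTEGER NOT NULL,
    close     INTEGER NOT NULL,
    volume    INTEGER NOT NULL,
    PRIMARY KEY (symbolID, timeframe, date)
);
""")

# Made-up sample data: one exchange, one symbol, one daily bar.
conn.execute("INSERT INTO Exchange VALUES (1, 'NASDAQ', 'USD', 100)")
conn.execute("INSERT INTO Symbol VALUES (1, 'GOOG', 1)")
conn.execute(
    "INSERT INTO Bar VALUES (1, 86400, '2009-10-05', 48450, 48900, 48310, 48812, 1000)"
)

# A bar that points at a nonexistent symbol is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Bar VALUES (999, 86400, '2009-10-05', 1, 1, 1, 1, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Convert the stored integer back to a price using the exchange's divisor.
close_int, divisor = conn.execute(
    "SELECT b.close, e.priceDivisor FROM Bar b "
    "JOIN Symbol s ON s.symbolID = b.symbolID "
    "JOIN Exchange e ON e.exchangeID = s.exchangeID "
    "WHERE s.symbol = 'GOOG' AND b.timeframe = 86400"
).fetchone()
close_price = Decimal(close_int) / Decimal(divisor)
```

Storing the integer and dividing only when the value is read avoids the rounding drift that REAL columns can introduce when prices are summed or compared.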

I apologise if all of this doesn't seem too 'constructive', especially since I'm too sleepy right now to suggest a more usable alternative. I hope the above is enough to set you on your way.

answered Oct 09 '22 00:10

Michiel Buddingh

Tags: sql, sqlite, schema, stocks
