I am building a database for microarray data. Each patient sample has over 1,000,000 features and I would like to store the patient samples as rows in an SQL table with each feature as a column.
HuEX Microarray Data
+----+----------+----------+-----+------------------+
| ID | Feature1 | Feature2 | ... | Feature1,000,000 |
+----+----------+----------+-----+------------------+
| 1  | 2.3543   | 10.5454  | ... | 5.34333          |
| 2  | 13.4312  | 1.3432   | ... | 40.23422         |
+----+----------+----------+-----+------------------+
I know most relational database systems have limits on the number of columns in a table.
+------------+--------------------------------+
| DBMS       | Max Table Col #                |
+------------+--------------------------------+
| SQL Server | 1,024 - 30,000                 |
| MySQL      | 4,096 (row size: 65,535 bytes) |
| PostgreSQL | 250 - 1,600                    |
| Oracle     | 1,000                          |
+------------+--------------------------------+
Obviously these limits are too low for my task. Is there any way to increase the number of columns an SQL database table can have, or is there another DBMS that can handle such a high number of table columns?
Update
Note that all the columns will have values for all the rows.
Don't.
Even if you can make it work, it will be very slow and unwieldy.
Instead, you should make a separate table with columns for PatientID, Feature, and Value.
This table would have one row for each cell in your proposed table.
It also makes it possible to add additional information about each patient-feature pair.
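As a rough sketch of that layout, here is the same idea in Python with an in-memory SQLite database. The table name PatientFeature and the NOT NULL / primary key constraints are my own assumptions, and the data is just the two truncated rows from the question.

```python
import sqlite3

# Two wide rows from the question, truncated to the features shown there.
wide_rows = {
    1: {"Feature1": 2.3543, "Feature2": 10.5454},
    2: {"Feature1": 13.4312, "Feature2": 1.3432},
}

conn = sqlite3.connect(":memory:")
# Hypothetical table name; one row per (patient, feature) cell.
conn.execute(
    """
    CREATE TABLE PatientFeature (
        PatientID INTEGER NOT NULL,
        Feature   TEXT    NOT NULL,
        Value     REAL    NOT NULL,
        PRIMARY KEY (PatientID, Feature)
    )
    """
)

# Flatten every cell of the wide table into a (PatientID, Feature, Value) row.
long_rows = [
    (patient_id, feature, value)
    for patient_id, features in wide_rows.items()
    for feature, value in features.items()
]
conn.executemany("INSERT INTO PatientFeature VALUES (?, ?, ?)", long_rows)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM PatientFeature").fetchone()[0]
```

With the real data, each sample would contribute about a million rows instead of two, which is a shape relational databases are built for.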
You'd normally split (normalize) the tables:
Sample: ID, PatientID
Feature: ID, Name
SampleFeature: SampleID, FeatureID, value
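A minimal sketch of those three tables, again using Python's sqlite3 module so it can be run as-is. The column types, constraints, and the PatientID values 101 and 102 are assumptions for illustration only; the feature values come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces REFERENCES constraints when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Sample (
        ID        INTEGER PRIMARY KEY,
        PatientID INTEGER NOT NULL
    );
    CREATE TABLE Feature (
        ID   INTEGER PRIMARY KEY,
        Name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE SampleFeature (
        SampleID  INTEGER NOT NULL REFERENCES Sample(ID),
        FeatureID INTEGER NOT NULL REFERENCES Feature(ID),
        Value     REAL    NOT NULL,
        PRIMARY KEY (SampleID, FeatureID)
    );
    """
)

# Illustrative data: patient IDs are made up, values are from the question.
conn.executemany("INSERT INTO Sample (ID, PatientID) VALUES (?, ?)",
                 [(1, 101), (2, 102)])
conn.executemany("INSERT INTO Feature (ID, Name) VALUES (?, ?)",
                 [(1, "Feature1"), (2, "Feature2")])
conn.executemany("INSERT INTO SampleFeature VALUES (?, ?, ?)",
                 [(1, 1, 2.3543), (1, 2, 10.5454),
                  (2, 1, 13.4312), (2, 2, 1.3432)])
conn.commit()

# Join back to get named feature values for one sample.
named_values = conn.execute(
    """
    SELECT s.PatientID, f.Name, sf.Value
    FROM SampleFeature AS sf
    JOIN Sample  AS s ON s.ID = sf.SampleID
    JOIN Feature AS f ON f.ID = sf.FeatureID
    WHERE s.ID = ?
    ORDER BY f.ID
    """,
    (1,),
).fetchall()
```

Keeping feature names in their own table means the million names are stored once, and SampleFeature only carries two integers and a value per row.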
SQL databases can't handle a lot of columns, but they can handle a lot of rows.
Try rearranging your table to:
CREATE TABLE MicroarrayData (
SampleID INTEGER,
FeatureID INTEGER,
Value REAL,
PRIMARY KEY (SampleID, FeatureID)
);
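If you still need a wide row for a handful of features, you can pivot on the way out. The following is only a sketch using Python's sqlite3 module and conditional aggregation; the `wanted` list and the generated column aliases are my own naming, and the values are the ones shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE MicroarrayData (
        SampleID  INTEGER,
        FeatureID INTEGER,
        Value     REAL,
        PRIMARY KEY (SampleID, FeatureID)
    )
    """
)
conn.executemany(
    "INSERT INTO MicroarrayData VALUES (?, ?, ?)",
    [
        (1, 1, 2.3543), (1, 2, 10.5454), (1, 1000000, 5.34333),
        (2, 1, 13.4312), (2, 2, 1.3432), (2, 1000000, 40.23422),
    ],
)
conn.commit()

# Feature IDs to turn back into columns (hypothetical selection).
wanted = [1, 1000000]

# One MAX(CASE ...) column per requested feature; int() guards the
# generated SQL text, while the IN list uses bound parameters.
select_cols = ", ".join(
    f"MAX(CASE WHEN FeatureID = {int(fid)} THEN Value END) AS Feature{int(fid)}"
    for fid in wanted
)
placeholders = ", ".join("?" for _ in wanted)
query = (
    f"SELECT SampleID, {select_cols} "
    f"FROM MicroarrayData "
    f"WHERE FeatureID IN ({placeholders}) "
    f"GROUP BY SampleID ORDER BY SampleID"
)
wide = conn.execute(query, wanted).fetchall()
```

The composite primary key (SampleID, FeatureID) also serves as the index for lookups by sample; if you mostly query by feature across samples, an extra index on FeatureID is probably worth testing.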