 

Which embedded database capable of 100 million records has an efficient C or C++ API

I'm looking for a cross-platform database engine that can handle databases of up to hundreds of millions of records without severe degradation in query performance. It needs to have a C or C++ API that allows easy, fast construction of records and parsing of returned data.

Highly discouraged are products where data has to be translated to and from strings just to get it into the database. Technical users storing things like IP addresses don't want or need this overhead. This is a very important criterion, so if you're going to refer to products, please be explicit about how they offer such a direct API. Not wishing to be rude, but I can use Google - please assume I've found most mainstream products, and I'm asking because it's often hard to work out just what direct API they offer, rather than just a C wrapper around SQL.

It does not need to be an RDBMS - a simple ISAM record-oriented approach would be sufficient.

Whilst the primary need is for a single-user database, expansion to some kind of shared file or server operations is likely for future use.

Access to source code, either open source or via licensing, is highly desirable if the database comes from a small company. It must not be GPL or LGPL.

asked Feb 08 '09 by Andy Dent



3 Answers

You might consider C-Tree by FairCom - tell 'em I sent you ;-)

answered Oct 11 '22 by Steven A. Lowe


I'm the author of hamsterdb.

Tokyo Cabinet and Berkeley DB should work fine. hamsterdb definitely will work. It's a plain C API, open source, platform independent, very fast, and tested with databases up to several hundred GB and hundreds of millions of items.

If you are willing to evaluate it and need support, then drop me a mail (contact form on hamsterdb.com) - I will help as well as I can!

Bye, Christoph

answered Oct 11 '22 by cruppstahl


You didn't mention what platform you are on, but if Windows only is OK, take a look at the Extensible Storage Engine (previously known as Jet Blue), the embedded ISAM table engine included in Windows 2000 and later. It's used for Active Directory, Exchange, and other internal components, optimized for a small number of large tables.

It has a C interface and supports binary data types natively. It supports indexes, transactions and uses a log to ensure atomicity and durability. There is no query language; you have to work with the tables and indexes directly yourself.

ESE doesn't like to open files over a network, and doesn't support sharing a database through file sharing. You're going to be hard pressed to find any database engine that supports sharing through file sharing. The Access Jet database engine (AKA Jet Red, totally separate code base) is the only one I know of, and it's notorious for corrupting files over the network, especially if they're large (>100 MB).

Whatever engine you use, you'll most likely have to implement the shared-usage functionality yourself in your own network server process, or use a separate client/server database engine.

answered Oct 11 '22 by Chris Smith