
Just what is 'A big database'? [closed]

Tags:

database

OK, I know this is a dumb question, but I keep seeing the nebulous phrase 'a large database', as well as 'small' and 'medium', and I wonder just what that means. Can someone define what a small, medium, and large database is for us SQL neophytes?

Randin asked Mar 15 '09 03:03



2 Answers

There is no fixed threshold at which a small database becomes medium, or a medium database becomes large. When I hear these terms, I generally think in orders of magnitude of total records stored, judged by where the data comfortably fits:

  • Small: Fits in a spreadsheet.
  • Medium: Fits in memory on a commodity server.
  • Large: Fits in a commodity cloud offering.
  • Very large: Fits in a specialized environment; unusual storage, latency, or throughput characteristics.
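
As a rough illustration of this rule of thumb, here is a minimal Python sketch. The function name `classify` and every threshold except the spreadsheet row limit (1,048,576 rows in one Excel worksheet) are assumptions picked for illustration, not figures from this answer; adjust them to your own hardware and provider.

```python
# Thresholds are illustrative assumptions, not values given in the answer.
SPREADSHEET_MAX_ROWS = 1_048_576         # row limit of a single Excel worksheet
SERVER_MEMORY_BYTES = 256 * 1024**3      # assumed RAM of a "commodity server" (256 GiB)
CLOUD_OFFERING_BYTES = 64 * 1024**4      # assumed cap of a managed cloud database (64 TiB)


def classify(row_count: int, size_bytes: int) -> str:
    """Return a rough size label based on where the data would fit."""
    if row_count <= SPREADSHEET_MAX_ROWS:
        return "small"        # fits in a spreadsheet
    if size_bytes <= SERVER_MEMORY_BYTES:
        return "medium"       # fits in memory on one server
    if size_bytes <= CLOUD_OFFERING_BYTES:
        return "large"        # fits in an off-the-shelf cloud offering
    return "very large"       # needs a specialized environment


if __name__ == "__main__":
    # Hypothetical example: 100 million rows occupying 20 GiB.
    print(classify(100_000_000, 20 * 1024**3))
```

The labels are only as meaningful as the thresholds, which shift as hardware and cloud offerings change.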

As poster dkretz suggested, you could also think about it in terms of the properties each kind of database has. Categorizing it this way, I'd say:

  • Small: Performance is not a concern. Your queries run fine without any special optimizations. You see only a marginal performance difference from basic enhancements like indexes.

  • Medium: Your database probably has one or more staff members assigned part-time to its care and maintenance. These people pay attention to the database's health; their primary administrative responsibility is to prevent unacceptable performance problems and minimize downtime.

  • Large: Probably has dedicated staff member(s) whose job is to work on the database and improve performance, as well as make sure that application changes don't cause schema breakage over the lifetime of the database. Metrics about the health and status of the database are monitored closely. Significant expertise is required to understand and perform optimizations.

  • Very large: The database stores vast amounts of information that must be readily accessible. Performance optimizations are absolutely required to wring every last ounce of speed out of each query, and without them, the database would be much less usable or even impossible to use. The database may be using sophisticated or innovative replication or clustering techniques, pushing the boundaries of current technology.

Note that these are entirely subjective, and that someone may very well have a perfectly legitimate alternate definition of "large".

John Feminella answered Sep 21 '22 17:09


One way to figure it out is to observe your test queries.

A small database is one where indexes don't matter.

A medium database is one where queries take longer than one second if you don't have an appropriate index in place.

A big database is one where queries often take hours to optimize, using a combination of query design, index modification, and many test cycles.
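
To try this on your own data, you can time one query before and after adding an index. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `orders` table, its columns, the index name `idx_orders_customer`, and the row counts are all made up for illustration, and the timings you see will depend on your machine.

```python
import sqlite3
import time

# Hypothetical test query against a made-up `orders` table.
QUERY = "SELECT COUNT(*), SUM(total) FROM orders WHERE customer_id = ?"


def build_db(rows: int) -> sqlite3.Connection:
    """Create an in-memory table with `rows` synthetic orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        ((i % 1000, i * 0.5) for i in range(rows)),
    )
    conn.commit()
    return conn


def time_query(conn: sqlite3.Connection, customer_id: int) -> float:
    """Return the wall-clock seconds taken by one run of the test query."""
    start = time.perf_counter()
    conn.execute(QUERY, (customer_id,)).fetchall()
    return time.perf_counter() - start


def query_plan(conn: sqlite3.Connection, customer_id: int) -> str:
    """Return SQLite's plan for the test query (full scan vs. index search)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (customer_id,)).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


def add_index(conn: sqlite3.Connection) -> None:
    """Add an index on the column used in the WHERE clause."""
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")


if __name__ == "__main__":
    conn = build_db(200_000)
    print("no index:  ", time_query(conn, 42), query_plan(conn, 42))
    add_index(conn)
    print("with index:", time_query(conn, 42), query_plan(conn, 42))
```

If the two timings are indistinguishable at your real data volume, you are in "small" territory by this definition; if the unindexed run crosses a second, you have reached "medium".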

dkretz answered Sep 21 '22 17:09