Maximum (usable) number of rows in a Postgresql table

Tags: postgresql

I realize that, per the Pg docs (http://www.postgresql.org/about/), one can store an unlimited number of rows in a table. However, what is the "rule of thumb" for a usable number of rows, if there is one?

Background: I want to store daily readings for a couple of decades for 13 million cells. That works out to 13 M * (366|365) * 20 ~ 9.5e10, or 95 B rows (in reality, around 120 B rows).

So, using table partitioning, I set up a master table, and then inherited tables by year. That divvies up the rows to ~ 5.2 B rows per table.
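The yearly inheritance scheme described above could be generated along the following lines. This is only a sketch: the parent table name `readings` and the date column `reading_date` are hypothetical, since the question does not give the real schema. It only builds the DDL strings and does not connect to a database.

```python
def yearly_partition_ddl(parent: str, date_col: str, year: int) -> str:
    """Return DDL for one inheritance-based child table covering a single year.

    `parent` and `date_col` are placeholder names, not taken from the question.
    """
    return (
        f"CREATE TABLE {parent}_{year} (\n"
        f"    CHECK ({date_col} >= DATE '{year}-01-01'"
        f" AND {date_col} < DATE '{year + 1}-01-01')\n"
        f") INHERITS ({parent});"
    )


if __name__ == "__main__":
    # Print the child-table DDL for a few example years.
    for year in range(1990, 1993):
        print(yearly_partition_ddl("readings", "reading_date", year))
```

The CHECK constraint on each child is what lets the planner skip tables that cannot match a date filter, provided constraint exclusion is enabled.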

Each row is 9 SMALLINTs and two INTs, so 26 bytes. Add to that the Pg overhead of 23 bytes per row, and we get 49 bytes per row. So each table, without a PK or any other index, will weigh in at ~ 0.25 TB.
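The arithmetic above can be checked with a few lines of Python. All figures come from the question itself; the constant and function names are made up for this sketch. The result is probably best read as a lower bound, since alignment padding, per-row item pointers and page headers add to the on-disk size.

```python
CELLS = 13_000_000
YEARS = 20
DAYS_PER_YEAR = 365

PAYLOAD_BYTES = 9 * 2 + 2 * 4   # 9 SMALLINTs (2 bytes) + 2 INTs (4 bytes) = 26
OVERHEAD_BYTES = 23             # per-row overhead figure quoted in the question
ROW_BYTES = PAYLOAD_BYTES + OVERHEAD_BYTES


def table_size_tb(rows: int, row_bytes: int = ROW_BYTES) -> float:
    """Estimated heap size in decimal terabytes, ignoring indexes and padding."""
    return rows * row_bytes / 1e12


total_rows = CELLS * DAYS_PER_YEAR * YEARS   # about 9.5e10 rows
rows_per_yearly_table = 5_200_000_000        # per-table figure from the question

if __name__ == "__main__":
    print(f"total rows: {total_rows:,}")
    print(f"bytes per row: {ROW_BYTES}")
    print(f"one yearly table: {table_size_tb(rows_per_yearly_table):.2f} TB")
```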

For starters, I have loaded only a subset of the above data, that is, only about 250,000 cells. I still have to do a bunch of tuning (create proper indexes, etc.), but the performance is really terrible right now. Besides, every time I need to add more data, I will have to drop the keys and then recreate them. The saving grace is that once everything is loaded, it will be a read-only database.

Any suggestions? Any other strategy for partitioning?

punkish asked Jun 28 '10 12:06


1 Answer

It's not just "a bunch of tuning (indexes etc.)". It is crucial, and a must-do.

You posted few details, but let's try.

The rule is: try to find the most common working set and see whether it fits in RAM. Optimize the hardware, the PG/OS buffer settings, and the PG indexes/clustering for it. If it does not fit, look at precomputed aggregates; if those are not acceptable and you need fully random access, think about what hardware could scan the whole table for you in a reasonable time.

How large is your table (in gigabytes)? How does it compare to total RAM? What are your PG settings, including shared_buffers and effective_cache_size? Is this a dedicated server? If you have a 250 GB table and about 10 GB of RAM, you can only fit 4% of the table in memory.
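As a quick check of the 4% figure, here is a minimal sketch. The function name is hypothetical, and it optimistically assumes all RAM is available for caching the table, which a real server would not quite achieve.

```python
def cached_fraction(table_gb: float, ram_gb: float) -> float:
    """Upper bound on the share of a table that can sit in RAM at once."""
    return min(1.0, ram_gb / table_gb)


if __name__ == "__main__":
    # The example from the answer: a 250 GB table and 10 GB of RAM.
    print(f"{cached_fraction(250, 10):.0%} of the table fits in memory")
```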

Are there any columns which are commonly used for filtering, such as state or date? Can you identify the working set that is most commonly used (like only last month)? If so, consider partitioning or clustering on these columns, and definitely index them. Basically, you're trying to make sure that as much of the working set as possible fits in RAM.

Avoid scanning the table at all costs if it does not fit in RAM. If you really need absolutely random access, the only way it could be usable is really sophisticated hardware. You would need a persistent storage/RAM configuration which can read 250 GB in reasonable time.
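To get a feel for what "reasonable time" means for a full scan, a back-of-the-envelope estimate helps. The throughput values below are assumptions chosen purely for illustration, not measurements of any particular hardware, and the function name is made up for this sketch.

```python
def full_scan_seconds(table_gb: float, read_mb_per_s: float) -> float:
    """Time to read the whole table once at a sustained sequential read rate."""
    return table_gb * 1000 / read_mb_per_s


if __name__ == "__main__":
    # Hypothetical sustained read rates in MB/s; substitute measured values.
    for rate in (100, 500, 2000):
        minutes = full_scan_seconds(250, rate) / 60
        print(f"{rate:>5} MB/s -> {minutes:.1f} min per full scan of 250 GB")
```

Under these assumptions a single pass over one 250 GB table takes anywhere from a couple of minutes to most of an hour, which is why the answer stresses keeping the working set in RAM.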

Konrad Garus answered Oct 19 '22 04:10