I have a PostgreSQL database for a software-as-a-service product with hundreds of customers. Currently I have a separate PostgreSQL schema for each customer, but I'd like a better solution because the number of customers is increasing rapidly. I've read about Cassandra, but I don't want to lose the integrity of primary keys, foreign keys, and checks. I've also read about PostgreSQL in distributed systems, but I don't know the best way to implement this currently.
I don't want to lose the integrity of primary keys, foreign keys, and checks
The point of systems like Cassandra is that once your dataset or workload doesn't fit on a single machine, you have to give up those things anyway, even if you stay on PostgreSQL. (I covered the details in a talk that I highly recommend: http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2010-what-every-developer-should-know-about-database-scalability-21-3280648).
So Cassandra is an answer to the question, "If we know we're going to have to give up foreign keys and joins, what can we build by rethinking how we design our database?"
If you never get to that point, Cassandra is overkill. (But you should still watch that talk. :)
There are four levels at which you can separate your customers:
Run a separate PostgreSQL cluster for each customer. This provides maximum separation; each client is on a separate port with its own set of system tables, transaction log, etc.
Put each customer in a separate database in the same cluster. This way they each have a separate login, but on the same port number, and they share global tables like pg_database.
Give each customer a separate schema in the same database. This doesn't require separate user IDs if they are only connecting through your software, because you can just set the search_path. Of course you can use separate user IDs if you prefer.
Make customer_id part of the primary key of each table, and be sure to limit by that in your software. This is likely to scale better than having duplicate tables for each of hundreds of users, but you must be very careful to always qualify your queries by customer_id.
Some people have been known to combine these techniques, for example, limiting each cluster to 100 databases with a separate database for each customer.
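As a rough sketch, options 3 and 4 might look like this in SQL (the table, schema, and column names here are just illustrative, not from your application):

```sql
-- Option 3: one schema per customer, selected via search_path.
CREATE SCHEMA customer_42;
CREATE TABLE customer_42.orders (id serial PRIMARY KEY, total numeric);

SET search_path TO customer_42;
SELECT * FROM orders;  -- resolves to customer_42.orders

-- Option 4: shared tables, with customer_id in every primary key.
CREATE TABLE orders (
    customer_id integer NOT NULL,
    order_id    integer NOT NULL,
    total       numeric,
    PRIMARY KEY (customer_id, order_id)
);

-- Every query must then be qualified by customer_id:
SELECT * FROM orders WHERE customer_id = 42;
```

With option 4 the foreign keys between shared tables also become composite (referencing both customer_id and the row key), which is part of why you must be disciplined about qualifying every statement.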
Without more detail it's hard to know which configuration will be best for your situation, except to say that if you want to allow users direct access to the database without going through your software, you need to think about what is visible in the system tables under each option. Look at pg_database, pg_user, and pg_class from a user's perspective to see what is exposed.
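For example, connected as an ordinary user, you might run queries like these to see what a tenant could learn about other tenants (a quick sketch against the standard system catalogs):

```sql
-- Databases in the cluster are visible to every connected role:
SELECT datname FROM pg_database;

-- So are all role names:
SELECT usename FROM pg_user;

-- Ordinary tables in the current database, with their schemas:
SELECT relname, relnamespace::regnamespace AS schema
  FROM pg_class
 WHERE relkind = 'r';
```

If any of these reveal customer names you consider sensitive (for example, a schema or database named after each client), that may rule out the corresponding separation level for direct-access scenarios.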