 

Understanding PostgreSQL shared memory

Tags:

sql

postgresql

I've watched the presentation and still have one question about how shared buffers work. As slide 16 shows, when the server handles an incoming request, the postmaster process calls fork() to create a child process to handle it. Here is a picture from there: [image: slide 16 diagram]

So, we have an entire copy of the postmaster process, except for its pid. Now, if the child process updates some data belonging to shared memory (putting it in shared buffers, as shown on slide 17), the other processes need to be made aware of the changes. The picture: [image: slide 17 diagram]

The synchronization is what I don't understand. Each process owns a copy of the shared memory, and at the time of copying it doesn't know whether another process will write something to its own copy of the shared memory. What if, after proc1 is created by calling fork(), another process proc2 is created a little later and starts writing something into its copy of the shared memory?

Question: How does proc1 know what to do with the parts of the shared memory that are being modified by proc2?

St.Antario asked Oct 04 '15 06:10




2 Answers

The crucial thing to understand is that there are two different types of memory sharing used.

One is the copy-on-write sharing used by fork() (without exec()), where the child process inherits the parent process's memory and state. In this case, when the child or the parent modifies anything, a new private copy of the modified memory page is allocated. So the child doesn't see changes made by the parent after fork(), and the parent doesn't see changes made by the child after fork(). Peer children cannot see each other's changes either. They're all isolated as far as memory is concerned; they just share a common ancestor.

That memory is what's shown in the Program (text), data and stack sections of the diagram.
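As a minimal sketch of that isolation (not PostgreSQL code; the function name and the dictionary are made up for illustration), the following Python snippet forks a child that modifies inherited data. Assuming a Unix-like system where os.fork() is available, the parent still sees its original value afterwards:

```python
import os


def child_write_is_private():
    """Fork, let the child modify inherited data, return what the parent sees."""
    # Ordinary process memory, inherited by the child at fork() time.
    data = {"value": "set by parent before fork"}

    pid = os.fork()
    if pid == 0:
        # Child: this write only affects the child's private copy of the memory.
        try:
            data["value"] = "changed by child"
        finally:
            os._exit(0)  # exit the child immediately, skipping parent cleanup

    # Parent: wait for the child to finish, then look at our own copy.
    os.waitpid(pid, 0)
    return data["value"]


if __name__ == "__main__":
    print(child_write_is_private())
```

The parent's value is unchanged because the child's write went to a separate copy of the memory, which is why this kind of inherited memory cannot be used to pass data between backends.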

Because of that isolation, PostgreSQL also uses POSIX shared memory or, in older versions, System V shared memory. These are explicitly shared memory segments that are mapped to a range of addresses. Each process sees the same memory, and it is not copy-on-write. It's fully read/write shared.

This is what is shown in the purple "shared memory" section of the diagram.
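For contrast with inherited memory, here is a hedged sketch of an explicitly shared region. It uses Python's mmap module with an anonymous mapping, which on Unix defaults to a shared (MAP_SHARED) mapping. This is only a stand-in for illustration, not how PostgreSQL itself sets up its segment, and the function name is hypothetical:

```python
import mmap
import os


def child_write_is_shared():
    """Fork, let the child write into a shared mapping, return what the parent reads."""
    # Anonymous mapping (fileno -1). On Unix the default flag is MAP_SHARED,
    # so the region is shared with children rather than copied on write.
    shm = mmap.mmap(-1, mmap.PAGESIZE)
    shm[:6] = b"parent"

    pid = os.fork()
    if pid == 0:
        # Child: writes go to the same physical pages the parent has mapped.
        try:
            shm[:6] = b"child!"
        finally:
            os._exit(0)

    os.waitpid(pid, 0)
    result = bytes(shm[:6])
    shm.close()
    return result


if __name__ == "__main__":
    print(child_write_is_shared())
```

Here the parent reads the bytes the child wrote, because both processes have the same pages mapped read/write.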

POSIX shared memory is used for inter-process communication: for locking, for shared_buffers, and so on. The memory inherited from fork()ing is not used for this.
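To illustrate why locking goes together with shared memory, the sketch below has several forked workers increment one counter stored in a shared mapping, guarded by a lock. The lock here is Python's multiprocessing.Lock, chosen purely for illustration; PostgreSQL uses its own lock primitives, and the function name and parameters are assumptions:

```python
import mmap
import multiprocessing
import os
import struct


def shared_counter(workers=4, increments=1000):
    """Let several forked workers increment one counter in shared memory."""
    # Shared anonymous mapping holding a single 64-bit counter at offset 0.
    shm = mmap.mmap(-1, mmap.PAGESIZE)
    struct.pack_into("q", shm, 0, 0)

    # Illustrative cross-process lock, created before fork() so that
    # every child inherits a handle to the same underlying semaphore.
    lock = multiprocessing.get_context("fork").Lock()

    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            try:
                for _ in range(increments):
                    # Read-modify-write must be atomic with respect to peers.
                    with lock:
                        (n,) = struct.unpack_from("q", shm, 0)
                        struct.pack_into("q", shm, 0, n + 1)
            finally:
                os._exit(0)
        pids.append(pid)

    for pid in pids:
        os.waitpid(pid, 0)

    (total,) = struct.unpack_from("q", shm, 0)
    shm.close()
    return total


if __name__ == "__main__":
    print(shared_counter())
```

With the lock in place, the final count should equal workers × increments. Without it, concurrent read-modify-write cycles could overwrite each other and lose updates, which is the kind of conflict between proc1 and proc2 that the question asks about.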

While memory from fork() is often shared copy-on-write, that's really an operating system implementation detail. The operating system could choose not to share it at all, and make an immediate copy of the parent's whole address space for the child at fork time. The only place the copy-on-write sharing is really relevant is when interpreting memory usage figures in tools like top.

When PostgreSQL refers to "shared memory" it's always talking about the POSIX or System V shared memory block(s) that are mapped into each process's address space. Not copy-on-write sharing from fork().

Craig Ringer answered Oct 13 '22 19:10


I don't know about this specific case, but in general, on Linux and most other operating systems, process creation is sped up as follows: when a process asks the operating system to create a new process, the OS creates the new one with minimum requirements (specifically in DB applications) and shares most of the parent's memory space with the child. When the child then wants to modify some part of that shared memory, the OS uses the copy-on-write (COW) concept and creates a new copy of that part of the memory for the child process to use. That part then becomes private to the child process and is no longer shared with the parent process.

Mahmoud answered Oct 13 '22 20:10