
Postgres Out of Memory

Tags:

postgresql

I am setting up a new install of PostgreSQL 10.4, and during load testing I keep receiving out-of-memory errors.

Configuration:

$ cat /etc/security/limits.conf
postgres   hard    memlock 508559360
postgres   soft    memlock 508559360


$ cat /etc/sysctl.conf
vm.nr_hugepages = 248320
vm.hugetlb_shm_group = 118
vm.overcommit_memory=2
vm.swappiness=1
vm.vfs_cache_pressure=50
kernel.sem = 250 32000 32 128
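
As a quick sanity check on those numbers (my arithmetic, not part of the original configuration): 248320 huge pages at the default 2 MB page size reserve exactly the amount that the memlock limit above allows, about 485 GiB.

$ echo $(( 248320 * 2048 ))             # huge-page reservation in kB
508559360
$ echo $(( 508559360 / 1024 / 1024 ))   # same value in GiB, matching the memlock limit
485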

System Specs:

$ cat /proc/cpuinfo | grep "core id" | wc -l
32

Each CPU is an Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz. 16 physical cores, 32 logical as shown above.
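
Side note: /proc/cpuinfo prints one block per logical CPU, each with its own "core id" line, so the pipeline above counts logical CPUs (32 with hyper-threading). For the socket/core/thread breakdown directly, something like this works:

$ lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\))'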

$ free -m
             total        used        free      shared  buff/cache   available
Mem:         510969       498048      2434      3098    10485        8343
Swap:        1071         1071        0

Note: the memlock limit is set to approximately 485 GB, dedicated directly to Postgres, which shows up in the high "used" column.

$ df -h /dev/shm
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           250G   64K  250G   1% /dev/shm

$ cat /proc/meminfo 
MemTotal:       523232496 kB
MemFree:         1216648 kB
MemAvailable:    8348808 kB
Buffers:           65220 kB
Cached:         10211780 kB
SwapCached:         7620 kB
Active:          5556092 kB
Inactive:        5432428 kB
Active(anon):    2208064 kB
Inactive(anon):  1593084 kB
Active(file):    3348028 kB
Inactive(file):  3839344 kB
Unevictable:       24128 kB
Mlocked:           24128 kB
SwapTotal:       1097724 kB
SwapFree:              0 kB
Dirty:                52 kB
Writeback:             0 kB
AnonPages:        728044 kB
Mapped:            62904 kB
Shmem:           3085592 kB
Slab:            1454276 kB
SReclaimable:    1345008 kB
SUnreclaim:       109268 kB
KernelStack:       11984 kB
PageTables:        22856 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    15770860 kB
Committed_AS:    7920136 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:     71680 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:   248320
HugePages_Free:    245883
HugePages_Rsvd:     1800
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      329600 kB
DirectMap2M:    534444032 kB
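
The CommitLimit line above is the one to stare at. With vm.overcommit_memory=2, the kernel caps committable memory at roughly (MemTotal - huge-page reservation) * vm.overcommit_ratio / 100 + SwapTotal, and huge pages are excluded from that pool. Plugging in the numbers shown (my arithmetic, and it only reproduces the reported CommitLimit if vm.overcommit_ratio is 100, which is not listed in the sysctl output above):

$ echo $(( (523232496 - 248320 * 2048) * 100 / 100 + 1097724 ))   # kB
15770860

If that formula holds, everything that is not huge-page-backed shared memory (per-backend private memory, work_mem, forked workers, even mapping auto_explain.so) has to fit into roughly 15 GB of committable space, which would explain an out-of-memory condition long before physical RAM is exhausted.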

Postgres Info:

SELECT version();
PostgreSQL 10.4 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609, 64-bit

max_connections=600   
shared_buffers= '8192 MB'
work_mem = '20 MB'
maintenance_work_mem = 2GB
max_parallel_workers = 8
wal_buffers = 16MB
max_wal_size = 20GB
min_wal_size = 1GB
checkpoint_completion_target = 0.9
effective_cache_size = '364 GB'
default_statistics_target = 1000
log_timezone = 'US/Eastern'
track_activities = on
track_counts = on
track_io_timing = on
stats_temp_directory = 'pg_stat_tmp'
datestyle = 'iso, mdy'
timezone = 'US/Eastern'
default_text_search_config = 'pg_catalog.english'
transform_null_equals = on
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 16384
track_functions = all
track_io_timing = true
pg_stat_statements.track = all
session_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '3s'
auto_explain.log_nested_statements='on'
auto_explain.log_analyze=true

Note: I started with work_mem set as high as 250 MB and have slowly brought it down to 20 MB, and I still receive the errors. I've also verified that connections never go above 120. We use PgBouncer in front of the instance in session mode.
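
For reference, a minimal way to double-check the live connection count while the load test runs (assuming psql access as a superuser; mydb is the database name that appears in the logs below):

$ psql -U postgres -d mydb -Atc "SELECT count(*) FROM pg_stat_activity;"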

The full errors are too large for Stack Overflow, but they are linked here:

https://codepad.co/snippet/derPU4E8

The standout errors are:

2018-06-18 15:02:22 EDT,28197,mydb,ERROR:  could not resize shared memory segment "/PostgreSQL.1552129380" to 192088 bytes: No space left on device

I don't understand what device it's talking about. I don't have any device that appears to be even close to full, let alone unable to fit the tiny 192088 bytes it mentions. (A small monitoring sketch follows the error excerpts below.)

2018-06-18 15:02:22 EDT,19708,mydb,ERROR:  out of memory
2018-06-18 15:02:22 EDT,19708,mydb,DETAIL:  Failed on request of size 7232.


2018-06-18 15:02:22 EDT,16688,,LOG:  could not fork worker process: Cannot allocate memory
2018-06-18 15:02:22 EDT,4555,,ERROR:  out of memory
2018-06-18 15:02:22 EDT,4555,,DETAIL:  Failed on request of size 78336.
2018-06-18 15:02:22 EDT,4552,,LOG:  could not open directory "/usr/lib/postgresql/10/share/timezone": Cannot allocate memory
2018-06-18 15:02:22 EDT,19935,mydb,ERROR:  could not load library "/usr/lib/postgresql/10/lib/auto_explain.so": /usr/lib/postgresql/10/lib/auto_explain.so: failed to map segment from shared object


2018-06-18 15:02:22 EDT,28193,mydb,ERROR:  out of memory
2018-06-18 15:02:22 EDT,28193,mydb,DETAIL:  Failed on request of size 8192.
2018-06-18 15:02:22 EDT,26927,mydb,ERROR:  could not resize shared memory segment "/PostgreSQL.1101931262" to 192088 bytes: No space left on device
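
On the "device" question: segments named /PostgreSQL.NNNNN are PostgreSQL's dynamic shared memory segments, which (with dynamic_shared_memory_type = posix, the Linux default) live in tmpfs under /dev/shm, so that is the "device" in the message. Under vm.overcommit_memory=2 the resize can be refused with ENOSPC ("No space left on device") once the commit limit is hit, even while df shows /dev/shm nearly empty. A minimal sketch for watching both during the load test (my suggestion, not from the original post):

$ psql -U postgres -Atc "SHOW dynamic_shared_memory_type;"
$ watch -n 1 'df -h /dev/shm; grep -i commit /proc/meminfo'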

Question: How do I debug this issue and more importantly how can I resolve it?

asked Jun 18 '18 by user3186332

People also ask

How do I free up space in PostgreSQL?

If you want to free up space on the file system, either VACUUM FULL or CLUSTER can help you. You may also want to run ANALYZE afterwards to make sure the planner has up-to-date statistics, but this is not strictly required.
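
A minimal example of that sequence, using a hypothetical table big_table in a database named mydb:

$ psql -d mydb -c "VACUUM FULL big_table;"
$ psql -d mydb -c "ANALYZE big_table;"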

Why is Postgres using so much memory?

PostgreSQL connections consume memory and CPU resources even when idle. As queries are run on a connection, memory gets allocated. This memory isn't completely freed up even when the connection goes idle.

How much RAM is needed for PostgreSQL?

2 GB is a common recommendation for memory you can allocate to PostgreSQL outside of the operating system. Even if you have a small data set, you are still going to want enough memory to cache the majority of your hot data (you can use pg_buffercache to determine your hot data).
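
A rough sketch of using pg_buffercache to see which relations dominate shared_buffers (the extension must be installed first; the database name is hypothetical):

$ psql -d mydb -c "CREATE EXTENSION IF NOT EXISTS pg_buffercache;"
$ psql -d mydb -c "
    SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
    JOIN pg_class c
      ON b.relfilenode = pg_relation_filenode(c.oid)
     AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                               WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 10;"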

How do I flush the buffer cache in PostgreSQL?

PostgreSQL has no built-in functionality to clear its cache from memory. To clear the database-level cache, you need to restart the whole instance, and to clear the operating system cache, you need to use operating system utilities.
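
A hedged sketch of that procedure on a systemd-based Ubuntu host (the service name may differ on other setups):

$ sudo systemctl restart postgresql                      # clears shared_buffers
$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches     # clears the OS page cache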


1 Answer

I followed Laurenz's advice and no longer run out of memory. I set:

vm.overcommit_ratio = 100

Additionally, I removed the memlock and huge-pages definitions from sysctl.conf and limits.conf.
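
For anyone landing here later, the resulting kernel settings would look roughly like this (my reconstruction based on the question's sysctl.conf, with the huge-page lines dropped; exact contents may differ):

# /etc/sysctl.conf -- vm.nr_hugepages and vm.hugetlb_shm_group removed
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
vm.swappiness = 1
vm.vfs_cache_pressure = 50
kernel.sem = 250 32000 32 128

$ sudo sysctl -p    # reload the settings, then restart PostgreSQL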

answered Oct 22 '22 by user3186332