
What is the correct way to get a consistent snapshot of /proc/pid/smaps?

Tags: linux, procfs

I am trying to parse the PSS value from /proc/<pid>/smaps of a process in my C++ binary.

According to this SO answer, naively reading the /proc/<pid>/smaps file, for example with ifstream::getline(), will result in an inconsistent dataset. The suggested solution is to use the read() system call to read all the data in one go, something like:

#include <fcntl.h>
#include <unistd.h>
#include <iostream>

...

char rawData[102400];
int file = open("/proc/12345/smaps", O_RDONLY); // returns -1 on failure

// leave one byte free for the null terminator added before parsing
ssize_t bytesRead = read(file, rawData, sizeof(rawData) - 1); // this returns 3722 instead of expected ~64k
close(file);

std::cout << bytesRead << std::endl;

// do some parsing here after null-terminating the buffer

...

My problem now is that despite my using a 100 kB buffer, only 3722 bytes are returned. Looking with strace at what cat does when reading the file, I see that it makes multiple calls to read() (also getting around 3k bytes on every read) until read() returns 0, as described in the documentation of read():

...
read(3, "7fa8db3d7000-7fa8db3d8000 r--p 0"..., 131072) = 3588
write(1, "7fa8db3d7000-7fa8db3d8000 r--p 0"..., 3588) = 3588
read(3, "7fa8db3df000-7fa8db3e0000 r--p 0"..., 131072) = 3632
write(1, "7fa8db3df000-7fa8db3e0000 r--p 0"..., 3632) = 3632
read(3, "7fa8db3e8000-7fa8db3ed000 r--s 0"..., 131072) = 3603
write(1, "7fa8db3e8000-7fa8db3ed000 r--s 0"..., 3603) = 3603
read(3, "7fa8db41d000-7fa8db425000 r--p 0"..., 131072) = 3445
write(1, "7fa8db41d000-7fa8db425000 r--p 0"..., 3445) = 3445
read(3, "7fff05467000-7fff05496000 rw-p 0"..., 131072) = 2725
write(1, "7fff05467000-7fff05496000 rw-p 0"..., 2725) = 2725
read(3, "", 131072)                     = 0
munmap(0x7f8d29ad4000, 139264)          = 0
close(3)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

But isn't this supposed to produce inconsistent data according to the SO answer linked above?

I have also found some information about proc here, which seems to support the previous SO answer:

To see a precise snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table.

Then later in the text it says:

Note: reading /proc/PID/maps or /proc/PID/smaps is inherently racy (consistent output can be achieved only in the single read call). This typically manifests when doing partial reads of these files while the memory map is being modified. Despite the races, we do provide the following guarantees:

1) The mapped addresses never go backwards, which implies no two regions will ever overlap.

2) If there is something at a given vaddr during the entirety of the life of the smaps/maps walk, there will be some output for it.
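Guarantee 1) is something that can be checked on a captured dump. A minimal sketch, assuming the whole smaps text is already in a string; `mappingsAreOrdered` is a hypothetical helper name, and passing this check does not prove that the per-mapping numbers were all taken at the same instant:

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Returns true if every mapping header in `smaps` starts at or after the end
// of the previous mapping, i.e. addresses never go backwards or overlap.
bool mappingsAreOrdered(const std::string& smaps) {
    std::istringstream in(smaps);
    std::string line;
    unsigned long long prevEnd = 0;
    while (std::getline(in, line)) {
        unsigned long long start = 0, end = 0;
        char perms[5] = {0};
        // Header lines look like "7fa8db3d7000-7fa8db3d8000 r--p ...";
        // field lines such as "Pss: 4 kB" do not match this pattern.
        if (std::sscanf(line.c_str(), "%llx-%llx %4s", &start, &end, perms) != 3)
            continue;
        if (start < prevEnd || end < start)
            return false;
        prevEnd = end;
    }
    return true;
}
```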

So it seems to me that I can only trust the data if I get it in a single read() call, which returns only a small chunk of data even though the buffer is big enough. That in turn means there is actually no way to get a consistent snapshot of /proc/<pid>/smaps, and the data returned by cat (or by multiple read() calls) may be garbage depending on the sun to moon light ratio?

Or does 2) actually mean I'm too hung up on the previous SO answer listed above?

Soukyuu asked Jan 14 '20 16:01


1 Answer

You are being limited by the internal kernel buffer size in fs/seq_file.c, which is used to generate many /proc files.

The buffer is first set to the size of a page, then grown exponentially until it fits at least one record, and then filled with as many whole records as will fit. It is not grown any further once it can hold the first record. When the internal buffer cannot fit any more records, the read ends.

OhJeez answered Oct 09 '22 00:10