
Is there anything like shm_open() without filename?

The POSIX shm_open() function returns a file descriptor that can be used to access shared memory. This is extremely convenient because one can use all the traditional mechanisms for controlling file descriptors to also control shared memory.

The only drawback is that shm_open() always wants a filename. So I need to do this:

// Open with a clever temp file name and hope for the best.
fd = shm_open(tempfilename, O_RDWR | O_CREAT | O_EXCL, 0600);

// Immediately delete the temp file to keep the shm namespace clean.
shm_unlink(tempfilename);

// Then keep using fd -- the shm object remains as long as there are open fds.

This use of tempfilename is difficult to do portably and reliably. The interpretation of the filename (what the namespace is, how permissions are handled) differs among systems.

In many situations the processes using the shared memory object have no need for a filename because the object can be accessed more simply and safely by just passing a file descriptor from one process to another. So is there something that's just like shm_open() but can be used without touching the shared memory filename namespace?

mmap() with MAP_ANON|MAP_SHARED is great but instead of a file descriptor it gives a pointer. The pointer doesn't survive over an exec boundary and can't be sent to another process over a Unix domain socket like file descriptors can.
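To make the contrast concrete, here is a minimal sketch of how a file descriptor can be handed to another process over a Unix domain socket using an SCM_RIGHTS control message. The helper names send_fd() and recv_fd() are made up for this illustration, and the code assumes a connected AF_UNIX socket on a system that supports SCM_RIGHTS.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper: send one fd over a connected Unix domain socket.
   Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char dummy = 'x'; /* at least one byte of ordinary data must go along */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof u);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd. Returns the new fd, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof u);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same underlying object, so it can mmap() it just like the sender can. Nothing comparable exists for a bare pointer returned by mmap().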

The file descriptor returned by shm_open() also doesn't survive an exec boundary by default: the POSIX definition says that the FD_CLOEXEC file descriptor flag associated with the new file descriptor is set. But it is possible to clear the flag using fcntl() on macOS, Linux, FreeBSD, OpenBSD, NetBSD, DragonFlyBSD and possibly other operating systems.

asked Apr 16 '19 09:04 by Lassi


2 Answers

A library to solve the problem

I managed to write a library that provides the simple interface:

int shm_open_anon(void);

The library compiles without warnings and successfully runs a test program on Linux, Solaris, macOS, FreeBSD, OpenBSD, NetBSD, DragonFlyBSD and Haiku. You may be able to adapt it to other operating systems; please send a pull request if you do.

The library returns a file descriptor with the close-on-exec flag set. You can clear that flag using fcntl() on all supported operating systems, which will allow you to pass the fd over exec(). The test program demonstrates that this works.

Implementation techniques used in the library

The readme of the library has very precise notes on what was done and what wasn't done for each OS. Here's a summary of the main points.

There are several non-portable things that are more or less equivalent to shm_open() without a filename:

  • FreeBSD can take SHM_ANON as the pathname for shm_open() since 2008.

  • Linux has had a memfd_create() system call since kernel version 3.17.

  • Earlier versions of Linux can use mkostemp(name, O_CLOEXEC | O_TMPFILE) where name is something like /dev/shm/XXXXXX. Note that we are not using shm_open() at all here -- mkostemp() is implicitly using a perfectly ordinary open() call. Linux mounts a special memory-backed file system in /dev/shm, but some distros use /run/shm instead, so there are pitfalls here. And you still have to remove the temp file afterwards, using plain unlink() because it was created as an ordinary file.

  • OpenBSD has a shm_mkstemp() call since release 5.4. You still have to shm_unlink() the temp file but at least it is easy to create safely.
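As an illustration of the Linux option from the list above, here is a minimal sketch built on memfd_create(). It is not the library's code: the function name anon_shm_linux() and the debug name "anon-shm" are made up for this example, and it assumes Linux 3.17 or later with a C library that exposes the memfd_create() wrapper (glibc has had it since 2.27).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for the memfd_create() declaration in glibc */
#endif
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, Linux only: create an anonymous memory-backed file
   of the given size. Returns the fd, or -1 on error. */
int anon_shm_linux(size_t size)
{
    /* The name is only a label shown under /proc; it does not enter any
       namespace and need not be unique. */
    int fd = memfd_create("anon-shm", MFD_CLOEXEC);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned descriptor can be passed to mmap() with MAP_SHARED, and no temp file ever has to be unlinked.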

For other OSes, I did the following:

  1. Figure out an OS-dependent format for the name argument of POSIX shm_open(). Please note that there is no name you can pass that is absolutely portable. For example, NetBSD and DragonFlyBSD have conflicting demands about slashes in the name. This applies even if your goal is to use a named shm object (for which the POSIX API was designed) instead of an anonymous one (as we are doing here).

  2. Append some random letters and numbers to the name (by reading from /dev/random). This is basically what mktemp() does, except we don't check whether our random name exists in the file system. The interpretation of the name argument varies wildly so there's no reasonable way to portably map it to an actual filename. Also Solaris doesn't always provide mktemp(). For all practical purposes, the randomness we put in will ensure a unique name for the fraction of a second that we need it.

  3. Open the shm object with that name via shm_open(name, O_RDWR | O_CREAT | O_EXCL | O_NOFOLLOW, 0600). In the astronomically unlikely event that our random filename already exists, O_EXCL will cause this call to fail anyway, so no harm done. The 0600 permissions (owner read-write) are necessary on some systems instead of blank 0 permissions.

  4. Immediately call shm_unlink() to get rid of the random name. The file descriptor remains for our use.

This technique is not guaranteed to work by POSIX, but:

  1. The shm_open() name argument is underspecified by POSIX so nothing else is guaranteed to work either.
  2. I'll let the above compatibility list speak for itself.

Enjoy.

answered Oct 07 '22 06:10 by Lassi


No, there isn't. Both the System V shared memory model and POSIX shared file mapping for IPC require operations on a file, so a file is always needed in order to do the mapping.

mmap() with MAP_ANON|MAP_SHARED is great but instead of a file descriptor it gives a pointer. The pointer doesn't survive over an exec boundary and can't be sent to another process over a Unix domain socket like file descriptors can.

As John Bollinger says,

Neither memory mappings created via mmap() nor POSIX shared-memory segments obtained via shm_open() nor System V shared-memory segments obtained via shmat() are preserved across an exec.

There must be a well-known place where the processes can meet and exchange information, which is why a file is required. With a file, the child is able to reconnect to the appropriate shared memory after exec.

answered Oct 07 '22 06:10 by snr