
Semantics of Linux O_PATH file descriptors?

Tags: c, linux, posix

Linux 2.6.39 introduced the O_PATH open mode, which (roughly speaking) doesn't really open the file at all (i.e. doesn't create an open file description), but just gives a file descriptor that's a handle to the unopened target. Its main use is as an argument to the *at functions (openat, etc.), and it seems suitable as an implementation of the POSIX 2008 O_SEARCH functionality, which Linux was previously missing. However, I've been unable to find any good documentation on the exact semantics of O_PATH. A couple of specific questions I have:

  1. What operations are possible on Linux O_PATH file descriptors? (Only *at functions?)
  2. Is O_PATH ever useful with non-directories?
  3. How is the file descriptor bound to the underlying filesystem object, and what happens if it's moved, deleted, etc.? Does an O_PATH file descriptor count as a reference that prevents the object from being freed when the last link is unlinked? Etc.
R.. GitHub STOP HELPING ICE asked Sep 14 '12 01:09



1 Answer

File descriptors obtained using open(directory, O_PATH | O_DIRECTORY) are useful not only with the ...at() functions, but also with fchdir() (since kernel version 3.2.23, I believe).
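As a rough illustration of those two uses, here is a minimal sketch. The helper names open_relative() and chdir_via_path_fd() are made up for this example, and the fallback O_PATH value is an assumption taken from the generic Linux ABI that only matters if the headers don't expose the flag. It assumes a kernel new enough for fchdir() to accept O_PATH descriptors.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes O_PATH only with this */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000        /* assumption: generic Linux ABI value */
#endif

/* Hypothetical helper: open `name` relative to `dir_path`, using an
 * O_PATH descriptor as the directory handle for openat().
 * Returns a read-only descriptor, or -1 with errno set. */
int open_relative(const char *dir_path, const char *name)
{
    assert(dir_path != NULL && name != NULL);

    int dirfd = open(dir_path, O_PATH | O_DIRECTORY);
    if (dirfd == -1)
        return -1;

    int fd = openat(dirfd, name, O_RDONLY);
    int saved_errno = errno;    /* close() may overwrite errno */
    close(dirfd);
    errno = saved_errno;
    return fd;
}

/* Hypothetical helper: change the working directory through an
 * O_PATH descriptor. Returns 0 on success, -1 with errno set. */
int chdir_via_path_fd(const char *dir_path)
{
    assert(dir_path != NULL);

    int dirfd = open(dir_path, O_PATH | O_DIRECTORY);
    if (dirfd == -1)
        return -1;

    int rc = fchdir(dirfd);
    int saved_errno = errno;
    close(dirfd);
    errno = saved_errno;
    return rc;
}
```

For instance, open_relative("/etc", "hostname") would open /etc/hostname for reading, provided that file exists.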

There is also a recent patch for a new syscall, fbind(), that would allow very long Unix domain socket names. The socket file is first created using mknod(path, mode | S_IFSOCK, (dev_t)0), then opened using open(file, O_PATH). The file descriptor thus obtained and a Unix domain socket descriptor are then passed to fbind(), which binds the socket to the pathname. Whether this will be included in the Linux kernel remains to be seen; even if it is, it will be years before one can rely on it being universally available. (As a workaround for too-long Unix domain socket names it would be viable sooner, though.)

I'd say O_PATH is only useful for directories for now; file uses may be found in the future. Other than the possibility of a future fbind(), or similar future syscalls, I don't know of any use of file descriptors for files opened using O_PATH. Even fstatvfs() won't work, on a 3.5.0 kernel at least.

In Linux, an inode (file contents and metadata) is freed only once its last name has been unlinked and the last open file descriptor referring to it has been closed. Removing (unlinking) a file only removes the file name associated with the inode. So there are two separate filesystem objects associated with a file descriptor: the name used to open the object, and the underlying inode it refers to. The name is used only for path resolution, i.e. when open() (or an equivalent) is called. All data and metadata are in the inode.
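The name/inode split can be seen with an ordinary (non-O_PATH) descriptor. The sketch below uses a helper name, read_after_unlink(), and a /tmp path template that are both invented for this example. It writes to a temporary file, unlinks its only name, and then reads the data back through the descriptor that is still open.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write "hello" to a temporary file, remove its name,
 * then read the contents back through the still-open descriptor.
 * Returns the number of bytes read, or -1 on error. *nlink receives the
 * link count reported by fstat() after the unlink. */
ssize_t read_after_unlink(char *buf, size_t len, nlink_t *nlink)
{
    assert(buf != NULL && len > 0 && nlink != NULL);

    char path[] = "/tmp/unlink-demo-XXXXXX";   /* assumed writable */
    struct stat st;
    ssize_t n = -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    if (write(fd, "hello", 5) != 5) {
        unlink(path);
        close(fd);
        return -1;
    }

    /* Remove the only name; the inode stays alive while fd is open. */
    if (unlink(path) == -1) {
        close(fd);
        return -1;
    }

    if (fstat(fd, &st) == 0 && lseek(fd, 0, SEEK_SET) == 0) {
        *nlink = st.st_nlink;
        n = read(fd, buf, len);
    }

    close(fd);                  /* last reference gone: inode is freed */
    return n;
}
```

On typical local filesystems the link count reported after the unlink should be 0, yet the read still returns the data; the storage is released only at the final close().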

File descriptors obtained using O_PATH behave (at least on kernel 3.5.0) just like normal file descriptors with respect to moving and renaming the name, or any of the name components, used to open the descriptor. (The descriptor stays valid, as it refers to the inode, and the file name object was used only during path resolution. Holding the descriptor open will keep the inode resources allocated, even if the descriptor was opened with O_PATH.)
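A small experiment along these lines might look like the following sketch. The function name opath_survives_rename(), the /tmp path template, and the fallback O_PATH value are assumptions made for the example. It opens a directory with O_PATH, renames the directory, and then checks that openat() through the old descriptor still finds a file inside it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes O_PATH only with this */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000        /* assumption: generic Linux ABI value */
#endif

/* Hypothetical demo: returns 0 if a file can still be opened relative
 * to an O_PATH directory descriptor after the directory was renamed,
 * -1 otherwise. */
int opath_survives_rename(void)
{
    char base[] = "/tmp/opath-demo-XXXXXX";    /* assumed writable */
    char old_dir[64], new_dir[64], file[64];
    int dirfd = -1, fd = -1, result = -1;

    if (mkdtemp(base) == NULL)
        return -1;

    snprintf(old_dir, sizeof old_dir, "%s/old", base);
    snprintf(new_dir, sizeof new_dir, "%s/new", base);
    int n = snprintf(file, sizeof file, "%s/old/data", base);
    assert(n > 0 && (size_t)n < sizeof file);  /* longest path fits */

    if (mkdir(old_dir, 0700) == -1)
        goto cleanup;
    fd = open(file, O_CREAT | O_WRONLY, 0600);
    if (fd == -1)
        goto cleanup;
    close(fd);
    fd = -1;

    /* Take the handle first, then move the directory out from under it. */
    dirfd = open(old_dir, O_PATH | O_DIRECTORY);
    if (dirfd == -1)
        goto cleanup;
    if (rename(old_dir, new_dir) == -1)
        goto cleanup;

    /* The old name is gone, but the descriptor still refers to the inode. */
    fd = openat(dirfd, "data", O_RDONLY);
    if (fd != -1)
        result = 0;

cleanup:
    if (fd != -1)
        close(fd);
    if (dirfd != -1) {
        unlinkat(dirfd, "data", 0);
        close(dirfd);
    } else {
        unlink(file);
    }
    rmdir(new_dir);
    rmdir(old_dir);
    rmdir(base);
    return result;
}
```

A return value of 0 would match the behaviour described above: the rename changes only the name, while the descriptor keeps pointing at the same directory inode.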

Nominal Animal answered Sep 27 '22 21:09
