I've run into the need to be able refer to a directory by path given its file descriptor in Linux. The path doesn't have to be canonical, it just has to be functional so that I can pass it to other functions. So, taking the same parameters as passed to a function like fstatat()
, I need to be able to call a function like getxattr()
which doesn't have a f-XYZ-at()
variant.
So far I've come up with these solutions; though none are particularly elegant.
The simplest solution is to avoid the problem by calling openat()
and then using a function like fgetxattr()
. This works, but not in every situation. So another method is needed to fill the gaps.
The next solution involves looking up the information in proc:
if (!access("/proc/self/fd",X_OK)) {
sprintf(path,"/proc/self/fd/%i/",fd);
}
This, of course, totally breaks on systems without proc, including some chroot environments.
The last option, a more portable but potentially-race-condition-prone solution, looks like this:
DIR* save = opendir(".");
fchdir(fd);
getcwd(path,PATH_MAX);
fchdir(dirfd(save));
closedir(save);
The obvious problem here is that in a multithreaded app, changing the working directory around could have side effects.
However, the fact that it works is compelling: if I can get the path of a directory by calling fchdir()
followed by getcwd()
, why shouldn't I be able to just get the information directly: fgetcwd()
or something. Clearly the kernel is tracking the necessary information.
So how do I get to it?
The way Linux implements getcwd
in the kernel is this: it starts at the directory entry in question and prepends the name of the parent of that directory to the path string, and repeats that process until it reaches the root. This same mechanism can be theoretically implemented in user-space.
Thanks to Jonathan Leffler for pointing this algorithm out. Here is a link to the kernel implementation of this function: https://github.com/torvalds/linux/blob/v3.4/fs/dcache.c#L2577
You can use readlink on /proc/self/fd/NNN where NNN is the file descriptor. This will give you the name of the file as it was when it was opened — however, if the file was moved or deleted since then, it may no longer be accurate (although Linux can track renames in some cases).
A file descriptor is an unsigned integer used by a process to identify an open file. The number of file descriptors available to a process is limited by the /OPEN_MAX control in the sys/limits. h file. The number of file descriptors is also controlled by the ulimit -n flag.
The kernel thinks of directories differently from the way you do - it thinks in terms of inode numbers. It keeps a record of the inode number (and device number) for the directory, and that is all it needs as the current directory. The fact that you sometimes specify a name to it means it goes and tracks down the inode number corresponding to that name, but it preserves only the inode number because that's all it needs.
So, you will have to code a suitable function. You can open a directory directly with open()
precisely to get a file descriptor that can be used by fchdir()
; you can't do anything else with it on many modern systems. You can also fail to open the current directory; you should be testing that result. The circumstances where this happens are rare, but not non-existent. (A SUID program might chdir()
to a directory that the SUID privileges permit, but then drop the SUID privileges leaving the process unable to read the directory; the getcwd()
call will fail in such circumstances too - so you must error check that, too!) Also, if a directory is removed while your (possibly long-running) process has it open, then a subsequent getcwd()
will fail.
Always check results from system calls; there are usually circumstances where they can fail, even though it is dreadfully inconvenient of them to do so. There are exceptions - getpid()
is the canonical example - but they are few and far between. (OK: not all that far between - getppid()
is another example, and it is pretty darn close to getpid()
in the manual; and getuid()
and relatives are also not far off in the manual.)
Multi-threaded applications are a problem; using chdir()
is not a good idea in those. You might have to fork()
and have the child evaluate the directory name, and then somehow communicate that back to the parent.
bignose asks:
This is interesting, but seems to go against the querent's reported experience: that getcwd knows how to get the path from the fd. That indicates that the system knows how to go from fd to path in at least some situations; can you edit your answer to address this?
For this, it helps to understand how - or at least one mechanism by which - the getcwd()
function can be written. Ignoring the issue of 'no permission', the basic mechanism by which it works is:
Here is an implementation of that algorithm. It is old code (originally 1986; the last non-cosmetic changes were in 1998) and doesn't make use of fchdir()
as it should. It also works horribly if you have NFS automounted file systems to traverse - which is why I don't use it any more. However, this is roughly equivalent to the basic scheme used by getcwd()
. (Ooh; I see a 18 character string ("../123456789.abcd") - well, back when it was written, the machines I worked on only had the very old 14-character only filenames - not the modern flex names. Like I said, it is old code! I haven't seen one of those file systems in what, 15 years or so - maybe longer. There is also some code to mess with longer names. Be cautious using this.)
/*
@(#)File: $RCSfile: getpwd.c,v $
@(#)Version: $Revision: 2.5 $
@(#)Last changed: $Date: 2008/02/11 08:44:50 $
@(#)Purpose: Evaluate present working directory
@(#)Author: J Leffler
@(#)Copyright: (C) JLSS 1987-91,1997-98,2005,2008
@(#)Product: :PRODUCT:
*/
/*TABSTOP=4*/
#define _POSIX_SOURCE 1
#include "getpwd.h"
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#if defined(_POSIX_SOURCE) || defined(USG_DIRENT)
#include "dirent.h"
#elif defined(BSD_DIRENT)
#include <sys/dir.h>
#define dirent direct
#else
What type of directory handling do you have?
#endif
#define DIRSIZ 256
typedef struct stat Stat;
static Stat root;
#ifndef lint
/* Prevent over-aggressive optimizers from eliminating ID string */
const char jlss_id_getpwd_c[] = "@(#)$Id: getpwd.c,v 2.5 2008/02/11 08:44:50 jleffler Exp $";
#endif /* lint */
/* -- Routine: inode_number */
static ino_t inode_number(char *path, char *name)
{
ino_t inode;
Stat st;
char buff[DIRSIZ + 6];
strcpy(buff, path);
strcat(buff, "/");
strcat(buff, name);
if (stat(buff, &st))
inode = 0;
else
inode = st.st_ino;
return(inode);
}
/*
-- Routine: finddir
Purpose: Find name of present working directory
Given:
In: Inode of current directory
In: Device for current directory
Out: pathname of current directory
In: Length of buffer for pathname
Maintenance Log
---------------
10/11/86 JL Original version stabilised
25/09/88 JL Rewritten to use opendir/readdir/closedir
25/09/90 JL Modified to pay attention to length
10/11/98 JL Convert to prototypes
*/
static int finddir(ino_t inode, dev_t device, char *path, size_t plen)
{
register char *src;
register char *dst;
char *end;
DIR *dp;
struct dirent *d_entry;
Stat dotdot;
Stat file;
ino_t d_inode;
int status;
static char name[] = "../123456789.abcd";
char d_name[DIRSIZ + 1];
if (stat("..", &dotdot) || (dp = opendir("..")) == 0)
return(-1);
/* Skip over "." and ".." */
if ((d_entry = readdir(dp)) == 0 ||
(d_entry = readdir(dp)) == 0)
{
/* Should never happen */
closedir(dp);
return(-1);
}
status = 1;
while (status)
{
if ((d_entry = readdir(dp)) == 0)
{
/* Got to end of directory without finding what we wanted */
/* Probably a corrupt file system */
closedir(dp);
return(-1);
}
else if ((d_inode = inode_number("..", d_entry->d_name)) != 0 &&
(dotdot.st_dev != device))
{
/* Mounted file system */
dst = &name[3];
src = d_entry->d_name;
while ((*dst++ = *src++) != '\0')
;
if (stat(name, &file))
{
/* Can't stat this file */
continue;
}
status = (file.st_ino != inode || file.st_dev != device);
}
else
{
/* Ordinary directory hierarchy */
status = (d_inode != inode);
}
}
strncpy(d_name, d_entry->d_name, DIRSIZ);
closedir(dp);
/**
** NB: we have closed the directory we are reading before we move out of it.
** This means that we should only be using one extra file descriptor.
** It also means that the space d_entry points to is now invalid.
*/
src = d_name;
dst = path;
end = path + plen;
if (dotdot.st_ino == root.st_ino && dotdot.st_dev == root.st_dev)
{
/* Found root */
status = 0;
if (dst < end)
*dst++ = '/';
while (dst < end && (*dst++ = *src++) != '\0')
;
}
else if (chdir(".."))
status = -1;
else
{
/* RECURSE */
status = finddir(dotdot.st_ino, dotdot.st_dev, path, plen);
(void)chdir(d_name); /* We've been here before */
if (status == 0)
{
while (*dst)
dst++;
if (dst < end)
*dst++ = '/';
while (dst < end && (*dst++ = *src++) != '\0')
;
}
}
if (dst >= end)
status = -1;
return(status);
}
/*
-- Routine: getpwd
Purpose: Evaluate name of current directory
Maintenance Log
---------------
10/11/86 JL Original version stabilised
25/09/88 JL Short circuit if pwd = /
25/09/90 JL Revise interface; check length
10/11/98 JL Convert to prototypes
Known Bugs
----------
1. Uses chdir() and could possibly get lost in some other directory
2. Can be very slow on NFS with automounts enabled.
*/
char *getpwd(char *pwd, size_t plen)
{
int status;
Stat here;
if (pwd == 0)
pwd = malloc(plen);
if (pwd == 0)
return (pwd);
if (stat("/", &root) || stat(".", &here))
status = -1;
else if (root.st_ino == here.st_ino && root.st_dev == here.st_dev)
{
strcpy(pwd, "/");
status = 0;
}
else
status = finddir(here.st_ino, here.st_dev, pwd, plen);
if (status != 0)
pwd = 0;
return (pwd);
}
#ifdef TEST
#include <stdio.h>
/*
-- Routine: main
Purpose: Test getpwd()
Maintenance Log
---------------
10/11/86 JL Original version stabilised
25/09/90 JL Modified interface; use GETCWD to check result
*/
int main(void)
{
char pwd[512];
int pwd_len;
if (getpwd(pwd, sizeof(pwd)) == 0)
printf("GETPWD failed to evaluate pwd\n");
else
printf("GETPWD: %s\n", pwd);
if (getcwd(pwd, sizeof(pwd)) == 0)
printf("GETCWD failed to evaluate pwd\n");
else
printf("GETCWD: %s\n", pwd);
pwd_len = strlen(pwd);
if (getpwd(pwd, pwd_len - 1) == 0)
printf("GETPWD failed to evaluate pwd (buffer is 1 char short)\n");
else
printf("GETPWD: %s (but should have failed!!!)\n", pwd);
return(0);
}
#endif /* TEST */
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With