Can someone please summarize the events/steps that happen when I execute a read()/write() system call? How does the kernel know which file system to issue these commands to?
Let's say a process calls write(). That call ends up in sys_write().
Since sys_write() executes on behalf of the current process, it can presumably access that process's struct task_struct, and through it the struct files_struct and struct fs_struct, which contain file system information.
But beyond that, I don't see how this fs_struct helps to identify the file system.
Edit: Now that Alex has described the flow, I still have a doubt about how read/write calls get routed to a file system. Since the VFS does not do it, it must happen somewhere else. Also, how do the underlying block device and, finally, the hardware protocol (PCI/USB) get attached?
A simple flow chart involving the actual data structures would be helpful.
Please help.
read(): The read() system call accesses data from a file stored in the file system. The file is identified by its file descriptor, and it must be opened with open() before it can be read.
For issues such as “Why can't the software run on this machine,” strace is still a powerful system call tracer in Linux. But to trace the latency of system calls, the BPF-based perf-trace is a better option. In containers or K8s environments that use cgroup v2, traceloop is the easiest to use.
There are five system calls that generate file descriptors: creat, open, fcntl, dup and pipe.
This answer is based on kernel version 4.0. I traced out some of the code which handles a read syscall. I recommend you clone the Linux source repo and follow along in the source code.
1. The syscall handler for read, at fs/read_write.c:620, is called. It receives a file descriptor (an integer) as an argument and calls fdget_pos to convert it to a struct fd.
2. fdget_pos calls __fdget_pos, which calls __fdget, which calls __fget_light. __fget_light uses current->files, the file descriptor table for the current process, to look up the struct file which corresponds to the passed file descriptor number.
3. The handler then calls vfs_read, at fs/read_write.c:478.
4. vfs_read calls __vfs_read, which calls file->f_op->read. From here on, you are in filesystem-specific code.
So the VFS doesn't really bother "identifying" the filesystem which a file lives on; it simply uses the table of "file operation" function pointers which is stored in its struct file. When that struct file is initialized, it is given the correct f_op function pointer table, which implements all the filesystem-specific operations for its filesystem.