
Non-blocking access to the file system

When writing a non-blocking program (handling multiple sockets) which at a certain point needs to open files using open(2), stat(2) files or open directories using opendir(2), how can I ensure that the system calls do not block?

To me it seems that there's no other alternative than using threads or fork(2).

watain asked Jan 08 '13 08:01



2 Answers

As Mel Nicholson replied, anything based on a pollable file descriptor (sockets, pipes and the like) can be handled with select/poll/epoll. For everything else, including open(2), stat(2) and opendir(2), you can use a proxy thread per operation (or a thread pool) with a small stack. With the kernel scheduler doing the work, the proxy turns any synchronous blocking wait into an asynchronous event that select/poll/epoll can watch, signalled through an eventfd or, where portability is required, a Unix pipe.
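A caveat on "file descriptor based": POSIX specifies that regular files always select as ready for reading and writing, so readiness notification says nothing about whether a disk read will block. The following is a minimal sketch of that check, assuming a Unix-like system; the function name `regular_file_always_ready` is made up for illustration.

```python
import select
import tempfile


def regular_file_always_ready():
    """Return True if select() reports a regular file as readable and writable."""
    # TemporaryFile gives us an (empty) regular file on disk.
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # Timeout 0 means "poll once and return immediately".
        readable, writable, _ = select.select([fd], [fd], [], 0)
        return fd in readable and fd in writable
```

Because the descriptor is reported ready even though nothing has been read from disk, waiting on it in an event loop does not help, which is why the file system calls end up in the "everything else" category.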

The proxy thread blocks until the operation completes, then writes to the eventfd or the pipe to wake up the select/poll/epoll loop.
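The proxy-thread idea can be sketched as follows. This is only an illustration under stated assumptions: it uses a Unix pipe rather than an eventfd, starts one thread per call rather than drawing from a pool, and the names `_proxy_stat` and `stat_via_proxy` are hypothetical.

```python
import os
import selectors
import threading


def _proxy_stat(path, wakeup_fd, results):
    """Proxy thread: perform the blocking stat(2), then wake the event loop."""
    try:
        results[path] = os.stat(path)
    except OSError as exc:
        results[path] = exc
    # One byte makes the pipe's read end readable for select/poll/epoll.
    os.write(wakeup_fd, b"\0")


def stat_via_proxy(path):
    """Event-loop side: wait for the wake-up instead of calling stat directly."""
    read_fd, write_fd = os.pipe()
    results = {}
    sel = selectors.DefaultSelector()
    sel.register(read_fd, selectors.EVENT_READ)
    worker = threading.Thread(target=_proxy_stat, args=(path, write_fd, results))
    worker.start()
    try:
        done = False
        while not done:
            # A real server would have its sockets registered here as well
            # and would keep serving them while the stat is in flight.
            for key, _ in sel.select():
                if key.fd == read_fd:
                    os.read(read_fd, 1)  # drain the wake-up byte
                    done = True
    finally:
        worker.join()
        sel.close()
        os.close(read_fd)
        os.close(write_fd)
    return results[path]
```

Calling `stat_via_proxy(".")` returns an `os.stat_result`, while a missing path returns the `OSError` raised inside the proxy thread. The loop itself only ever waits in `select`.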

bobah answered Oct 11 '22 08:10



Indeed there is no other method.

In fact there is another kind of blocking that only threads can deal with: page faults. They can happen in program code, program data, memory allocation, or data mapped from files. They are almost impossible to avoid (you can lock some pages into memory, but that is a privileged operation and would probably backfire by making the kernel do a poor job of memory management somewhere else). So:

  1. You can't really weed out every last chance of blocking for a particular client, so don't bother with the likes of open and stat. The network will probably add larger delays than these functions anyway.
  2. For optimal performance you should have enough threads that some can be scheduled while the others are blocked on a page fault or a similarly unavoidable blocking point.

Also, if you need to read and process, or process and write, data while handling a network request, it's faster to access the file through memory-mapping, but that is blocking and can't be made non-blocking. So modern network servers tend to stick with blocking calls for most things and simply run enough threads to keep the CPU busy while other threads wait for I/O.
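As a rough sketch of that structure, the workers below use plain blocking calls and a memory mapping, and a thread pool supplies the concurrency. The function names and the default of 8 workers are assumptions made for illustration, not tuned values, and in CPython the GIL limits CPU parallelism, so this shows the shape of the approach rather than its performance.

```python
import mmap
import os
import zlib
from concurrent.futures import ThreadPoolExecutor


def checksum_mapped(path):
    """Blocking worker: map the file and checksum it straight from the mapping."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return zlib.crc32(b"")  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Reading the mapping may page-fault; only this thread waits.
            return zlib.crc32(m)


def checksum_many(paths, workers=8):
    """Run the blocking workers on a pool so others proceed while one waits."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(checksum_mapped, paths))
```

`checksum_many(paths)` returns the CRC-32 values in the same order as `paths`. No call is made non-blocking; a thread stalled on disk or on a page fault simply leaves the remaining threads free to run.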

The fact that most modern servers are multi-core is another reason why you need multiple threads anyway.

Jan Hudec answered Oct 11 '22 09:10
