I have a test program. It takes about 37 seconds on Linux kernel 3.1.*, but only about 1 second on kernel 3.0.18 (I only replaced the kernel; the machine is the same as before). Please give me a clue on how to improve it on kernel 3.1. Thanks!
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int my_fsync(int fd)
{
    // return fdatasync(fd);
    return fsync(fd);
}

int main(int argc, char **argv)
{
    int rc = 0;
    int count;
    int i;
    char oldpath[1024];
    char newpath[1024];
    char *writebuffer = calloc(1024, 1);

    if (writebuffer == NULL) {
        perror("calloc failed");
        exit(1);
    }
    snprintf(oldpath, sizeof(oldpath), "./%s", "foo");
    snprintf(newpath, sizeof(newpath), "./%s", "foo.new");
    /* 1000 rounds of: write 10 KiB to foo.new, fsync, rename over foo */
    for (count = 0; count < 1000; ++count) {
        int fd = open(newpath, O_CREAT | O_TRUNC | O_WRONLY, S_IRWXU);
        if (fd == -1) {
            fprintf(stderr, "open error! path: %s\n", newpath);
            exit(1);
        }
        for (i = 0; i < 10; i++) {
            rc = write(fd, writebuffer, 1024);
            if (rc != 1024) {
                fprintf(stderr, "underwrite!\n");
                exit(1);
            }
        }
        /* perror appends ": <error text>" itself, so no newline here */
        if (my_fsync(fd)) {
            perror("fsync failed");
            exit(1);
        }
        if (close(fd)) {
            perror("close failed");
            exit(1);
        }
        if (rename(newpath, oldpath)) {
            perror("rename failed");
            exit(1);
        }
    }
    free(writebuffer);
    return 0;
}
# strace -c ./testfsync
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.58    0.068004          68      1000           fsync
  0.84    0.000577           0     10001           write
  0.40    0.000275           0      1000           rename
  0.19    0.000129           0      1003           open
  0.00    0.000000           0         1           read
  0.00    0.000000           0      1003           close
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         2           setitimer
  0.00    0.000000           0        68           sigreturn
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         1           mprotect
  0.00    0.000000           0         2           writev
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         6           mmap2
  0.00    0.000000           0         2           fstat64
  0.00    0.000000           0         1           set_thread_area
------ ----------- ----------- --------- --------- ----------------
100.00    0.068985                 14099         1 total
Kernel 3.1.* is actually doing the sync; 3.0.18 is faking it. Your code performs 1,000 synchronous flushes, one fsync() per loop iteration. Because you truncate the file every time, each iteration also grows the file again, so each flush has to commit both the data and the updated metadata. That makes roughly 2,000 physical write operations. Typical hard drive write latency is about 20 milliseconds per I/O, so 2,000 * 20 = 40,000 milliseconds, or 40 seconds. That is close to the 37 seconds you measured, assuming you're writing to a typical hard drive.
Basically, by syncing after every file you write, you give the kernel no chance to cache or overlap the writes efficiently, and you force worst-case behavior on every operation. The hard drive also ends up seeking back and forth between where the data is written and where the metadata is written, once for each write.
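If your application can tolerate it, one way to reduce the cost is to sync less often, so the kernel can batch several writes into one flush. The sketch below is an illustration under that assumption, not a drop-in fix for the write-and-rename pattern in the question: write_batched and its sync_every parameter are hypothetical names. Note the trade-off: anything written since the last fsync() can be lost if the machine crashes.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: appends `records` blocks of 1 KiB to `path`, calling
 * fsync() only after every `sync_every` blocks, plus once at the end if any
 * blocks are still unsynced. Returns the number of fsync() calls made, or
 * -1 on error. */
int write_batched(const char *path, int records, int sync_every)
{
    char buf[1024] = {0};
    int syncs = 0;
    int pending = 0;   /* blocks written since the last fsync() */
    int fd, i;

    if (records < 0 || sync_every <= 0)
        return -1;
    fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    for (i = 0; i < records; i++) {
        if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
            goto fail;
        if (++pending == sync_every) {
            if (fsync(fd) != 0)
                goto fail;
            syncs++;
            pending = 0;
        }
    }
    /* Flush whatever is left over from the last partial batch. */
    if (pending > 0) {
        if (fsync(fd) != 0)
            goto fail;
        syncs++;
    }
    if (close(fd) != 0)
        return -1;
    return syncs;

fail:
    close(fd);
    return -1;
}
```

With sync_every set to 1 this behaves like syncing after every block; with sync_every set to 100, the same 1,000 blocks need only 10 flushes. Whether that is acceptable depends on how much recent data you can afford to lose.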