What throttles the fwrite() calls to a full disk on linux?

Tags:

It appears that if my program attempts to write to a full file system it it will initially error with 'No space left on device' nearly instantly, but if I leave it running for a minute or so it slows down by a couple orders of magnitude.

Note this is a minimum test case, this behaviour was first noticed in a Java logging framework, hence the small (512 byte) output chunks and a lot of them.

main.c

#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>
#include <sys/time.h>


long microTime() {
    struct timeval time;
    gettimeofday(&time, NULL);

    return time.tv_sec * 1000 * 1000 + time.tv_usec;
}

int main(int argc, const char * argv[]) {

    char *payload ;
    payload = (char *)malloc(512 * sizeof(char));

    if (payload == NULL) {
        printf("Failed to alloc memory\n");
        exit(1);
    }

    FILE *fp;
    fp = fopen(argv[1], "a+");

    printf("opened [%s]\n", argv[1]);

    if (fp == NULL) {
        printf("Failed to open [%s]\n", argv[1]);
        exit(1);
    }

    for (;;) {
        int batchSize = 100000;
        bool errored = false;
        long runStart = microTime();
        for (int i = 0; i < batchSize; i ++) {
            size_t result = fwrite(payload, 512 * sizeof(char), 1, fp);
            if (result == 0 && !errored) {
                perror("Failed to write to disk");
                errored = true;
            }

        }
        long runEnd = microTime();
        printf("total elapsed %dms\n", (int)(runEnd - runStart) / 1000);
    }

    return 0;
}

(please excuse my C, this is probably the first C program I've written in nearly 20 years)

Running:

gcc -std=c99 main.c && ./a.out /path/to/somewhere/file1.bin

Warning this program will fill your disk partition

output:

total elapsed 42ms
total elapsed 105ms
total elapsed 104ms
total elapsed 125ms
... skip until the disk fills
Failed to write to disk: No space left on device
total elapsed 104ms
Failed to write to disk: No space left on device
total elapsed 76ms
Failed to write to disk: No space left on device
total elapsed 84ms
Failed to write to disk: No space left on device
... then skip a little more about one minute
total elapsed 8096ms
Failed to write to disk: No space left on device
total elapsed 43245ms
Failed to write to disk: No space left on device
total elapsed 48670ms
Failed to write to disk: No space left on device
total elapsed 45929ms
Failed to write to disk: No space left on device

My expectation is that this program would run for ever with a fairly low constant time to write to disk.

I've run this on a local centos 6.4 vagrant box, an amazon linux and an ubuntu 14.04 ec2 boxes with exactly the same results. Interestingly it doesn't seem to happen on OSX 10.9.5 attempting to fill a disk image.

So my question is really, what is causing this apparent throttling?

update: Run with strace -t -T

10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <0.000011>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <0.000011>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000066>
10:27:38 dup(2)                         = 4 <0.000006>
10:27:38 fcntl(4, F_GETFL)              = 0x8002 (flags O_RDWR|O_LARGEFILE) <0.000011>
10:27:38 fstat(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.000006>
10:27:38 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa8e97f2000 <0.000009>
10:27:38 lseek(4, 0, SEEK_CUR)          = -1 ESPIPE (Illegal seek) <0.000005>
10:27:38 write(4, "Failed to write to disk: No spac"..., 49Failed to write to disk: No space left on device
) = 49 <0.000006>
10:27:38 close(4)                       = 0 <0.000006>
10:27:38 munmap(0x7fa8e97f2000, 4096)   = 0 <0.000015>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000026>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000017>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000016>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000016>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000015>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000016>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000015>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000015>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000016>
10:27:38 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.000016>

... skipping to the end
10:30:02 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.005747>
10:30:02 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.005231>
10:30:02 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.005496>
10:30:02 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.005870>
10:30:02 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.005823>
10:30:02 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = -1 ENOSPC (No space left on device) <0.005841>

not the call times to write() go from ~0.00001s => ~0.005s by the end of the run.

The full run.log is 250mb

update 2: Adding CPU usage detail for the various phases:

Staring the run, ie filling the disk

top - 11:00:56 up  2:52,  2 users,  load average: 1.79, 0.99, 0.48
Tasks:  75 total,   3 running,  72 sleeping,   0 stopped,   0 zombie
Cpu(s):  7.3%us, 71.8%sy,  0.0%ni,  0.0%id,  0.0%wa, 14.6%hi,  6.3%si,  0.0%st
Mem:    603764k total,   561868k used,    41896k free,   134976k buffers
Swap:  1254392k total,        0k used,  1254392k free,   359020k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9861 vagrant   20   0  4056  500  412 R 54.5  0.1   0:01.88 a.out
  766 root      20   0     0    0    0 R 36.9  0.0   0:28.51 flush-8:0
   28 root      20   0     0    0    0 S  5.6  0.0   0:05.14 kswapd0
   16 root      20   0     0    0    0 S  2.3  0.0   4:51.07 kblockd/0

Disk full, fast erroring

top - 11:01:11 up  2:52,  2 users,  load average: 1.68, 1.01, 0.49
Tasks:  75 total,   2 running,  73 sleeping,   0 stopped,   0 zombie
Cpu(s):  9.0%us, 91.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    603764k total,   555424k used,    48340k free,   134976k buffers
Swap:  1254392k total,        0k used,  1254392k free,   352224k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9861 vagrant   20   0  4056  552  464 R 99.5  0.1   0:12.91 a.out
  988 root      20   0  215m 1572  860 S  0.3  0.3   0:05.05 VBoxService
    1 root      20   0 19228 1348 1072 S  0.0  0.2   0:00.24 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd

Slow erroring

top - 11:03:03 up  2:54,  2 users,  load average: 1.63, 1.14, 0.59
Tasks:  74 total,   3 running,  71 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.4%sy,  0.0%ni,  0.0%id, 98.8%wa,  0.4%hi,  0.4%si,  0.0%st
Mem:    603764k total,   555284k used,    48480k free,   134976k buffers
Swap:  1254392k total,        0k used,  1254392k free,   352308k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  216 root      20   0     0    0    0 R  3.7  0.0   4:50.72 jbd2/sda1-8
   16 root      20   0     0    0    0 R  3.3  0.0   4:53.04 kblockd/0
 9861 vagrant   20   0  4056  552  464 D  1.3  0.1   1:17.47 a.out
    1 root      20   0 19228 1348 1072 S  0.0  0.2   0:00.24 init

589

asked Jan 19 '15 09:01

Gareth Davis

1 Answers

You may be observing the effects of Linux virtual memory when write system calls block and become synchronous. This is because your application continuously generates data to be written to disk and the disk cannot store the data as fast.

Alternatively, the background flush out kicks in sometime after the disk was filled, so that write syscalls compete for a access with the kernel flush out thread to the filesystem internal structures, so that it takes longer to return ENOSPC.

See Better Linux Disk Caching & Performance with vm.dirty_ratio & vm.dirty_background_ratio:

vm.dirty_background_ratio is the percentage of system memory that can be filled with “dirty” pages — memory pages that still need to be written to disk — before the pdflush/flush/kdmflush background processes kick in to write it to disk. My example is 10%, so if my virtual server has 32 GB of memory that’s 3.2 GB of data that can be sitting in RAM before something is done.

vm.dirty_ratio is the absolute maximum amount of system memory that can be filled with dirty pages before everything must get committed to disk. When the system gets to this point all new I/O blocks until dirty pages have been written to disk. This is often the source of long I/O pauses, but is a safeguard against too much data being cached unsafely in memory.

answered Oct 13 '22 23:10

Maxim Egorushkin

Related questions
                            
                                Are C/C++/ObjC/Swift/JS Apple's only allowed languages for iPhone development?
                            
                                How to fill polygon with different color than boundary?
                            
                                Type of unsigned bit-fields: int or unsigned int
                            
                                Not receiving SIGCHLD for processes executed with sudo
                            
                                String array not deallocated upon scope exit
                            
                                Implementation of MVar in C?
                            
                                Change the Theme of Doxygen?
                            
                                Interpreting strace output
                            
                                How to use Callback function to get language in Tizen?
                            
                                AST for multiple source files with clang
                            
                                Qemu, div by zero, mxcsr register
                            
                                What is the correct way to build a thread-safe, multiplatform C library?
                            
                                Is there a realloc equivalent in C++?
                            
                                C PHP Extension object persistance
                            
                                evaluating/accessing a structure
                            
                                What's wrong with using associativity by compilers?
                            
                                LLVM - Run Own Pass automatically with clang
                            
                                Importing a PKCS12 Using SecItemImport
                            
                                Is there a particular reason for memmem being a GNU extension?
                            
                                Why is isascii() deprecated?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

What throttles the fwrite() calls to a full disk on linux?

Tags:

c

linux

Gareth Davis

People also ask

1 Answers

Maxim Egorushkin

Recent Activity

Donate For Us