
write() call failed: No space left on device (ENOSPC handling)

My write() call is failing with errno = 28 (ENOSPC, no space left on device). I am trying to handle this error as follows: when the disk is full, I call lseek() to move the file offset back to the beginning of the file.

I expected write() to succeed after that, since the file would be overwritten from the top and would not expand. But write() still fails with the same error. Please explain this behaviour.

  if(errno == ENOSPC)
  {
      curPos = lseek(gi4LogFd, 0, SEEK_SET);
      break;
  }
Sumit Trehan asked Dec 25 '22 23:12


2 Answers

Just because you write to the beginning of the file doesn't mean that the file system will be writing to the same space on the disk or that the space at the beginning of the file is allocated at all.

You could have a hole in the file, in which case the write will fail anyway. Holes are an optimization in many filesystems: the filesystem pretends that a piece of the file is there when it is really just a run of zeros, so those parts never get written to disk. Only the bookkeeping records that a particular part of the file is empty, which means that writing into a hole needs newly allocated blocks just like appending does.
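To check whether a log file has holes, one rough approach is to compare the space actually allocated to it with its length. The sketch below is only illustrative: the helper names looks_sparse() and make_sparse_file() are made up for this example, and the 512-byte unit for st_blocks is the Linux convention. On filesystems with compression or inline data the same comparison can report "sparse" for a file that has no holes, so treat the result as a hint.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fewer bytes are allocated than the
 * file's length implies (a hint that it has holes), 0 if not, -1 on error.
 * Assumes st_blocks is counted in 512-byte units, as on Linux. */
int looks_sparse(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return ((long long)st.st_blocks * 512 < (long long)st.st_size) ? 1 : 0;
}

/* Hypothetical helper: creates a file with a hole by seeking 1 MiB past
 * the start and writing a single byte. Returns 0 on success, -1 on error. */
int make_sparse_file(const char *path)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    if (lseek(fd, 1024 * 1024, SEEK_SET) < 0 || write(fd, "x", 1) != 1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling make_sparse_file() and then looks_sparse() on the same path would normally report a hole on filesystems that support sparse files.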

You could have overcommitted data to your filesystem (many filesystems don't actually allocate space on the disk until the data is flushed from the buffer cache which could be several seconds, if not minutes after the write is done), in which case the write will fail anyway. The ENOSPC you're getting could actually be because you already filled your filesystem to over 100% of the capacity and the filesystem code didn't discover it until it tried to flush a write you did a while ago.

You could be on a logging/journaling filesystem where the actual block allocation doesn't happen until the log is flushed, in which case the write will fail. Same logic as the buffer cache situation.

You could have run out of some particular preallocated metadata on the filesystem and it will fail with ENOSPC even though it's not even nearly full. This is not nearly as common today as in the past.

Your disk might have discovered that some part of it went bad and told the filesystem to not use those blocks and that took up space.

In short, there's no guarantee that a full filesystem will behave the way we might naively expect. There are other reasons besides this to never fill a filesystem above 95%. Almost all filesystems are notoriously nondeterministic when nearly full.
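If the goal is to stay below a threshold such as the 95% mentioned above, a logger could check usage before it writes instead of reacting to ENOSPC afterwards. This is a minimal sketch using POSIX statvfs(); the function names fs_used_percent() and fs_has_headroom() are hypothetical, and the percentage is computed the way df usually reports it (used blocks relative to used plus blocks available to unprivileged users). Because of delayed allocation, the numbers can lag behind recent writes, so this is a best-effort check only.

```c
#include <sys/statvfs.h>

/* Hypothetical helper: percentage of the filesystem holding `path` that
 * is in use from an unprivileged user's point of view, or -1.0 on error. */
double fs_used_percent(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return -1.0;

    double used = (double)vfs.f_blocks - (double)vfs.f_bfree;
    double usable = used + (double)vfs.f_bavail;

    if (usable <= 0.0)
        return -1.0;   /* pseudo-filesystem or nothing usable */
    return 100.0 * used / usable;
}

/* Hypothetical helper: 1 if usage is known and at or below the limit. */
int fs_has_headroom(const char *path, double max_used_percent)
{
    double used = fs_used_percent(path);

    return used >= 0.0 && used <= max_used_percent;
}
```

A logger might call fs_has_headroom("/var/log", 95.0) before each batch of writes and rotate or drop messages when it returns 0; the path here is just an example.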

Art answered Dec 29 '22 11:12


Just because you are seeking to the beginning of the file doesn't mean that the file is truncated. It's possible to do random writes on a file.

Overwriting a block on a full file system can still fail. If you want to truncate the file, call truncate(2) or ftruncate(2) on it BEFORE the lseek().

Try:

    if(errno == ENOSPC) {
        ftruncate(gi4LogFd, 0);
        lseek(gi4LogFd, 0, SEEK_SET);
        break;
    }
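A slightly fuller version of the same idea, with return values checked, might look like the sketch below. The names write_all() and log_write() are made up for illustration. It truncates the log and retries once when a write fails with ENOSPC. Even then the retry is not guaranteed to succeed, since freed blocks may not be reusable immediately on every filesystem.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes all len bytes, handling short writes and
 * EINTR. Returns 0 on success, -1 on error with errno set by write(). */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: on ENOSPC, throw away the current log contents,
 * rewind to offset 0 and retry the whole record once. */
int log_write(int fd, const char *buf, size_t len)
{
    if (write_all(fd, buf, len) == 0)
        return 0;
    if (errno != ENOSPC)
        return -1;

    if (ftruncate(fd, 0) < 0)          /* release the file's blocks */
        return -1;
    if (lseek(fd, 0, SEEK_SET) < 0)    /* continue from the start */
        return -1;
    return write_all(fd, buf, len);
}
```

Note that ftruncate() does not move the file offset, which is why the lseek() is still needed unless the descriptor was opened with O_APPEND.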

Okay, so in my test an ext3 filesystem with journaling support does not show this problem when the filesystem is full:

Setup:

Create an image file:

   dd if=/dev/zero of=/tmp/img.dsk count=8192

Create an ext3 filesystem on the 4 MiB image file (8192 blocks of dd's default 512 bytes), mount it, and fill it up:

    mkfs.ext3 /tmp/img.dsk
    sudo mount /tmp/img.dsk /mnt/internal
    sudo chown masud.users /mnt/internal

    touch /mnt/internal/file.bin

    sudo dd if=/dev/urandom of=/mnt/internal/file.bin

Here sudo is necessary for dd so that the blocks reserved for the superuser are filled up as well.
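The superuser reserve can be observed from a program as the difference between the free blocks and the blocks available to unprivileged users. The following is a small sketch using POSIX statvfs(); fs_reserved_blocks() is a hypothetical name, and the result is in units of the filesystem's fragment size (f_frsize).

```c
#include <sys/statvfs.h>

/* Hypothetical helper: number of free blocks that only the superuser may
 * still allocate on the filesystem holding `path`, or -1 on error. */
long long fs_reserved_blocks(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return -1;
    if (vfs.f_bfree < vfs.f_bavail)    /* guard against unsigned wrap */
        return 0;
    return (long long)(vfs.f_bfree - vfs.f_bavail);
}
```

After a dd run without sudo this would typically still return a positive number on ext3, while after the sudo run above it should be at or near zero.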

So now df /mnt/internal/ shows:

    /dev/loop/0         3963  3963         0 100% /mnt/internal

Using the following code:

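    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/types.h>

    char buf[8192];

    int main(int argc, char *argv[])
    {
        ssize_t rv = -1;
        char *filename;

        if (argc < 2) {
            fprintf(stderr, "provide the filename\n");
            return EXIT_FAILURE;
        }
        filename = argv[1];

        /* Fill the buffer with random data. */
        int rd = open("/dev/urandom", O_RDONLY);
        if (rd < 0 || read(rd, buf, sizeof(buf)) < 0) {
            perror("/dev/urandom");
            return EXIT_FAILURE;
        }
        close(rd);

        /* O_SYNC makes each write reach the disk before write() returns. */
        int fd = open(filename, O_SYNC | O_RDWR);
        if (fd < 0) {
            perror(filename);
            return EXIT_FAILURE;
        }

        /* Overwrite the last 8 KiB of the file. The cast matters:
           -sizeof(buf) on its own is an unsigned value. */
        lseek(fd, -(off_t)sizeof(buf), SEEK_END);
        rv = write(fd, buf, sizeof(buf));
        if (rv < 0) {
            perror(filename);
            goto out;
        }

        /* Overwrite the first three bytes of the file. */
        lseek(fd, 0, SEEK_SET);
        rv = write(fd, "foo", 3);
        if (rv < 0)
            perror(filename);

    out:
        close(fd);
        return rv < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
    }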

Now: ./foo /mnt/internal/file.bin

Succeeds.

So the question is: how is this different from your environment?

Ahmed Masud answered Dec 29 '22 11:12