Given a file descriptor or file name, how do I know whether I can write to an arbitrary location without waiting for the intervening part to be explicitly zeroed out on disk?

How to test if sparse file is supported

2 Answers

You can stat() the file to obtain its size and number of disk blocks, seek a relatively small number of disk blocks past the end of the file, write a known number of blocks, then stat the file again. Compare the original number of disk blocks to the final number. Writing just a few disk blocks shouldn't take long even if the file system doesn't support sparse files.

Given the original and final number of disk blocks, you can then try to determine whether the file system supports sparse files. I say "try" because some filesystems can make this hard, for example ZFS with compression enabled.

Something like this:

#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

int check( const char *filename )
{
    struct stat sb;
    long blocksize;
    off_t filesize;
    blkcnt_t origblocks;
    char *buffer;
    int fd;

    fd = open( filename, O_CREAT | O_RDWR, 0644 );

    fstat( fd, &sb );
    blocksize = sb.st_blksize;
    filesize = sb.st_size;
    origblocks = sb.st_blocks;

    lseek( fd, 16UL * blocksize, SEEK_END );

    buffer = malloc( blocksize );
    memset( buffer, 0xAA, blocksize );

    write( fd, buffer, blocksize );
    fsync( fd );

    free( buffer );

    // kludge to give ZFS time to update metadata: spin until the
    // block count changes (never exits if the count stays the same)
    for ( ;; )
    {
        stat( filename, &sb );
        if ( sb.st_blocks != origblocks )
        {
            break;
        }
    }

    // st_fstype is a Solaris extension to struct stat; the casts make
    // the arguments match the format specifiers
    printf( "file: %s\n filesystem: %s\n blocksize: %ld\n size: %lld\n"
        " blocks: %lld\n orig blocks: %lld\n disk space: %lld\n",
        filename, sb.st_fstype, blocksize, ( long long ) sb.st_size,
        ( long long ) sb.st_blocks, ( long long ) origblocks,
        ( long long ) ( 512LL * sb.st_blocks ) );

    // return file to original size
    ftruncate( fd, filesize );
    close( fd );
    return( 0 );
}

int main( int argc, char **argv )
{
    for ( int ii = 1; ii < argc; ii++ )
    {
        check( argv[ ii ] );
    }

    return( 0 );
}

(error checking is omitted for clarity)

ZFS with compression enabled doesn't seem to update the file metadata quickly, hence the busy-wait loop that spins until the changes appear.

When run on a Solaris 11 box with the files asdf (ZFS file system, compression enabled), /tmp/asdf (tmpfs file system), and /var/tmp/asdf (ZFS, no compression), that code produces the following output:

file: asdf
 filesystem: zfs
 blocksize: 131072
 size: 2228224
 blocks: 10
 orig blocks: 1
 disk space: 5120
file: /tmp/asdf
 filesystem: tmpfs
 blocksize: 4096
 size: 69632
 blocks: 136
 orig blocks: 0
 disk space: 69632
file: /var/tmp/asdf
 filesystem: zfs
 blocksize: 131072
 size: 2228224
 blocks: 257
 orig blocks: 1
 disk space: 131584

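From that output, it should be obvious that /tmp/asdf is on a file system that doesn't support sparse files: its 136 blocks of 512 bytes come to 69632 bytes, which is the entire file size, including the 16 blocks that were seeked over. /var/tmp/asdf is on a file system that does support such files: it gained only 256 blocks (131072 bytes, exactly the one block that was written) while the file size grew to 2228224 bytes.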

And plain asdf is on something else entirely, where writing 128 kB of data adds all of 9 512-byte disk blocks. From that, you can infer that there's some sort of compression going on in the filesystem. Offhand, I suspect it's pretty safe to assume any filesystem that supports such native compression is also going to support sparse files.
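The reasoning applied to the three results above can be written down as a small helper. This is only a sketch: the function name, the enum, and the thresholds are illustrative, and it assumes st_blocks counts 512-byte units, as the test program does.

```c
/* Possible readings of the before/after block counts (names are illustrative). */
enum sparse_verdict { HOLE_ALLOCATED, HOLE_SKIPPED, INCONCLUSIVE };

/*
 * orig_blocks, final_blocks: st_blocks before and after the write,
 *                            assumed to be in 512-byte units.
 * hole_bytes:    size of the gap that was seeked over.
 * written_bytes: number of bytes written after the gap.
 */
enum sparse_verdict classify_blocks(long long orig_blocks, long long final_blocks,
                                    long long hole_bytes, long long written_bytes)
{
    long long added_bytes = 512LL * (final_blocks - orig_blocks);

    if (added_bytes >= hole_bytes + written_bytes)
        return HOLE_ALLOCATED;  /* the gap is backed by disk blocks: not sparse */
    if (added_bytes >= written_bytes)
        return HOLE_SKIPPED;    /* only the written data was allocated: sparse */
    return INCONCLUSIVE;        /* less than was written: compression or similar */
}
```

With the numbers from the output, /tmp/asdf (0 to 136 blocks, 4096-byte block size) comes out as HOLE_ALLOCATED, /var/tmp/asdf (1 to 257 blocks, 131072-byte block size) as HOLE_SKIPPED, and the compressed asdf (1 to 10 blocks) as INCONCLUSIVE.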

And the fastest way to determine if a filesystem supports sparse files when given a filename or open file descriptor is to call stat() on the filename or fstat() on the file descriptor, obtain the st_fstype field from the struct stat, and compare the file's filesystem type to a set of strings of filesystem types known to support sparse files.
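A minimal sketch of that lookup might look like the following. The function name and the contents of the list are hypothetical; only "zfs" is backed by the output above, so fill the list with file system types you have verified yourself. Note also that st_fstype is not one of the struct stat members POSIX requires, so the caller is assumed to be on a system such as Solaris that provides it.

```c
#include <string.h>

/* Hypothetical list: extend it with file system types you have verified. */
static const char *const sparse_fs_types[] = { "zfs", "ufs" };

/* Returns 1 if fstype matches an entry in the list above, 0 otherwise. */
int fstype_known_sparse(const char *fstype)
{
    size_t n = sizeof sparse_fs_types / sizeof sparse_fs_types[0];

    for (size_t i = 0; i < n; i++) {
        if (strcmp(fstype, sparse_fs_types[i]) == 0)
            return 1;
    }
    return 0;
}
```

On Solaris this would be called as fstype_known_sparse( sb.st_fstype ) after a successful stat() or fstat().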

answered Sep 20 '22 00:09

Andrew Henle

This is a very naive interactive command-line test, but if du and du --apparent-size report different sizes, you can be pretty sure that the filesystem supports sparse files.

E.g. on an ext4 partition, when I do:

dd seek=1G if=/dev/zero of=f bs=1 count=1 status=none
du --block-size=1 f
du --block-size=1 --apparent-size f

it gives me:

8192    f
1073741825      f

So the file, with an apparent size of 1 GiB, actually takes up only 8 KiB on disk, which implies that a sparse file was created.
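The same comparison can be made from C on a struct stat that has already been filled in by stat() or fstat(). This is a rough sketch, not what du does internally: the function name is made up, and it assumes st_blocks is counted in 512-byte units.

```c
#include <sys/stat.h>

/*
 * Returns 1 if the file occupies fewer bytes on disk than its apparent
 * size, 0 otherwise. Assumes st_blocks is in 512-byte units.
 */
int allocated_less_than_apparent(const struct stat *sb)
{
    long long allocated = 512LL * (long long) sb->st_blocks;

    return allocated < (long long) sb->st_size;
}
```

For the file created by dd above (8192 bytes allocated, 1073741825 bytes apparent) this returns 1. Keep in mind that a compressed file can also occupy less than its apparent size, so a positive result is a hint rather than proof.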

See also: why is the output of `du` often so different from `du -b`

answered Sep 21 '22 00:09

Ciro Santilli 新疆再教育营六四事件法轮功郝海东

Tags:

io

unix

posix

filesystems

sparse-file

Siyuan Ren
