dd: How to calculate optimal blocksize? [closed]

People also ask

What is a good BS for dd?

The best block size was 8388608 : 3.582 GB bytes/sec. CORRECTION: I've run the second script, testing read performance, on a 2015 rMBP with 512GB SSD. The best block size was 524288 (5.754 GB/sec). The second-best block size was 131072 (5.133 GB/sec).

What is the optimal block size?

Since size of a transaction is taken as 1.2 Kb, hence the results show that an optimal block size is of 255 Kbytes when network bandwidth of miners varies from 250 kbps to 1200 kbps. This shall achieve lower block transmission time and block composition time.

What does BS 1M mean?

bs= sets the blocksize, for example bs=1M would be 1MiB blocksize. count= copies only this number of blocks (the default is for dd to keep going forever or until the input runs out).

Does block size matter dd?

Using the blocksize option on dd effectively specifies how much data will get copied to memory from the input I/O sub-system before attempting to write back to the output I/O sub-system.

The optimal block size depends on various factors, including the operating system (and its version), and the various hardware buses and disks involved. Several Unix-like systems (including Linux and at least some flavors of BSD) define the st_blksize member in the struct stat that gives what the kernel thinks is the optimal block size:

#include <sys/stat.h>
#include <stdio.h>

int main(void)
{
    struct stat stats;

    if (!stat("/", &stats))
    {
        printf("%u\n", stats.st_blksize);
    }
}

The best way may be to experiment: copy a gigabyte with various block sizes and time that. (Remember to clear kernel buffer caches before each run: echo 3 > /proc/sys/vm/drop_caches).

However, as a rule of thumb, I've found that a large enough block size lets dd do a good job, and the differences between, say, 64 KiB and 1 MiB are minor, compared to 4 KiB versus 64 KiB. (Though, admittedly, it's been a while since I did that. I use a mebibyte by default now, or just let dd pick the size.)

As others have said, there is no universally correct block size; what is optimal for one situation or one piece of hardware may be terribly inefficient for another. Also, depending on the health of the disks it may be preferable to use a different block size than what is "optimal".

One thing that is pretty reliable on modern hardware is that the default block size of 512 bytes tends to be almost an order of magnitude slower than a more optimal alternative. When in doubt, I've found that 64K is a pretty solid modern default. Though 64K usually isn't THE optimal block size, in my experience it tends to be a lot more efficient than the default. 64K also has a pretty solid history of being reliably performant: You can find a message from the Eug-Lug mailing list, circa 2002, recommending a block size of 64K here: http://www.mail-archive.com/[email protected]/msg12073.html

For determining THE optimal output block size, I've written the following script that tests writing a 128M test file with dd at a range of different block sizes, from the default of 512 bytes to a maximum of 64M. Be warned, this script uses dd internally, so use with caution.

dd_obs_test.sh:

#!/bin/bash

# Since we're dealing with dd, abort if any errors occur
set -e

TEST_FILE=${1:-dd_obs_testfile}
TEST_FILE_EXISTS=0
if [ -e "$TEST_FILE" ]; then TEST_FILE_EXISTS=1; fi
TEST_FILE_SIZE=134217728

if [ $EUID -ne 0 ]; then
  echo "NOTE: Kernel cache will not be cleared between tests without sudo. This will likely cause inaccurate results." 1>&2
fi

# Header
PRINTF_FORMAT="%8s : %s\n"
printf "$PRINTF_FORMAT" 'block size' 'transfer rate'

# Block sizes of 512b 1K 2K 4K 8K 16K 32K 64K 128K 256K 512K 1M 2M 4M 8M 16M 32M 64M
for BLOCK_SIZE in 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 1048576 2097152 4194304 8388608 16777216 33554432 67108864
do
  # Calculate number of segments required to copy
  COUNT=$(($TEST_FILE_SIZE / $BLOCK_SIZE))

  if [ $COUNT -le 0 ]; then
    echo "Block size of $BLOCK_SIZE estimated to require $COUNT blocks, aborting further tests."
    break
  fi

  # Clear kernel cache to ensure more accurate test
  [ $EUID -eq 0 ] && [ -e /proc/sys/vm/drop_caches ] && echo 3 > /proc/sys/vm/drop_caches

  # Create a test file with the specified block size
  DD_RESULT=$(dd if=/dev/zero of=$TEST_FILE bs=$BLOCK_SIZE count=$COUNT conv=fsync 2>&1 1>/dev/null)

  # Extract the transfer rate from dd's STDERR output
  TRANSFER_RATE=$(echo $DD_RESULT | \grep --only-matching -E '[0-9.]+ ([MGk]?B|bytes)/s(ec)?')

  # Clean up the test file if we created one
  if [ $TEST_FILE_EXISTS -ne 0 ]; then rm $TEST_FILE; fi

  # Output the result
  printf "$PRINTF_FORMAT" "$BLOCK_SIZE" "$TRANSFER_RATE"
done

View on GitHub

I've only tested this script on a Debian (Ubuntu) system and on OSX Yosemite, so it will probably take some tweaking to make work on other Unix flavors.

By default the command will create a test file named dd_obs_testfile in the current directory. Alternatively, you can provide a path to a custom test file by providing a path after the script name:

$ ./dd_obs_test.sh /path/to/disk/test_file

The output of the script is a list of the tested block sizes and their respective transfer rates like so:

$ ./dd_obs_test.sh
block size : transfer rate
       512 : 11.3 MB/s
      1024 : 22.1 MB/s
      2048 : 42.3 MB/s
      4096 : 75.2 MB/s
      8192 : 90.7 MB/s
     16384 : 101 MB/s
     32768 : 104 MB/s
     65536 : 108 MB/s
    131072 : 113 MB/s
    262144 : 112 MB/s
    524288 : 133 MB/s
   1048576 : 125 MB/s
   2097152 : 113 MB/s
   4194304 : 106 MB/s
   8388608 : 107 MB/s
  16777216 : 110 MB/s
  33554432 : 119 MB/s
  67108864 : 134 MB/s

(Note: The unit of the transfer rates will vary by OS)

To test optimal read block size, you could use more or less the same process, but instead of reading from /dev/zero and writing to the disk, you'd read from the disk and write to /dev/null. A script to do this might look like so:

dd_ibs_test.sh:

#!/bin/bash

# Since we're dealing with dd, abort if any errors occur
set -e

TEST_FILE=${1:-dd_ibs_testfile}
if [ -e "$TEST_FILE" ]; then TEST_FILE_EXISTS=$?; fi
TEST_FILE_SIZE=134217728

# Exit if file exists
if [ -e $TEST_FILE ]; then
  echo "Test file $TEST_FILE exists, aborting."
  exit 1
fi
TEST_FILE_EXISTS=1

if [ $EUID -ne 0 ]; then
  echo "NOTE: Kernel cache will not be cleared between tests without sudo. This will likely cause inaccurate results." 1>&2
fi

# Create test file
echo 'Generating test file...'
BLOCK_SIZE=65536
COUNT=$(($TEST_FILE_SIZE / $BLOCK_SIZE))
dd if=/dev/urandom of=$TEST_FILE bs=$BLOCK_SIZE count=$COUNT conv=fsync > /dev/null 2>&1

# Header
PRINTF_FORMAT="%8s : %s\n"
printf "$PRINTF_FORMAT" 'block size' 'transfer rate'

# Block sizes of 512b 1K 2K 4K 8K 16K 32K 64K 128K 256K 512K 1M 2M 4M 8M 16M 32M 64M
for BLOCK_SIZE in 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 1048576 2097152 4194304 8388608 16777216 33554432 67108864
do
  # Clear kernel cache to ensure more accurate test
  [ $EUID -eq 0 ] && [ -e /proc/sys/vm/drop_caches ] && echo 3 > /proc/sys/vm/drop_caches

  # Read test file out to /dev/null with specified block size
  DD_RESULT=$(dd if=$TEST_FILE of=/dev/null bs=$BLOCK_SIZE 2>&1 1>/dev/null)

  # Extract transfer rate
  TRANSFER_RATE=$(echo $DD_RESULT | \grep --only-matching -E '[0-9.]+ ([MGk]?B|bytes)/s(ec)?')

  printf "$PRINTF_FORMAT" "$BLOCK_SIZE" "$TRANSFER_RATE"
done

# Clean up the test file if we created one
if [ $TEST_FILE_EXISTS -ne 0 ]; then rm $TEST_FILE; fi

View on GitHub

An important difference in this case is that the test file is a file that is written by the script. Do not point this command at an existing file or the existing file will be overwritten with zeroes!

For my particular hardware I found that 128K was the most optimal input block size on a HDD and 32K was most optimal on a SSD.

Though this answer covers most of my findings, I've run into this situation enough times that I wrote a blog post about it: http://blog.tdg5.com/tuning-dd-block-size/ You can find more specifics on the tests I performed there.

I've found my optimal blocksize to be 8 MB (equal to disk cache?) I needed to wipe (some say: wash) the empty space on a disk before creating a compressed image of it. I used:

cd /media/DiskToWash/
dd if=/dev/zero of=zero bs=8M; rm zero

I experimented with values from 4K to 100M.

After letting dd to run for a while I killed it (Ctlr+C) and read the output:

36+0 records in
36+0 records out
301989888 bytes (302 MB) copied, 15.8341 s, 19.1 MB/s

As dd displays the input/output rate (19.1MB/s in this case) it's easy to see if the value you've picked is performing better than the previous one or worse.

My scores:

bs=   I/O rate
---------------
4K    13.5 MB/s
64K   18.3 MB/s
8M    19.1 MB/s <--- winner!
10M   19.0 MB/s
20M   18.6 MB/s
100M  18.6 MB/s

Note: To check what your disk cache/buffer size is, you can use sudo hdparm -i /dev/sda

This is totally system dependent. You should experiment to find the optimum solution. Try starting with bs=8388608. (As Hitachi HDDs seems to have 8MB cache.)

Related questions
                            
                                Remove redundant paths from $PATH variable
                            
                                How do I change bash history completion to complete what's already on the line?
                            
                                Make $JAVA_HOME easily changable in Ubuntu [closed]
                            
                                In Python script, how do I set PYTHONPATH?
                            
                                How can I find out a file's MIME type (Content-Type)?
                            
                                Counter increment in Bash loop not working
                            
                                Hiding user input on terminal in Linux script
                            
                                How do I pass a command line argument while starting up GDB in Linux? [duplicate]
                            
                                Ruby: Installing rmagick on Ubuntu
                            
                                Automatically run a program on startup under Linux Ubuntu [closed]
                            
                                How to check if smtp is working from commandline (Linux) [closed]
                            
                                Can you attach Amazon EBS to multiple instances?
                            
                                How to copy commits from one Git repo to another?
                            
                                Are tar.gz and tgz the same thing? [closed]
                            
                                Compress files while reading data from STDIN
                            
                                Bash script to calculate time elapsed
                            
                                What is the difference between mutex and critical section?
                            
                                How can I calculate an MD5 checksum of a directory?
                            
                                Writing programs to cope with I/O errors causing lost writes on Linux
                            
                                unix domain socket VS named pipes?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

dd: How to calculate optimal blocksize? [closed]

Tags:

linux

dd

People also ask

Recent Activity

Donate For Us