Can I use /dev/sda just as an ordinary sequential file?

Tags:

c++

linux

I need better bulk write performance than I can get from a partitioned SSD formatted with an ext4 filesystem. When I benchmark with the dd command, I see roughly a 20% improvement with

time dd if=/dev/zero of=/dev/sdb count=1024 bs=1048576

in comparison to just

time dd if=/dev/zero of=/mnt/test.img count=1024 bs=1048576 && sync

where /mnt is my mounted /dev/sda1.

Assuming the hard drive is dedicated exclusively to my application and I can set permissions for it, can I simply open the /dev/sda from my C++ application and use it as an ordinary file? I mean, write data from beginning, then open again and read:

  ofstream myfile;
  myfile.open ("/dev/sda");
  myfile << "Writing this to a file.\n";
  myfile.close();

and then reopen and read in the same spirit. If it is not clear where my written data ends, I can write an end-of-data marker myself.

I would assume yes, because it is expected to behave like a file. However, I would like to check whether there are any significant hidden problems with this approach.

h23 asked Apr 16 '19 10:04



1 Answer

/dev/sda typically represents a block device. Contrast that with, e.g., /dev/tty and /dev/zero (both character devices), /proc/self/fd/0 (a pseudo-file), or /home/inetknght/file (a regular file).

Different devices have different characteristics. Block devices read and write in blocks. The size of the block depends on the device itself. That might be emulated, though; e.g., you might have a disk image file added via a hypervisor, and the hypervisor emulates block access to it. A lot of block devices expose block sizes of 512 bytes or 4K bytes. Some block devices are wrappers, like the hypervisor or a RAID setup. Both will often configure a separate block size better suited to the controller's performance.

Contrast that with normal files which are usually simple data streams with an associated size. A file stream written on a block device has a lot of behind-the-scenes activity to translate between the two: how many blocks b are needed for data of size n? That's what a filesystem does: generally translate blocks of data by allocating however many blocks are necessary for the size of the file, possibly by over-allocating. Additional metadata about that is stored in the filesystem data tree which populates separate blocks on the device.
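The "how many blocks b are needed for data of size n?" question above comes down to a ceiling division. A minimal sketch, assuming fixed-size blocks and ignoring the metadata blocks a real filesystem also allocates; blocks_needed is a hypothetical helper name, not part of any filesystem API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of fixed-size blocks needed to hold n bytes.
// Real filesystems also spend blocks on metadata, which this ignores.
inline std::uint64_t blocks_needed(std::uint64_t n, std::uint64_t block_size) {
    assert(block_size > 0);
    return (n + block_size - 1) / block_size;  // integer division, rounded up
}
```

For example, blocks_needed(500, 512) is 1 and blocks_needed(513, 512) is 2, so even one byte past a block boundary costs a whole extra block.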

The performance improvement you're seeing is likely to be the removal of the filesystem. Filesystems often have some (sometimes significant) overhead of use, but they simplify the lower level stuff they're built upon, such as block devices. Simple code is far easier to maintain. Using a different filesystem will give you different performance characteristics. So you might not need the added complexity from going to a lower level.

You might be able to write to a block device as if you were writing to a streaming device. If the underlying device is truly a block device, though, what happens when you write a number of bytes that isn't divisible by the device's block size? Suppose the block size is 512 bytes (fairly typical, as is 4K) and you write 500 bytes. What will the device do with the other 12 bytes? That depends on the device: it might overwrite them with zeroes, it might leave them alone, or it might have written your data to a block-sized cache location, in which case those 12 bytes get whatever a previous block left in that cache location. This is just one example of the simplifications that filesystems provide.
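One way to sidestep the "what happens to the other 12 bytes" question is to decide it yourself and always hand over whole blocks. A minimal sketch, assuming you already know the block size; pad_to_block is a hypothetical helper name:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: copy the payload into a buffer whose size is a whole
// number of blocks, zero-filling the tail so nothing is left for the device
// or a cache to decide.
inline std::vector<char> pad_to_block(const std::string& payload, std::size_t block_size) {
    assert(block_size > 0);
    const std::size_t blocks = (payload.size() + block_size - 1) / block_size;
    std::vector<char> out(blocks * block_size, '\0');  // zero-filled
    std::copy(payload.begin(), payload.end(), out.begin());
    return out;
}
```

With a 512-byte block size, a 500-byte payload becomes a 512-byte buffer whose last 12 bytes are zero.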

So: you've expressed a question about how raw device files work. You've also said that you've got full access to the machine. I think the best way for you to learn would be to just play with it and see what you discover.

I happen to be in the middle of setting up a RAID in my spare time using some drives in USB enclosures. Not exactly ideal, but I think it's fun. I'll demonstrate some basic functionality. If I damage something, I'll just wipe it later. ;)

firefly@firefly:~$ ls -lah /dev/sd*
brw-rw---- 1 root disk 8,  0 Apr 16 11:53 /dev/sda
brw-rw---- 1 root disk 8, 16 Apr 16 11:53 /dev/sdb
brw-rw---- 1 root disk 8, 32 Apr 16 11:54 /dev/sdc
brw-rw---- 1 root disk 8, 48 Apr 16 11:54 /dev/sdd

The four devices in my not-yet-setup-raid. I'll pick on /dev/sda here.

The file command is pretty handy for discovering general information about various files.

firefly@firefly:~$ file /dev/sda
/dev/sda: block special (8/0)

...but it tells me nothing special about this file.
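The same classification that file prints can be obtained from C++ via the POSIX stat() call. A minimal sketch; file_kind is a hypothetical helper name, and the strings it returns are my own labels rather than anything standardized:

```cpp
#include <sys/stat.h>

#include <cassert>
#include <string>

// Hypothetical helper: classify a path by the file type stored in its inode.
inline std::string file_kind(const char* path) {
    assert(path != nullptr);
    struct stat st {};
    if (::stat(path, &st) != 0) return "error";  // e.g. missing path or no permission
    if (S_ISBLK(st.st_mode)) return "block device";
    if (S_ISCHR(st.st_mode)) return "character device";
    if (S_ISREG(st.st_mode)) return "regular file";
    if (S_ISDIR(st.st_mode)) return "directory";
    return "other";
}
```

On a machine like the one above, file_kind("/dev/sda") should report a block device, while /dev/null reports a character device.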

Touch will tell me if I can write to the file.

firefly@firefly:~$ touch /dev/sda
touch: cannot touch '/dev/sda': Permission denied

You already knew you need special permissions to write to it. Glad I don't care about this machine, so I'll just drop right into root and try again. It's generally bad practice to run as root, but I'm on a system I literally don't care about and will wipe in my spare time anyway.

firefly@firefly:~$ sudo su -
root@firefly:~# touch /dev/sda
root@firefly:~# echo $?
0
root@firefly:~# ls -lah /dev/sd*
brw-rw---- 1 root disk 8,  0 Apr 18 04:45 /dev/sda
brw-rw---- 1 root disk 8, 16 Apr 16 11:53 /dev/sdb
brw-rw---- 1 root disk 8, 32 Apr 16 11:54 /dev/sdc
brw-rw---- 1 root disk 8, 48 Apr 16 11:54 /dev/sdd

Updated timestamp, and of course root is able to write to it. A little bit of googling and I discover there's a command, /sbin/blockdev, that lets me read and write some block device ioctls.

That sounds cool.

root@firefly:~# blockdev --getiomin /dev/sda
4096
root@firefly:~# blockdev --getioopt /dev/sda
33553920
root@firefly:~# blockdev --getbsz /dev/sda
4096

Nice! So I've discovered that my block device has a block size of 4K (indicated by blockdev --getbsz and supported by blockdev --getiomin). I'm not sure about that --getioopt reporting just under 32MiB to be the optimal IO size. That's kinda weird. I'm not going to worry about it.
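An application can ask similar questions directly through ioctl(). A minimal, Linux-only sketch, assuming the BLKGETSIZE64 and BLKSSZGET requests from <linux/fs.h> (these correspond to blockdev --getsize64 and --getss, not to the three options shown above); BlockInfo and query_block_device are hypothetical names:

```cpp
#include <fcntl.h>
#include <linux/fs.h>   // BLKGETSIZE64, BLKSSZGET (Linux-specific)
#include <sys/ioctl.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>
#include <optional>

struct BlockInfo {
    std::uint64_t size_bytes;   // total device size in bytes
    int logical_sector_bytes;   // logical sector size in bytes
};

// Hypothetical helper: returns std::nullopt if the path cannot be opened
// or if the ioctls fail, which is what happens on anything that is not a
// block device.
inline std::optional<BlockInfo> query_block_device(const char* path) {
    assert(path != nullptr);
    const int fd = ::open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0) return std::nullopt;
    BlockInfo info{};
    const bool ok = ::ioctl(fd, BLKGETSIZE64, &info.size_bytes) == 0 &&
                    ::ioctl(fd, BLKSSZGET, &info.logical_sector_bytes) == 0;
    ::close(fd);
    if (!ok) return std::nullopt;
    return info;
}
```

Calling query_block_device("/dev/sda") as root would be the analogue of the blockdev calls above; on a regular file or a character device it simply returns an empty optional.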

Okay, let's step back for a moment.

dd on the other hand copies blocks of information. That's perfect for block devices! But your question about treating the block device as a file would be better suited by actually treating it as a file. So stop using dd.

What do I get if I read raw data from the device? Remember raw data is garbled on a text console, so I'll pipe it through xxd to provide a hexdump.

root@firefly:~# head -c 100 /dev/sda | xxd
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000                                ....

So here's some secret sauce: head normally reads the first 10 lines. I changed it to read the first 100 bytes. Since the drive was freshly formatted to zeroes, head by itself would have read the entire device because it does not contain a single newline. That would have taken hours (it's an 8TB spinning disk).

So let's have some fun with this super-large "file":

root@firefly:~# echo "hello world" > /dev/sda && head -c 16 /dev/sda | xxd
00000000: 6865 6c6c 6f20 776f 726c 640a 0000 0000  hello world.....

Neat. Echoing to the device overwrote the first zeroes with hello world. Echo's not quite dd, so that sounds fun.

root@firefly:~# echo "goodbye" > /dev/sda && head -c 16 /dev/sda | xxd
00000000: 676f 6f64 6279 650a 726c 640a 0000 0000  goodbye.rld.....

You can see writing "goodbye" overwrote only part of hello world. That's fine; I expected it. You should be aware of your block device's behavior: it might have overwritten everything else in the same block with zeroes.
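The same partial-overwrite behavior can be reproduced safely on a scratch file. A minimal sketch, assuming a regular file stands in for the device: the shell's > would truncate a regular file first, so the sketch opens with in | out, which neither truncates nor creates. overwrite_at_start and slurp are hypothetical helper names:

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: overwrite bytes at the start of an existing file
// without truncating it.
inline bool overwrite_at_start(const std::string& path, const std::string& data) {
    assert(!path.empty());
    std::fstream f{path, std::ios::in | std::ios::out | std::ios::binary};
    if (!f) return false;
    f.write(data.data(), static_cast<std::streamsize>(data.size()));
    f.flush();
    return static_cast<bool>(f);
}

// Hypothetical helper: read a whole (small) file into a string.
inline std::string slurp(const std::string& path) {
    std::ifstream f{path, std::ios::binary};
    return std::string(std::istreambuf_iterator<char>(f), std::istreambuf_iterator<char>());
}
```

If the scratch file starts out containing "hello world\n" and you call overwrite_at_start with "goodbye\n", reading it back gives "goodbye\nrld\n", matching the hexdump above.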

Clearly bash and echo seem to work just fine with the device file. I wonder about other languages? Your question is tagged with [C++] so let's try that:

root@firefly:~# g++ -x c++ -std=c++17 - <<EOF
> #include <cerrno>
> #include <cstdlib>
> #include <cstring>
> #include <fstream>
> #include <iostream>
> 
> int main(){
>     std::fstream f{"/dev/sda", std::ios_base::binary};
>     if ( false == f.good() ){
>         // C++ standard library does not let you inspect _why_ a failure occurred
>         // to get that we would have to use ::open() and check errno.
>         auto err = errno;
>         std::cerr << "unable to open /dev/sda: " << err << ": " << strerror(err) << std::endl;
>         std::cerr << f.good() << f.bad() << f.eof() << f.fail() << std::endl;
>         return EXIT_FAILURE;
>     }
>     std::cout << "opened!" << std::endl;
>     return EXIT_SUCCESS;
> }
> EOF
root@firefly:~# ./a.out 
unable to open /dev/sda: 0: Success
0001

There's a bit of information here. First: I compiled the application by feeding g++ the source code through a bash heredoc. That's good to know for Linux users and developers. If you're unfamiliar with it, you can strip the leading "> " prompt from every line between the EOF markers, save the result to a file, and compile that.

However, the important bit is that opening the file using std::fstream failed. Whoa now! We saw echo worked just fine! Why the difference?! I suspect it goes back to what I said about block devices being different. But I don't know the answer to that. I suspect that getting errno will tell me more information. Let's try that:

root@firefly:~# g++ -x c++ -std=c++17 - <<EOF
> #include <cerrno>
> #include <cstdio>
> #include <cstdlib>
> #include <cstring>
> #include <fstream>
> #include <functional>
> #include <iostream>
> #include <memory>
> 
> using FILEPTR = std::unique_ptr<std::FILE, decltype(&::std::fclose)>;
> 
> int main(){
>     FILEPTR f{nullptr, &::std::fclose};
>     // On POSIX systems fopen() treats text and binary mode the same (the "b" flag is ignored).
>     f.reset(std::fopen("/dev/sda", "w+"));
>     if ( nullptr == f ){
>         auto err = errno;
>         std::cerr << "unable to open /dev/sda: " << err << ": " << strerror(err) << std::endl;
>         return EXIT_FAILURE;
>     }
>     std::cout << "opened!" << std::endl;
>     return EXIT_SUCCESS;
> }
> EOF
root@firefly:~# ./a.out 
opened!

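Whoa, wait a minute: that worked. So std::fstream could not open the block device but std::fopen() could?! A likely explanation is the open mode rather than the device: I passed only std::ios_base::binary to std::fstream, which replaces the default in | out, and binary on its own is not a valid open mode, so the open fails before any system call is made (which would also explain errno being 0). I'd want to re-test with in | out | binary before blaming the device. Either way, this should point you in the right direction. I'll leave you with a quick read/write example: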
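Whether the open mode alone can make std::fstream fail is easy to check on an ordinary file, with no block device involved. A minimal sketch; can_open is a hypothetical helper name:

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: report whether std::fstream can open `path`
// with exactly the given mode (the mode replaces the default in | out).
inline bool can_open(const std::string& path, std::ios_base::openmode mode) {
    assert(!path.empty());
    std::fstream f{path, mode};
    return f.is_open();
}
```

For an existing, writable regular file, can_open(path, std::ios_base::binary) returns false, while can_open(path, std::ios_base::in | std::ios_base::out | std::ios_base::binary) returns true. That suggests the std::fstream failure above says more about the mode argument than about block devices, though it does not prove std::fstream behaves well on /dev/sda.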

root@firefly:~# g++ -x c++ -std=c++17 - <<EOF
> extern "C" {
> #include <unistd.h>
> } // extern "C"
> 
> #include <algorithm>
> #include <array>
> #include <cerrno>
> #include <cstdio>
> #include <cstdlib>
> #include <cstring>
> #include <fstream>
> #include <functional>
> #include <iostream>
> #include <memory>
> #include <string_view>
> 
> using FILEPTR = std::unique_ptr<std::FILE, decltype(&::std::fclose)>;
> 
> int main(){
>     FILEPTR f{nullptr, &::std::fclose};
>     // On POSIX systems fopen() treats text and binary mode the same (the "b" flag is ignored).
>     f.reset(std::fopen("/dev/sda", "w+"));
>     if ( nullptr == f ){
>         auto err = errno;
>         std::cerr << "unable to open /dev/sda: " << err << ": " << strerror(err) << std::endl;
>         return EXIT_FAILURE;
>     }
>     std::cout << "opened!" << std::endl;
> 
>     std::cout << "ftell(): " << std::ftell(f.get()) << '\n';
>     if ( 0 != std::fseek(f.get(), 0, SEEK_END) ) {
>         auto err = errno;
>         std::cerr << "unable to fseek(): " << err << ": " << std::strerror(err) << std::endl;
>         return EXIT_FAILURE;
>     }
>     std::cout << "ftell(SEEK_END): " << std::ftell(f.get()) << '\n';
>     std::rewind(f.get());
> 
>     // I thought about putting it on the stack, but it might exceed stack
>     // size on some platforms.
>     using buffer_type = std::array<char, 4096>;
>     using bufferptr = std::unique_ptr<buffer_type>;
>     bufferptr buffer = std::make_unique<buffer_type>();
>     if (gethostname(buffer->data(), buffer->size()) < 0) {
>         // sizeof includes the terminating null byte, so it gets written too
>         static constexpr char msg[] = "unable to get hostname";
>         std::fwrite(msg, 1u, sizeof msg, f.get());
>     } else {
>         // ugh. boost::asio makes this simpler but I'll leave it to you to figure out.
>         if ( buffer->end() == std::find(buffer->begin(), buffer->end(), '\0') ){
>             std::cout << "buffer truncated" << std::endl;
>             buffer->back() = '\0';
>         }
>         std::fwrite(buffer->data(), 1u, buffer->size(), f.get());
>     }
>     if ( 0 != std::fflush(f.get()) ) {
>         int err = errno;
>         std::cerr << "fflush() failed: " << err << ": " << std::strerror(err) << std::endl;
>         return EXIT_FAILURE;
>     }
>     std::rewind(f.get());
> 
>     // reset our local internal buffer
>     std::fill(buffer->begin(), buffer->end(), '\0');
> 
>     // read into it
>     std::fread(buffer->data(), 1u, buffer->size(), f.get());
> 
>     // find where the disk's zeroes start. if we truncated, then they should start
>     // literally on the last byte in the buffer, since we set that manually.
>     std::string_view read_message{buffer->data(), (std::size_t)std::distance(buffer->begin(), std::find(buffer->begin(), buffer->end(), '\0'))};
>     std::cout << read_message << std::endl;
> 
>     return EXIT_SUCCESS;
> }
> EOF
root@firefly:~# ./a.out 
opened!
ftell(): 0
ftell(SEEK_END): 8001563222016
firefly

Perfect. So it discovered that the drive advertised as 8TB holds 8,001,563,222,016 bytes, which is about 7.28TiB (marketing departments love the difference between terabyte and tebibyte). I was able to successfully write and read back the system hostname using C++. And I've (briefly) touched on some information for you to learn about performance tuning block devices. I am curious about what kind of performance you get out of std::FILE*, or if you discover something different.
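The terabyte versus tebibyte arithmetic for the size reported by ftell() can be checked in a few lines. A minimal sketch; in_units, kTerabyte and kTebibyte are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kTerabyte = 1000000000000ULL;  // 10^12 bytes, as on the drive label
constexpr std::uint64_t kTebibyte = 1ULL << 40;        // 2^40 bytes, as many tools report

// Hypothetical helper: express a byte count in the given unit.
inline double in_units(std::uint64_t bytes, std::uint64_t unit_bytes) {
    assert(unit_bytes > 0);
    return static_cast<double>(bytes) / static_cast<double>(unit_bytes);
}
```

For 8001563222016 bytes this gives just over 8 when divided by kTerabyte and roughly 7.28 when divided by kTebibyte, so the label and the tools are both right, just in different units.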

You're going down to a level low enough that it will likely get harder to find simple answers to questions. What other limitations are there when directly using a block device? I'm pretty sure (though not 100%) that something between my code and the disk is dealing with my reads and writes not aligning to the disk's block size: std::FILE* buffers in user space, and for ordinary opens the kernel's page cache sits between that and the device. That's cool. But it leaves me wondering: how can I turn that off to try to get even more performance? My first guess would be to use ::open(), ::read(), ::write(), etc. with native file descriptors. That would throw away a lot of syntactic sugar that's already been well tested; I'm not sure I'd want to reinvent the wheel here. Indeed, the manual page for ::open() specifically calls out some information related to dealing with block devices, such as the O_DIRECT flag and its alignment requirements.

So the tl;dr is that it's complicated. Yes, you can read from and write to it (given sufficient permissions). No, not everything works "right" if you're expecting it to behave like a regular file. Specifically, my std::fstream attempt failed while std::FILE* worked, although that may come down to the open mode I passed rather than to block devices as such. You will also need to frame your data manually, because the device has no notion of where your data ends. If you drop down to lower-level I/O functions such as ::open() and ::write(), it will no doubt work, but with even more limitations or performance complications. This whole reply assumes you're using Linux; a different OS could have different behavior. And of course different block devices may also have different behavior (I'm using spinning rust but you mentioned using an SSD).
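As for framing: since the device has a fixed size and no end-of-file where your data stops, one common alternative to an end-of-data marker is a length prefix. A minimal sketch, assuming a made-up layout of an 8-byte little-endian length followed by the payload; write_frame, read_frame and round_trip are hypothetical names, and the functions take generic streams so they can be tried on an in-memory stream first:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <optional>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical layout: 8-byte little-endian payload length, then the payload.
inline void write_frame(std::ostream& out, const std::string& payload) {
    const std::uint64_t n = payload.size();
    char header[8];
    for (int i = 0; i < 8; ++i) header[i] = static_cast<char>((n >> (8 * i)) & 0xFF);
    out.write(header, sizeof header);
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}

// Returns std::nullopt if the header or payload cannot be read in full, or
// if the stored length exceeds max_len (e.g. leftover garbage on the device).
inline std::optional<std::string> read_frame(std::istream& in, std::uint64_t max_len) {
    char header[8];
    if (!in.read(header, sizeof header)) return std::nullopt;
    std::uint64_t n = 0;
    for (int i = 0; i < 8; ++i)
        n |= static_cast<std::uint64_t>(static_cast<unsigned char>(header[i])) << (8 * i);
    if (n > max_len) return std::nullopt;
    std::string payload(static_cast<std::size_t>(n), '\0');
    if (n > 0 && !in.read(&payload[0], static_cast<std::streamsize>(n))) return std::nullopt;
    return payload;
}

// Usage sketch: round-trip through an in-memory stream instead of a device.
inline std::optional<std::string> round_trip(const std::string& payload) {
    std::stringstream s;
    write_frame(s, payload);
    return read_frame(s, 1u << 20);
}
```

The max_len check matters on a raw device: whatever bytes were there before will happily parse as a length, so the reader needs some bound (or a magic number or checksum, which this sketch omits) to reject nonsense.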

inetknght answered Sep 22 '22 13:09