I'd like to give a try at copying the contents of a file over to another one by using memory mapped I/O in Linux via mmap()
. The intention is to check by myself if that's better than using fread()
and fwrite()
and how would it deal with big files (like couple of GiBs for example, since the file is read whole I want to know if I need to have such amount of memory for it).
This is the code I'm working with right now:
// Open original file descriptor:
int orig_fd = open(argv[1], O_RDONLY);
// Check if it was really opened:
if (orig_fd == -1) {
fprintf(stderr, "ERROR: File %s couldn't be opened:\n", argv[1]);
fprintf(stderr, "%d - %s\n", errno, strerror(errno));
exit(EX_NOINPUT);
}
// Idem for the destination file:
int dest_fd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
// Check if it was really opened:
if (dest_fd == -1) {
fprintf(stderr, "ERROR: File %s couldn't be opened:\n", argv[2]);
fprintf(stderr, "%d - %s\n", errno, strerror(errno));
// Close original file descriptor too:
close(orig_fd);
exit(EX_CANTCREAT);
}
// Acquire file size:
struct stat info = {0};
if (fstat(orig_fd, &info)) {
fprintf(stderr, "ERROR: Couldn't get info on %s:\n", argv[1]);
fprintf(stderr, "%d - %s\n", errno, strerror(errno));
// Close file descriptors:
close(orig_fd);
close(dest_fd);
exit(EX_IOERR);
}
// Set destination file size:
if (ftruncate(dest_fd, info.st_size)) {
fprintf(stderr, "ERROR: Unable to set %s file size:\n", argv[2]);
fprintf(stderr, "%d - %s\n", errno, strerror(errno));
// Close file descriptors:
close(orig_fd);
close(dest_fd);
exit(EX_IOERR);
}
// Map original file and close its descriptor:
char *orig = mmap(NULL, info.st_size, PROT_READ, MAP_PRIVATE, orig_fd, 0);
if (orig == MAP_FAILED) {
fprintf(stderr, "ERROR: Mapping of %s failed:\n", argv[1]);
fprintf(stderr, "%d - %s\n", errno, strerror(errno));
// Close file descriptors:
close(orig_fd);
close(dest_fd);
exit(EX_IOERR);
}
close(orig_fd);
// Map destination file and close its descriptor:
char *dest = mmap(NULL, info.st_size, PROT_WRITE, MAP_SHARED, dest_fd, 0);
if (dest == MAP_FAILED) {
fprintf(stderr, "ERROR: Mapping of %s failed:\n", argv[2]);
fprintf(stderr, "%d - %s\n", errno, strerror(errno));
// Close file descriptors and unmap first file:
munmap(orig, info.st_size);
close(dest_fd);
exit(EX_IOERR);
}
close(dest_fd);
// Copy file contents:
int i = info.st_size;
char *read_ptr = orig, *write_ptr = dest;
while (--i) {
*write_ptr++ = *read_ptr++;
}
// Unmap files:
munmap(orig, info.st_size);
munmap(dest, info.st_size);
I think it may be a way of doing it but I keep getting an error trying to map the destination file, concretely code 13 (permission denied).
I don't have a clue on why is it failing, I can write to that file since the file gets created and all and the file I'm trying to copy is just a couple of KiBs in size.
Can anybody spot the problem? How come I had permission to map the original file but not the destination one?
NOTE: If anyone is to use the loop to copy bytes posted in the question instead of memcpy
for example, the loop condition should be i--
instead to copy all contents. Thanks to jxh for spotting that.
The mmap() function establishes a mapping between a process' address space and a stream file. The address space of the process from the address returned to the caller, for a length of len, is mapped onto a stream file starting at offset off.
Yes, mmap creates a mapping. It does not normally read the entire content of whatever you have mapped into memory. If you wish to do that you can use the mlock/mlockall system call to force the kernel to read into RAM the content of the mapping, if applicable.
mmap is great if you have multiple processes accessing data in a read only fashion from the same file, which is common in the kind of server systems I write. mmap allows all those processes to share the same physical memory pages, saving a lot of memory.
The mmap() function asks to map 'length' bytes starting at offset 'offset' from the file (or other object) specified by the file descriptor fd into memory, preferably at address 'start'. Sepcifically, for the last argument: 'offset' should be a multiple of the page size as returned by getpagesize(2).
From the mmap()
man page:
EACCES
A file descriptor refers to a non-regular file. Or MAP_PRIVATE was requested, but fd is not open for reading. Or MAP_SHARED was requested and PROT_WRITE is set, but fd is not open in read/write (O_RDWR) mode. Or PROT_WRITE is set, but the file is append-only.
You are opening your destination file with O_WRONLY
. Use O_RDWR
instead.
Also, you should use memcpy
to copy the memory rather than using your own loop:
memcpy(dest, orig, info.st_size);
Your loop has an off by 1 bug.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With