
Optimize my read() loop in C (two loops in one)

Tags: c++, c, linux, unix

I need to read two files and store their contents in mainbuff1 and mainbuff2.

I may only use system calls such as open(), read(), write(), etc.

I don't want to store the data on the stack: what if the files are very large? Heap allocation is better.

This code works:

...
    char charbuf;
    char *mainbuff1=malloc(100);
    char *mainbuff2=malloc(100);
    /* read() returns 1 per byte read, 0 at end of file, -1 on error */
    while (read(file1, &charbuf, 1)>0)
        mainbuff1[len++]=charbuf;
    while (read(file2, &charbuf, 1)>0)
        mainbuff2[len2++]=charbuf;
...

But each buffer holds only 100 chars. A better solution is to allocate the buffers after counting the chars in each file, like this:

...
    char charbuf;
    /* first pass: only count the bytes (stop on end of file or error) */
    while (read(file1, &charbuf, 1)>0)
        len++;
    while (read(file2, &charbuf, 1)>0)
        len2++;
    char *mainbuff1=malloc(len);
    char *mainbuff2=malloc(len2);
...

and then repeat the while loops to read the bytes into the buffers.

But two passes (the first reads and counts, the second reads again) will be inefficient and slow for large files. I need to do it in one pass, or find some other more efficient approach. Please help, I have no idea how!

Alex Zern asked Nov 30 '22 21:11


2 Answers

You can use fstat to get the file size instead of reading twice.

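#include <fcntl.h>     /* open */
#include <stdlib.h>    /* malloc */
#include <sys/stat.h>  /* fstat, struct stat */

int main() {
    struct stat sbuf;
    int fd = open("filename", O_RDONLY);  /* reading only, so O_RDWR is not needed */
    if (fd == -1 || fstat(fd, &sbuf) == -1)
        return 1;
    /* st_size is the file size in bytes; +1 leaves room for a terminating '\0' */
    char *buf = malloc(sbuf.st_size + 1);
    if (buf == NULL)
        return 1;
    /* ... read the file into buf here ... */
}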

But really, the time to worry about efficiency is once the working code proves too slow.
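
To show how the fstat() idea replaces the counting pass, here is a minimal sketch that sizes the buffer from st_size and then fills it with one read() loop. The helper name read_whole_file and its out_len parameter are my own, not from the question, and it assumes a regular file (for a pipe or terminal st_size is not meaningful, so the sketch just returns NULL).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns a malloc'd, NUL-terminated copy of the file,
   or NULL on error. The caller frees the result. */
char *read_whole_file(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    struct stat sb;
    if (fstat(fd, &sb) == -1 || !S_ISREG(sb.st_mode)) {
        close(fd);
        return NULL;
    }

    size_t size = (size_t)sb.st_size;
    char *buf = malloc(size + 1);       /* +1 for the terminating '\0' */
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    size_t got = 0;
    while (got < size) {
        /* ask for everything that is left; read() may return less */
        ssize_t n = read(fd, buf + got, size - got);
        if (n == -1) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            free(buf);
            close(fd);
            return NULL;
        }
        if (n == 0)
            break;                      /* file shrank after fstat() */
        got += (size_t)n;
    }

    close(fd);
    buf[got] = '\0';
    *out_len = got;
    return buf;
}
```

You would call it once per file, e.g. mainbuff1 = read_whole_file("file1", &len), and free() each buffer when done. The loop is there because read() is allowed to return fewer bytes than requested.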

luser droog answered Dec 03 '22 11:12


If this is indeed a place where optimizations are needed, then what you really should optimize is the following two things:

  • buffer allocation
  • number of calls to read() and write()

For small buffers of 100 to 1000 bytes, there's no reason to use malloc() and the like; just allocate the buffer on the stack, which is going to be the fastest option. The exception is when you want to return pointers to these buffers from the function, in which case you probably should use malloc(). Alternatively, consider using global/static arrays instead of dynamically allocated ones.
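
As an illustration of the stack-buffer case, here is a rough sketch of a copy routine whose scratch buffer lives on the stack because it never leaves the function. The name copy_fd and the 1000-byte size are assumptions of mine, picked only to match the range mentioned above.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copies everything from in_fd to out_fd.
   Returns the total number of bytes copied, or -1 on error. */
long copy_fd(int in_fd, int out_fd)
{
    char buf[1000];     /* small scratch buffer on the stack, never returned */
    long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return total;           /* end of input */

        ssize_t off = 0;
        while (off < n) {           /* write() may accept fewer bytes than asked */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w == -1) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

No malloc() or free() is involved, so there is nothing to leak; the trade-off is that the data is gone once the function returns.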

As for the I/O calls, call read() and write() with the entire buffer size. Don't call them to read or write single bytes: every call is a transition to the kernel and back, and those transitions have a cost.
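
For the case in the question (contents must end up on the heap, size not known up front), one possible single-pass sketch is to read in large chunks and double the buffer whenever it fills. The name read_all_fd and the 64 KiB CHUNK value are my assumptions, not something from the question, and the right chunk size depends on your workload.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK 65536     /* assumed initial capacity; tune as needed */

/* Hypothetical helper: reads fd until end of file into a malloc'd buffer.
   Returns the buffer (not NUL-terminated) or NULL on error; caller frees. */
char *read_all_fd(int fd, size_t *out_len)
{
    size_t cap = CHUNK, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        if (len == cap) {
            /* buffer is full: double the capacity */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }

        /* one call asks for all the free space, not a single byte */
        ssize_t n = read(fd, buf + len, cap - len);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;              /* end of file */
        len += (size_t)n;
    }

    *out_len = len;
    return buf;
}
```

Unlike sizing the buffer with fstat(), this also works for pipes and terminals, where the size cannot be known in advance. Doubling keeps the number of realloc() calls logarithmic in the file size.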

Further, if you expect to work with fairly large files in RAM, consider using file mapping (mmap()), which lets the kernel page the file in on demand instead of copying it into your buffer.
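
A file mapping could look roughly like the sketch below. The helper name map_file is hypothetical, and it assumes a non-empty regular file that you only need to read.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps the file read-only and stores its size in
   *out_len. Returns NULL on error or for an empty file.
   Release the mapping with munmap((void *)ptr, *out_len). */
const char *map_file(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    struct stat sb;
    if (fstat(fd, &sb) == -1 || sb.st_size == 0) {
        close(fd);
        return NULL;            /* a zero-length mapping is not allowed */
    }

    void *p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;

    *out_len = (size_t)sb.st_size;
    return p;
}
```

There is no read() loop and no malloc() here: pages are loaded when they are first touched. Note that the mapped bytes are not NUL-terminated, so always use the returned length.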

Alexey Frunze answered Dec 03 '22 10:12