
Correct way to read a text file into a buffer in C? [duplicate]

Tags: c, input, buffer

I'm dealing with small text files that I want to read into a buffer while I process them, so I've come up with the following code:

    ...
    char source[1000000];

    FILE *fp = fopen("TheFile.txt", "r");
    if(fp != NULL)
    {
        while((symbol = getc(fp)) != EOF)
        {
            strcat(source, &symbol);
        }
        fclose(fp);
    }
    ...

Is this the correct way of putting the contents of the file into the buffer, or am I abusing strcat()?

I then iterate through the buffer thus:

    for(int x = 0; (c = source[x]) != '\0'; x++)
    {
        //Process chars
    }
Gary Willoughby asked Jan 08 '10 16:01




2 Answers

    char source[1000000];

    FILE *fp = fopen("TheFile.txt", "r");
    if(fp != NULL)
    {
        while((symbol = getc(fp)) != EOF)
        {
            strcat(source, &symbol);
        }
        fclose(fp);
    }

There are quite a few things wrong with this code:

  1. It is very slow: you read the file one character at a time, and strcat() has to rescan the buffer to find its end on every call.
  2. If the file is larger than sizeof(source) - 1 bytes, this overflows the buffer.
  3. Really, when you look at it more closely, this code should not work at all. As stated in the man pages:

The strcat() function appends a copy of the null-terminated string s2 to the end of the null-terminated string s1, then add a terminating `\0'.

You are appending a character (not a NUL-terminated string!) to a string that may or may not be NUL-terminated. The only time I can imagine this working according to the man-page description is if every character in the file is NUL-terminated, in which case this would be rather pointless. So yes, this is most definitely a terrible abuse of strcat().
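To illustrate the point about strcat(), here is a minimal sketch of the same character-at-a-time loop written without it. The helper name read_chars and its parameters are hypothetical, not from the question. It assumes the caller supplies the buffer and its capacity, stores each character at a tracked index, and reads into an int so that EOF can be told apart from real data.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the question): copies at most cap - 1
 * characters from fp into buf and always NUL-terminates the result.
 * Returns the number of characters stored. */
size_t read_chars(FILE *fp, char *buf, size_t cap)
{
    assert(fp != NULL && buf != NULL && cap > 0);

    size_t len = 0;
    int symbol; /* int, not char, so EOF is distinguishable from data */

    /* Check the remaining room first so no character is consumed
     * from the stream unless there is space to store it. */
    while (len < cap - 1 && (symbol = getc(fp)) != EOF) {
        buf[len++] = (char)symbol; /* direct store; nothing is rescanned */
    }
    buf[len] = '\0';
    return len;
}
```

This is still one getc() call per character, so it addresses the correctness problems rather than the speed concern.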

The following are two alternatives to consider using instead.

If you know the maximum buffer size ahead of time:

    #include <stdio.h>

    #define MAXBUFLEN 1000000

    /* File scope keeps this 1 MB array off the stack. */
    char source[MAXBUFLEN + 1];

    int main(void)
    {
        FILE *fp = fopen("foo.txt", "r");
        if (fp != NULL) {
            /* Read at most MAXBUFLEN bytes, leaving room for the NUL. */
            size_t newLen = fread(source, sizeof(char), MAXBUFLEN, fp);
            if (ferror(fp) != 0) {
                fputs("Error reading file\n", stderr);
            } else {
                source[newLen] = '\0'; /* Just to be safe. */
            }

            fclose(fp);
        }
        return 0;
    }

Or, if you do not:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        char *source = NULL;
        FILE *fp = fopen("foo.txt", "r");
        if (fp != NULL) {
            /* Go to the end of the file. */
            if (fseek(fp, 0L, SEEK_END) == 0) {
                /* Get the size of the file. */
                long bufsize = ftell(fp);

                /* Allocate our buffer to that size, plus one for the NUL. */
                if (bufsize != -1) {
                    source = malloc(sizeof(char) * ((size_t)bufsize + 1));
                }

                /* Go back to the start of the file and read it all into memory. */
                if (source != NULL && fseek(fp, 0L, SEEK_SET) == 0) {
                    size_t newLen = fread(source, sizeof(char), (size_t)bufsize, fp);
                    if (ferror(fp) != 0) {
                        fputs("Error reading file\n", stderr);
                    } else {
                        source[newLen] = '\0'; /* Just to be safe. */
                    }
                }
            }
            fclose(fp);
        }

        free(source); /* Don't forget to call free() later! */
        return 0;
    }
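A related sketch, in case the size cannot be obtained up front (for example, if the stream is not seekable and fseek() fails): read in chunks and grow the buffer as you go. The helper name read_stream, the out_len parameter, and the 4096-byte starting capacity are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads fp to end-of-file into a malloc'd,
 * NUL-terminated buffer. Returns NULL on read or allocation failure.
 * The caller owns the result and must free() it. */
char *read_stream(FILE *fp, size_t *out_len)
{
    assert(fp != NULL);

    size_t cap = 4096; /* arbitrary starting capacity (an assumption) */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) return NULL;

    for (;;) {
        /* Always leave one byte free for the terminating NUL. */
        len += fread(buf + len, 1, cap - 1 - len, fp);
        if (ferror(fp)) { free(buf); return NULL; }
        if (feof(fp)) break;

        if (len == cap - 1) { /* buffer full: double the capacity */
            if (cap > SIZE_MAX / 2) { free(buf); return NULL; }
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
    }

    buf[len] = '\0';
    if (out_len != NULL) *out_len = len;
    return buf;
}
```

Doubling the capacity keeps the number of realloc() calls logarithmic in the file size, at the cost of some unused space at the end of the buffer.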
Michael answered Sep 20 '22 00:09


Yes - you would probably be arrested for your terrible abuse of strcat!

Take a look at fgets() (or POSIX getline()). Both read the data a line at a time: fgets() takes an explicit limit on the number of characters it reads, and getline() grows its buffer as needed, so neither overflows the buffer.

strcat() is relatively slow because it has to search the entire string for the end on every character insertion. You would normally keep a pointer to the current end of the string storage and pass that to fgets() as the position to read the next line into.
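The end-pointer idea can be sketched roughly as follows. This version assumes the standard fgets(), since it takes an explicit size limit and a destination pointer. The helper name read_lines and its parameters are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends lines from fp into buf (capacity cap),
 * tracking the end of the stored text so earlier data is never rescanned.
 * Returns the number of characters stored; buf is always NUL-terminated. */
size_t read_lines(FILE *fp, char *buf, size_t cap)
{
    assert(fp != NULL && buf != NULL && cap > 0);

    char *end = buf;   /* current end of the stored text */
    size_t room = cap; /* bytes still available, including the NUL */
    buf[0] = '\0';

    while (room > 1) {
        /* fgets() takes an int size, so clamp very large buffers. */
        int chunk = room > (size_t)INT_MAX ? INT_MAX : (int)room;
        if (fgets(end, chunk, fp) == NULL) break; /* EOF or read error */

        size_t n = strlen(end); /* length of the new piece only */
        end += n;
        room -= n;
    }

    *end = '\0'; /* re-terminate in case a failed read disturbed the tail */
    return (size_t)(end - buf);
}
```

Each strlen() call here only measures the piece just read, which is the difference from strcat() walking the whole buffer every time.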

Martin Beckett answered Sep 23 '22 00:09