I need to read a line of text (terminated by a newline) without making assumptions about its length. So I now face two possibilities:
- Use fgets, check each time whether the last character is a newline, and keep appending to a buffer.
- Use fgetc and occasionally realloc the buffer.
Intuition tells me the fgetc variant might be slower, but then again I don't see how fgets can do it without examining every character (also, my intuition isn't always that good). The lines are quite large, so performance is important.
I would like to know the pros and cons of each approach. Thank you in advance.
If you can set a maximum line length, even a large one, then a single fgets call will do the trick. If not, multiple fgets calls will still be faster than multiple fgetc calls, because fgetc pays its per-call overhead once for every character, while fgets pays it once per chunk.
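The bounded case can be sketched as follows. This is only an illustration: read_bounded_line is a hypothetical helper name, not a library function, and the caller is assumed to pick the buffer size. The one thing to watch is that fgets silently stops when the buffer is full, so the helper reports whether a newline was actually stored.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of any library): reads at most size-1
 * bytes of one line into buf.
 * Returns  1 if a complete line (ending in '\n') was stored,
 *          0 if no newline was stored (the line did not fit, or it is a
 *            final line without a trailing newline),
 *         -1 on end of file or read error with nothing stored. */
int read_bounded_line(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL);
    assert(size >= 2 && size <= INT_MAX);

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    return (len > 0 && buf[len - 1] == '\n') ? 1 : 0;
}
```

A caller would typically declare something like `char buf[65536];` (the limit is an assumption, choose one that suits the data) and treat a return value of 0 as "line too long" unless the end of the file has been reached.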
The fgetc() function returns a single character from an open file. Note: This function is slow and should not be used on large files. If you need to read one character at a time from a large file, use fgets() to read data one line at a time and then process the line one single character at a time with fgetc().
The fgets function reads characters from the stream stream up to and including a newline character and stores them in the string s , adding a null character to mark the end of the string.
Reads bytes from a stream pointed to by stream into an array pointed to by string, starting at the position indicated by the file position indicator. Reading continues until the number of characters read is equal to n-1, or until a newline character ( \n ), or until the end of the stream, whichever comes first.
I suggest using fgets() coupled with dynamic memory allocation - or you can investigate the interface to getline() that is in the POSIX 2008 standard and available on more recent Linux machines. That does the memory allocation stuff for you. You need to keep tabs on the buffer length as well as its address - so you might even create yourself a structure to handle the information.
Although fgetc() also works, it is marginally fiddlier - but only marginally so. Underneath the covers, it uses the same mechanisms as fgets(). Internally, however, fgets() may be able to exploit faster operations - analogous to strchr() - that are not available when you call fgetc() directly.
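A minimal sketch of the fgets() plus dynamic allocation approach is below. The name read_line_fgets, the starting capacity of 128 and the doubling strategy are all assumptions made for illustration. It also assumes the input contains no embedded NUL bytes, since strlen() is used to find how much each fgets() call stored.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp.
 * Returns a malloc'd string (newline included, if present) that the
 * caller must free, or NULL at end of file, on read error, or if
 * memory runs out. Assumes the input has no embedded NUL bytes. */
char *read_line_fgets(FILE *fp)
{
    assert(fp != NULL);

    size_t cap = 128;               /* arbitrary starting capacity */
    size_t len = 0;                 /* bytes collected so far */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        size_t room = cap - len;    /* always at least 2 here */
        if (room > INT_MAX)
            room = INT_MAX;         /* fgets takes an int size */

        /* Append the next chunk after what has been read already. */
        if (fgets(buf + len, (int)room, fp) == NULL)
            break;                  /* end of file or error */

        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
            return buf;             /* complete line */

        /* No newline yet: double the buffer and keep reading. */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }

    /* A final line without a trailing newline is still a line. */
    if (len > 0 && !ferror(fp))
        return buf;

    free(buf);
    return NULL;
}
```

Doubling the capacity keeps the number of realloc() calls logarithmic in the line length, which matters when the lines are large.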
Does your environment provide the getline(3) function? If so, I'd say go for that.
The big advantage I see is that it allocates the buffer itself (if you want), and will realloc() the buffer you pass in if it's too small. (So this means you need to pass in something gotten from malloc().)
This gets rid of some of the pain of fgets/fgetc, and you can hope that whoever wrote the C library that implements it took care of making it efficient.
Bonus: the man page on Linux has a nice example of how to use it in an efficient manner.
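For reference, a typical getline() loop looks roughly like this. It is a sketch rather than the man page example: longest_line is a hypothetical helper invented for illustration, and the feature-test macro is there because getline() is a POSIX function, not part of ISO C.

```c
#define _POSIX_C_SOURCE 200809L   /* expose getline() in <stdio.h> */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: reads fp line by line with getline() and
 * returns the length of the longest line in bytes, newline included.
 * If line_count is not NULL, the number of lines is stored there. */
size_t longest_line(FILE *fp, size_t *line_count)
{
    assert(fp != NULL);

    char *line = NULL;   /* NULL plus a capacity of 0 asks getline to allocate */
    size_t cap = 0;
    ssize_t n;
    size_t longest = 0;
    size_t count = 0;

    /* The same buffer is reused, and grown when needed, on every call. */
    while ((n = getline(&line, &cap, fp)) != -1) {
        count++;
        if ((size_t)n > longest)
            longest = (size_t)n;
    }

    free(line);          /* the buffer belongs to the caller */
    if (line_count != NULL)
        *line_count = count;
    return longest;
}
```

Because one buffer is reused across calls, the allocation cost is paid only when a line is longer than anything seen so far.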