
C: fgets versus fgetc for reading a line

Tags: c, io, stdio, fgets, fgetc

I need to read a line of text (terminated by a newline) without making assumptions about its length. So I now face two possibilities:

  • Use fgets and check each time if the last character is a newline and continuously append to a buffer
  • Read each character using fgetc and occasionally realloc the buffer

Intuition tells me the fgetc variant might be slower, but then again I don't see how fgets can do it without examining every character (also, my intuition isn't always that good). The lines are quite long, so performance is important.

I would like to know the pros and cons of each approach. Thank you in advance.

nc3b asked Mar 03 '11 20:03


2 Answers

I suggest using fgets() coupled with dynamic memory allocation - or you can investigate the interface to getline() that is in the POSIX 2008 standard and available on more recent Linux machines. That does the memory allocation stuff for you. You need to keep tabs on the buffer length as well as its address - so you might even create yourself a structure to handle the information.
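As a rough illustration of the fgets() plus dynamic allocation idea, here is a minimal sketch. The names `struct line_buf` and `read_line_fgets` are hypothetical (they are not from any library), the starting size of 128 and the doubling policy are arbitrary choices, and the use of strlen() means a line containing embedded NUL bytes would be measured incorrectly.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: keeps the buffer address and its capacity
 * together, as suggested above. Initialise it to {NULL, 0}. */
struct line_buf {
    char  *data;  /* malloc'd storage, or NULL before first use */
    size_t cap;   /* number of bytes allocated for data */
};

/* Reads one line from fp into lb, keeping the trailing '\n' if present.
 * Returns the line length, or -1 at EOF with nothing read or on
 * allocation failure. Assumes the line has no embedded NUL bytes. */
long read_line_fgets(struct line_buf *lb, FILE *fp)
{
    size_t len = 0;

    assert(lb != NULL && fp != NULL);

    if (lb->data == NULL) {
        lb->cap = 128;                    /* arbitrary starting size */
        lb->data = malloc(lb->cap);
        if (lb->data == NULL)
            return -1;
    }

    for (;;) {
        size_t room = lb->cap - len;      /* free tail of the buffer */
        if (room > INT_MAX)
            room = INT_MAX;               /* fgets takes an int size */

        if (fgets(lb->data + len, (int)room, fp) == NULL)
            break;                        /* EOF or read error */

        len += strlen(lb->data + len);
        if (len > 0 && lb->data[len - 1] == '\n')
            return (long)len;             /* got a complete line */

        if (len + 1 == lb->cap) {         /* buffer full: double it */
            size_t new_cap = lb->cap * 2;
            char *tmp = realloc(lb->data, new_cap);
            if (tmp == NULL)
                return -1;
            lb->data = tmp;
            lb->cap = new_cap;
        }
    }
    /* A final line without a newline is still returned. */
    return len > 0 ? (long)len : -1;
}
```

A caller would declare `struct line_buf lb = {NULL, 0};`, call `read_line_fgets(&lb, fp)` in a loop until it returns -1, and finally `free(lb.data)`. The same buffer is reused across lines, so it only grows when a longer line shows up.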

Although fgetc() also works, it is marginally fiddlier - but only marginally so. Underneath the covers, it uses the same mechanisms as fgets(). The internals of fgets() may be able to exploit faster operations - analogous to strchr() - that are not available when you call fgetc() directly.
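For comparison, the fgetc() variant might look roughly like the sketch below. The function name `read_line_fgetc` is hypothetical, and the initial capacity of 128 with doubling on growth is just one possible policy. The stream is still buffered by stdio, so the extra cost compared with fgets() is the per-character function call rather than extra system calls.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads one line from fp into *buf (keeping the trailing '\n' if present),
 * growing the buffer with realloc() as needed. *buf must be NULL (with
 * *cap == 0) or a pointer obtained from malloc()/realloc().
 * Returns the line length, or -1 at EOF with nothing read or on
 * allocation failure. */
long read_line_fgetc(char **buf, size_t *cap, FILE *fp)
{
    size_t len = 0;
    int c;

    assert(buf != NULL && cap != NULL && fp != NULL);

    while ((c = fgetc(fp)) != EOF) {
        /* Keep room for this character plus the terminating NUL. */
        if (len + 2 > *cap) {
            size_t new_cap = (*cap == 0) ? 128 : *cap * 2;
            char *tmp = realloc(*buf, new_cap);
            if (tmp == NULL)
                return -1;
            *buf = tmp;
            *cap = new_cap;
        }
        (*buf)[len++] = (char)c;
        if (c == '\n')
            break;                 /* end of line reached */
    }

    if (len == 0)
        return -1;                 /* EOF (or error) before any character */

    (*buf)[len] = '\0';
    return (long)len;
}
```

Usage mirrors getline(): start with `char *buf = NULL; size_t cap = 0;`, call `read_line_fgetc(&buf, &cap, fp)` until it returns -1, then `free(buf)`.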

Jonathan Leffler answered Sep 29 '22 23:09


Does your environment provide the getline(3) function? If so, I'd say go for that.

The big advantage I see is that it allocates the buffer itself (if you want), and will realloc() the buffer you pass in if it's too small. (This means the buffer you pass in must have come from malloc(), or be NULL with a size of 0.)
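A minimal sketch of that usage pattern, assuming a POSIX.1-2008 system: the helper name `longest_line` is made up for illustration, and what it computes (the length of the longest line) is only there to give the loop something to do.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX.1-2008 getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical example: returns the length (including the newline)
 * of the longest line in fp, reading with getline(). */
size_t longest_line(FILE *fp)
{
    char   *line = NULL;   /* NULL + 0 lets getline() allocate for us */
    size_t  cap = 0;
    ssize_t n;
    size_t  longest = 0;

    assert(fp != NULL);

    /* getline() reuses and grows the same buffer on every call. */
    while ((n = getline(&line, &cap, fp)) != -1) {
        if ((size_t)n > longest)
            longest = (size_t)n;
    }

    free(line);            /* the caller owns the buffer, even after a failure */
    return longest;
}
```

Because one buffer is reused for every line, allocation only happens when a line is longer than any seen before.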

This gets rid of some of the pain of fgets/fgetc, and you can hope that whoever wrote the C library that implements it took care of making it efficient.

Bonus: the man page on Linux has a nice example of how to use it in an efficient manner.

Mat answered Sep 30 '22 00:09