getline() vs. fgets(): Control memory allocation

Tags: c, posix

To read lines from a file, there are the POSIX function getline() and the standard C function fgets() (ignoring the dreaded gets()). Conventional wisdom holds that getline() is preferable to fgets() because it allocates the line buffer as needed.

My question is: Isn’t that dangerous? What if by accident or malicious intent someone creates a 100GB file with no '\n' byte in it – won’t that make my getline() call allocate an insane amount of memory?

asked May 03 '19 12:05

edavid

1 Answer

> My question is: Isn’t that dangerous? What if by accident or malicious intent someone creates a 100GB file with no '\n' byte in it – won’t that make my getline() call allocate an insane amount of memory?

Yes, what you describe is a plausible risk. However,

- if the program needs an entire line in memory at once, then letting getline() attempt that is not inherently riskier than writing your own code to do it with fgets(); and
- if a program has such a vulnerability, you can mitigate the risk by using setrlimit() to limit the total amount of (virtual) memory it can reserve. An oversized allocation then fails instead of succeeding and interfering with the rest of the system.
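
The setrlimit() mitigation could look roughly like the sketch below. It assumes a system where RLIMIT_AS is supported (Linux, for example); the helper name limit_address_space and the 1 GiB figure in the note that follows are illustrative choices, not anything prescribed by POSIX or by the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>

/*
 * Hypothetical helper: lower the soft address-space limit of the calling
 * process to at most max_bytes. Returns 0 on success, -1 on failure.
 */
int limit_address_space(rlim_t max_bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;

    /* The soft limit may not exceed the hard limit, so clamp the request. */
    if (rl.rlim_max != RLIM_INFINITY && max_bytes > rl.rlim_max)
        max_bytes = rl.rlim_max;

    rl.rlim_cur = max_bytes;   /* the hard limit is left unchanged */
    return setrlimit(RLIMIT_AS, &rl);
}
```

Calling limit_address_space((rlim_t)1 << 30) early in main() would cap the process at roughly 1 GiB of address space. A getline() call that needs more than that should then return -1 with errno set to ENOMEM, which the program can handle like any other read failure.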

Best overall, I'd argue, is to write code that does not require input in units of full lines (all at once) in the first place, but such an approach has its own complexities.
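
As a minimal sketch of that last idea, the function below walks a stream in fixed-size chunks and never holds a whole line in memory, so its memory use stays constant even for a file with no '\n' at all. The task (counting lines and recording the longest line length), the struct line_stats type, and the scan_lines name are all assumptions made up for illustration; real programs usually need more state carried across chunk boundaries, which is where the extra complexity comes from.

```c
#include <stdio.h>

/* Hypothetical result type for this sketch. */
struct line_stats {
    size_t lines;    /* number of lines seen */
    size_t longest;  /* length of the longest line, excluding '\n' */
};

/*
 * Scan fp in 4 KiB chunks. Only the running length of the current line is
 * carried from one chunk to the next. Returns 0 on success, -1 on a read error.
 */
int scan_lines(FILE *fp, struct line_stats *out)
{
    char buf[4096];
    size_t n;
    size_t current = 0;  /* length of the line currently being scanned */

    out->lines = 0;
    out->longest = 0;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (size_t i = 0; i < n; i++) {
            if (buf[i] == '\n') {
                if (current > out->longest)
                    out->longest = current;
                out->lines++;
                current = 0;
            } else {
                current++;
            }
        }
    }

    /* Count a final line that is not terminated by '\n'. */
    if (current > 0) {
        if (current > out->longest)
            out->longest = current;
        out->lines++;
    }

    return ferror(fp) ? -1 : 0;
}
```

A caller would open the file, pass the FILE pointer and a struct line_stats to scan_lines(), and check the return value before using the counts.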

answered Sep 22 '22 05:09

John Bollinger
