
Can an expression be too long in a C program for gcc to compile?

Let's say I put a very long equation on a single line of C code (in either a .c or .h file) that is thousands (perhaps tens of thousands) of characters long; for example

y = (2*(36*pow(x,2)*pow(A[n][j],5)*B[n][j]
  + (several thousand more such expressions) ) ;

(here just take x to be a variable, A, B to be double pointers, etc). Is there a limit on how long a line of code can be in a .c or .h file before, say, the gcc compiler is unable to compile the code correctly? I've read several related discussions about this issue for C#, but not for plain C. I have never received any errors from gcc about lines in my code being too long, but I'd like to be extra sure about this point.

EDIT: In response to some of the below comments, I now realize that I was asking two (I think closely related) questions:

(1) Is there any limit to how long a line can be in C before the gcc compiler could miscompile it or raise an error?

(2) Is there any limit to how complex an expression can be before the gcc compiler could miscompile it or raise an error? (E.g. we could break up a very long line into several lines, but it's all part of the same expression.)

physics_researcher asked Dec 18 '22 03:12



2 Answers

The actual upper limit on how long a line of code can be in a .c or .h file is highly implementation-dependent, but the standard specifies a minimum that every conforming implementation must support. According to C11, §5.2.4.1:

The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:

  • 4095 characters in a logical source line

That said, as Keith mentioned in the other answer, the length of a logical line and the complexity of a statement or expression (the number of operators and operands involved, the types of operations, nested expressions, etc.) are not the same thing. The standard sets separate minimum limits for those too, such as

  • 63 nesting levels of parenthesized expressions within a full expression

  • 511 identifiers with block scope declared in one block

etc.
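
To make "logical source line" concrete: physical lines joined by a trailing backslash are spliced into one logical line early in translation (translation phase 2), whereas an expression that simply continues on the next line without a backslash spans several separate logical lines. A minimal sketch, where POLY, poly and check_spellings are made-up names used only for illustration:

```c
#include <assert.h>

/* One logical source line: each backslash-newline pair is deleted during
   translation phase 2, so the compiler sees a single long #define line. */
#define POLY(x) (1.0 +           \
                 2.0 * (x) +     \
                 3.0 * (x) * (x))

/* One expression over three logical lines: there are no backslashes, so
   each physical line counts separately toward the 4095-character minimum. */
double poly(double x)
{
    return 1.0
         + 2.0 * x
         + 3.0 * x * x;
}

/* Both spellings compute the same value. */
void check_spellings(void)
{
    assert(POLY(2.0) == poly(2.0));
}
```

So the 4095-character figure constrains each spliced line, not the expression as a whole; the expression in poly could grow by thousands of terms while every individual line stays short.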

When a complex expression is evaluated, many intermediate results must be stored temporarily, and in theory this could use up all the available stack space on your system, creating a problem. In practice, that is far-fetched unless the expression is so complex that it cannot be accommodated on today's multi-gigabyte computing systems.


With all that said, the right number of times to write such code is probably zero. As Martin Fowler put it:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

Sourav Ghosh answered Dec 28 '22 23:12



You've asked about two separate things: the maximum length of a line, and the complexity of an expression. An arbitrarily complex expression can easily be split across multiple lines -- as you did in your example.
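
Going one step further than splitting across lines, a long sum can be written as one statement per term, so that no single expression grows large. The following is only a sketch under assumptions: eval_y and its three terms are hypothetical stand-ins for the question's expression, and x * x is used in place of pow(x, 2) so the example does not depend on the math library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the question's expression: the same kind of
   sum, written as one short statement per term. */
double eval_y(double x, const double *a, const double *b, size_t count)
{
    assert(a != NULL && b != NULL);
    assert(count >= 3);          /* this sketch reads three elements */

    double x2 = x * x;           /* pow(x, 2), computed once */
    double y = 0.0;

    y += 36.0 * x2 * a[0] * b[0];
    y += 12.0 * x2 * a[1] * b[1];
    y +=  4.0 * x  * a[2] * b[2];
    /* ...one more line like these for each remaining term... */

    return 2.0 * y;
}
```

Because a + b + c is parsed as (a + b) + c, accumulating the terms in the same order should normally give the same result as the single expression, although targets that evaluate intermediate results in extra precision may differ in the last bits.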

The C standard requires implementations to support at least 4095 characters in a logical source line. The way it expresses that requirement is rather indirect. A compiler must be able to process one program that hits all the specified limits. The rationale is that the standard specifies the requirement in a precise and testable way, but the easiest way to meet the requirement is to avoid imposing any fixed limits at all.

The details are in N1570 5.2.4.1, "Translation limits". The limits in that section most relevant to expressions are 63 nesting levels of parentheses and 127 arguments in a function call -- but you can create an arbitrarily complex expression without hitting either of those limits.

The standard imposes no specific limits on the complexity of an expression. Most compilers, including gcc, will allocate resources (particularly memory) dynamically as they're processing source code. The internal representation of an expression is likely to be a dynamically allocated tree structure, not a fixed-size array.
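
As a rough illustration of what a dynamically allocated tree means here, the sketch below builds the tree for 1 + 2 + ... + n on the heap, one node at a time. It is not gcc's actual representation, which is far richer; Expr, build_sum, eval_expr and free_expr are names invented for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical expression node: a leaf holds a value, any other node is
   a '+' with two operands. */
typedef struct Expr {
    struct Expr *lhs;   /* NULL for a leaf */
    struct Expr *rhs;   /* NULL for a leaf */
    double value;       /* meaningful for leaves only */
} Expr;

static Expr *make(Expr *lhs, Expr *rhs, double value)
{
    Expr *e = malloc(sizeof *e);
    if (e != NULL) { e->lhs = lhs; e->rhs = rhs; e->value = value; }
    return e;
}

/* Frees a tree built by build_sum by walking down the left spine; the
   right operand of every '+' node there is a leaf. */
void free_expr(Expr *e)
{
    while (e != NULL) {
        Expr *next = e->lhs;
        free(e->rhs);
        free(e);
        e = next;
    }
}

/* Builds 1 + 2 + ... + terms, left-associated the way C parses a + b + c.
   The only size limit is available memory; returns NULL when it runs out. */
Expr *build_sum(long terms)
{
    assert(terms > 0);
    Expr *root = make(NULL, NULL, 1.0);
    for (long i = 2; root != NULL && i <= terms; ++i) {
        Expr *leaf = make(NULL, NULL, (double)i);
        Expr *sum = (leaf != NULL) ? make(root, leaf, 0.0) : NULL;
        if (sum == NULL) {       /* out of memory: release everything */
            free(leaf);
            free_expr(root);
            return NULL;
        }
        root = sum;
    }
    return root;
}

/* Recursive evaluation: the recursion depth grows with the number of
   terms, because the tree leans entirely to the left. */
double eval_expr(const Expr *e)
{
    assert(e != NULL);
    if (e->lhs == NULL)
        return e->value;
    return eval_expr(e->lhs) + eval_expr(e->rhs);
}
```

Nothing in build_sum caps the number of terms, which is the point of allocating dynamically. The recursive eval_expr shows the other side: any tool that walks a very deep tree recursively needs stack space proportional to the depth.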

You can probably construct an expression that's too complex for gcc to handle, and it will probably respond either by printing a fatal error message when it's unable to allocate memory, or just by choking with a segmentation fault or something similar. On a modern computer with gigabytes of memory, you'd need a very large expression to trigger such a failure.

You're not going to run into this issue unless you're generating C code automatically, and your generator gets out of hand.
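
If you want to probe this empirically rather than take it on faith, a small generator can write a C file containing one very long single-line expression, which you can then hand to your compiler. This is a sketch under assumptions: emit_long_expression and the shape of the generated function f are invented for illustration, and no particular outcome is implied for any gcc version.

```c
#include <assert.h>
#include <stdio.h>

/* Writes a C function whose return statement is one expression with
   `terms` addends on a single physical line. Returns the length of that
   line in characters, or -1 on a write error. */
long emit_long_expression(FILE *out, long terms)
{
    assert(out != NULL && terms > 0);

    long line_len = 0;
    int n;

    if (fprintf(out, "double f(double x, const double *a)\n{\n") < 0)
        return -1;

    n = fprintf(out, "    return 0.0");
    if (n < 0)
        return -1;
    line_len += n;

    for (long i = 0; i < terms; ++i) {
        /* Each addend looks like " + 7.0*x*a[6]". */
        n = fprintf(out, " + %ld.0*x*a[%ld]", i + 1, i);
        if (n < 0)
            return -1;
        line_len += n;
    }

    n = fprintf(out, ";");
    if (n < 0)
        return -1;
    line_len += n;

    if (fprintf(out, "\n}\n") < 0)
        return -1;
    return line_len;
}
```

Calling it with a few thousand terms on a file opened with fopen produces a line well past the 4095-character minimum; compiling the result with gcc -c at increasing term counts shows where, if anywhere, your particular compiler and machine give up.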

Keith Thompson answered Dec 28 '22 22:12
