complexity of parsing C++

1 Answer

I think the term "parsing" is being interpreted by different people in different ways for the purposes of the question.

In a narrow technical sense, parsing is merely verifying that the source code matches the grammar (or perhaps even building a tree).

There's a rather widespread folk theorem that says you cannot parse C++ (in this sense) at all because you must resolve the meaning of certain symbols to parse. That folk theorem is simply wrong.

It arises from the use of "weak" (LALR or backtracking recursive descent) parsers. When such a parser reaches a locally ambiguous stretch of text (this SO thread discusses an example), it has to commit to one of several possible subparses, and if it commits to the wrong one it fails completely. Those who use such parsers resolve the dilemma by collecting symbol table data as parsing proceeds and mashing extra checks into the parsing process, which forces the parser to make the right choice whenever such a choice comes up. This works, at the cost of significantly tangling name and type resolution with parsing, which makes building such parsers really hard. But, at least for legacy GCC, the parser was LALR, which is linear time on parsing, and I don't think it gets much more expensive if you include the name/type capture that the parser does (there's more to do after parsing, because I don't think the parser does it all).
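To make the local ambiguity concrete, here is a minimal sketch of the classic case (my own illustration, not taken from the linked thread; the namespace and function names are made up for the example). The token sequence `a * b;` appears in both functions, and the only thing that decides between "declaration" and "expression" is what `a` was declared as earlier:

```cpp
#include <cassert>

namespace a_is_a_type {
    struct a {};                 // here `a` names a type
    inline bool demo() {
        a * b;                   // declaration: b is an (uninitialized) pointer to a
        b = nullptr;
        return b == nullptr;
    }
}

namespace a_is_a_value {
    inline int demo() {
        int a = 6, b = 7;        // here `a` names a variable
        a * b;                   // expression statement: multiply and discard the result
        return a * b;
    }
}

// Both readings are accepted by the compiler, each in its own context.
inline void check_both_readings() {
    assert(a_is_a_type::demo());
    assert(a_is_a_value::demo() == 42);
}
```

A compiler will probably warn about the discarded result in the second function; that warning is itself evidence that the statement was read as an expression there.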

There are at least two implementations of C++ parsers done using "pure" GLR parsing technology, which simply admits that the parse may be locally ambiguous and captures the multiple parses without comment or significant overhead. GLR parsers are linear time where there are no local ambiguities. They are more expensive in the ambiguity region, but as a practical matter, most of the source text in a standard C++ program falls into the "linear time" part. So the effective rate is linear, even capturing the ambiguities. Both of the implemented parsers do name and type resolution after parsing and use inconsistencies to eliminate the incorrect ambiguous parses. (The two implementations are Elsa and our (SD's) C++ Front End. I can't speak for Elsa's current capability (I don't think it has been updated in years), but ours does all of C++11 [EDIT Jan 2015: now full C++14; EDIT Oct 2017: now full C++17], including GCC and Microsoft variants.)
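The "parse first, disambiguate later" idea can be modeled in a few lines. This is only a toy under my own assumptions, not a GLR parser and not how Elsa or our front end is actually structured: `Reading`, `parse_star_statement`, and `resolve` are hypothetical names. It covers a single statement shape, `lhs * rhs;`, which can be read either as a pointer declaration or as a multiplication. The parse step keeps both readings without looking at any symbol table, and a later step throws out the reading that is inconsistent with the names that were declared:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// The two readings kept for a statement shaped like "lhs * rhs;".
enum class Reading { PointerDeclaration, Multiplication };

// Step 1 (parse): no symbol table is consulted, so both readings survive.
inline std::vector<Reading> parse_star_statement() {
    return {Reading::PointerDeclaration, Reading::Multiplication};
}

// Step 2 (resolve, after parsing): drop every reading that is inconsistent
// with what the name on the left turned out to be.
inline std::vector<Reading> resolve(const std::vector<Reading>& candidates,
                                    const std::string& lhs,
                                    const std::set<std::string>& type_names) {
    const bool lhs_is_type = type_names.count(lhs) > 0;
    std::vector<Reading> kept;
    for (Reading r : candidates) {
        const bool reading_needs_type = (r == Reading::PointerDeclaration);
        if (reading_needs_type == lhs_is_type) kept.push_back(r);
    }
    return kept;
}

// Usage: the same candidates resolve differently under different symbol tables.
inline void usage_example() {
    const std::vector<Reading> candidates = parse_star_statement();
    const std::set<std::string> a_is_a_type{"a"};
    const std::set<std::string> no_types;

    const std::vector<Reading> as_decl = resolve(candidates, "a", a_is_a_type);
    assert(as_decl.size() == 1 && as_decl[0] == Reading::PointerDeclaration);

    const std::vector<Reading> as_expr = resolve(candidates, "a", no_types);
    assert(as_expr.size() == 1 && as_expr[0] == Reading::Multiplication);
}
```

The point of the separation is that step 1 never needs to know anything about names, so the grammar stays clean and the name/type logic lives in one place.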

If you take the hard computer science definition that a language is extensionally defined as an arbitrary set of strings (grammars are supposed to be succinct ways to encode that intensionally) and treat parsing as "check that the syntax of the program is correct", then for C++ you have to expand the templates to verify that each actually expands completely. There's a Turing machine hiding in the templates, so in theory checking that a C++ program is valid is impossible in general (no time limits). Real compilers (honoring the standard) place fixed constraints on how much template unfolding they'll do, and so does real memory, so in practice C++ compilers finish. But they can take arbitrarily long to "parse" a program in this sense. And I think that's the answer most people care about.
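A small sketch of what "a computation hiding in the templates" looks like (my own example; `CollatzSteps` is a made-up name). The compiler has to run the recursion to completion before it can say whether the `static_assert` lines are valid, and nothing in the grammar tells it in advance how deep the recursion will go:

```cpp
// Number of Collatz steps needed to reach 1, computed entirely by
// template instantiation at compile time.
template <unsigned long long N>
struct CollatzSteps {
    static constexpr unsigned value =
        1 + CollatzSteps<(N % 2 == 0) ? N / 2 : 3 * N + 1>::value;
};

// Base case: the recursion stops at 1.
template <>
struct CollatzSteps<1> {
    static constexpr unsigned value = 0;
};

// 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1 is 8 steps.
static_assert(CollatzSteps<6>::value == 8, "6 reaches 1 in 8 steps");
// 27 needs 111 steps, i.e. over a hundred nested instantiations.
static_assert(CollatzSteps<27>::value == 111, "27 reaches 1 in 111 steps");
```

The fixed constraint mentioned above shows up here as a cap on instantiation depth: a starting value whose chain is longer than the compiler's limit is rejected even though the computation would eventually finish. GCC and Clang let you raise that cap with `-ftemplate-depth`.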

As a practical matter, I'd guess most templates are actually pretty simple, so C++ compilers can finish as fast as other compilers on average. Only people crazy enough to write Turing machines in templates pay a serious price. (Opinion: the price is really the conceptual cost of shoehorning complicated things onto templates, not the compiler execution cost.)

141

answered Sep 29 '22 11:09

Ira Baxter



Tags:

c++

big-o

parsing

theory

compiler-theory
