Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tool for model checking large, distributed C++ projects such as KDE?

Is there a tool which can handle model checking large, real-world, mostly-C++, distributed systems, such as KDE?

(KDE is a distributed system in the sense that it uses IPC, although typically all of the processes are on the same machine. Yes, by the way, this is a valid usage of "distributed system" - check Wikipedia.)

The tool would need to be able to deal with intraprocess events and inter-process messages.

(Let's assume that if the tool supports C++, but doesn't support other stuff that KDE uses such as moc, we can hack something together to workaround that.)

I will happily accept less general (e.g. static analysers specialised for finding specific classes of bugs) or more general static analysis alternatives, in lieu of actual model checkers. But I am only interested in tools that can actually handle projects of the size and complexity of KDE.

like image 882
Robin Green Avatar asked Nov 10 '10 06:11

Robin Green


2 Answers

You're obviously looking for a static analysis tool that can

  • parse C++ on scale
  • locate code fragments of interest
  • extract a model
  • pass that model to a model checker
  • report that result to you

A significant problem is that everybody has a different idea about what model they'd like to check. That alone likely kills your chance of finding exactly what you want, because each model extraction tool has generally made a choice as to what it wants to capture as a model, and the chances that it matches what you want precisely are IMHO close to zero.

You aren't clear on what specifically you want to model, but I presume you want to find the communication primitives and model the process interactions to check for something like deadlock?

The commercial static analysis tool vendors seem like a logical place to look, but I don't think they are there, yet. Coverity would seem like the best bet, but it appears they only have some kind of dynamic analysis for Java threading issues.

This paper claims to do this, but I have not looked at in any detail: Compositional analysis of C/C++ programs with VeriSoft. Related is [PDF] Computer-Assisted Assume/Guarantee Reasoning with VeriSoft. It appears you have to hand-annotate the source code to indicate the modelling elements of interest. The Verifysoft tool itself appears to be proprietary to Bell Labs and is likely hard to obtain.

Similarly this one: Distributed Verification of Multi-threaded C++ Programs .

This paper also makes interesting claims, but doesn't process C++ in spite of the title: Runtime Model Checking of Multithreaded C/C++ Programs.

While all the parts of this are difficult, an issue they all share is parsing C++ (as exemplified by the previously quoted paper) and finding the code patterns that provide the raw information for the model. You also need to parse the specific dialect of C++ you are using; its not nice that the C++ compilers all accept different languages. And, as you have observed, processing large C++ codes is necessary. Model checkers (SPIN and friends) are relatively easy to find.

Our DMS Software Reengineering Toolkit provides for general purpose parsing, with customizable pattern matching and fact extraction, and has a robust C++ Front End that handles many dialects of C++ (EDIT Feb 2019: including C++17 in Ansi, GCC and MS flavors). It could likely be configured to find and extract the facts that correspond to the model you care about. But it doesn't do this this off the shelf.

DMS with its C front end have been used to process extremely large C applications (19,000 compilation units!). The C++ front end has been used in anger on a variety of large-scale C++ projects (EDIT Feb 2019: including large scale refactoring of APIs across 3000+ compilation units). Given DMS's general capability, I think it likely capable of handling fairly big chunks of code. YMMV.

like image 61
Ira Baxter Avatar answered Nov 06 '22 03:11

Ira Baxter


Static code analyzers when used against large code base first time usually produce so many warnings and alerts that you won't be able to analyze all of them in reasonable amount of time. It is hard to single out real problems from code that just look suspicious to a tool.

You can try to use automatic invariant discovery tools like "Daikon" that capture perceived invariants at run time. You can validate later if discovered invariants (equivalence of variables "a == b+1" for example) make sense and then insert permanent asserts into your code. This way when invariant is violated as result of your change you will get a warning that perhaps you broke something by your change. This method helps to avoid restructuring or changing your code to add tests and mocks.

like image 29
Alexei Polkhanov Avatar answered Nov 06 '22 02:11

Alexei Polkhanov