Getting Control Flow Graph from ANSI C code

Tags: c, graph, gcc, ansi

I'm building a tool for testing ANSI C applications. The idea is simple: load the code, view its control flow graph, run a test, and mark every vertex that was hit. I tried to build the CFG myself by parsing the source, but the result gets messed up when the code is nested. GCC can dump the CFG of the code it compiles, and I could write a parser for that output, but I need line numbers in order to set breakpoints. Is there a way to get line numbers when dumping the control flow graph with -fdump-tree-cfg or -fdump-tree-vcg?

Eloar asked May 06 '13 07:05



2 Answers

For the control flow graph of a C program, you could look at existing Python parsers for C:

  • PyCParser
  • pycparser
  • pyclibrary (fork of pyclibrary)
  • joern
  • CoFlo C/C++ control flow graph generator and analyzer

Call graphs are closely related to control flow graphs. Several approaches exist for creating call graphs (function dependencies) for C code, and they may help with control flow graph generation. Ways to create dependency graphs in C:

  • Using cflow:
      • cflow + pycflow2dot + dot (GPL, BSD) cflow is robust, because it can handle code which cannot compile, e.g. missing includes. If preprocessor directives are heavily used, it may need the --cpp option to preprocess the code.
      • cflow + cflow2dot + dot (GPL v2, GPL v3, Eclipse Public License (EPL) v1) (note that cflow2dot needs some path fixing before it works)
      • cflow + cflow2dot.bash (GPL v2, ?)
      • cflow + cflow2vcg (GPL v2, GPL v2)
      • enhanced cflow (GPL v2) with list to exclude symbols from graph
  • Using cscope:
      • cscope (BSD)
      • cscope + callgraphviz + dot + xdot
      • cscope + vim CCTree (C Call-Tree Explorer)
      • cscope + ccglue
      • cscope + CodeQuery for C, C++, Python & Java
      • cscope + Python html producer
      • cscope + calltree.sh
  • ncc (cflow like)
  • KCachegrind (KDE dependency viewer)
  • Calltree

The following tools unfortunately require that the code be compilable, because they depend on output from gcc:

  • CodeViz (GPL v2) (weak point: needs compilable source, because it uses gcc to dump cdepn files)
  • gcc + egypt + dot (GPL v*, Perl = GPL | Artistic license, EPL v1). egypt uses gcc to produce RTL, so it fails for any buggy source code, or even when you just want to focus on a single file from a larger project. It is therefore not very useful compared to the more robust cflow-based toolchains. Note that by default egypt has good support for excluding library calls from the graph, which makes it cleaner.

Also, file dependency graphs for C/C++ can be created with crowfood.
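To give an idea of what the cflow-to-dot converters listed above do, here is a minimal Python sketch. It is not the code of pycflow2dot or cflow2dot. It assumes GNU cflow's default indented output, where each nesting level adds four spaces and every entry starts with `name()`; the function names `cflow_to_edges` and `edges_to_dot` are made up for this illustration.

```python
import re

# Assumed line shape (GNU cflow default output): optional indentation,
# then "name()", optionally followed by "<signature at file:line>".
ENTRY_RE = re.compile(r"^(\s*)([A-Za-z_]\w*)\(\)")


def cflow_to_edges(text, indent_width=4):
    """Return (caller, callee) pairs read from indented cflow output."""
    edges = []
    stack = []  # stack[i] holds the function currently open at depth i
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if not match:
            continue
        depth = len(match.group(1)) // indent_width
        name = match.group(2)
        del stack[depth:]  # close every entry at this depth or deeper
        if stack:
            edges.append((stack[-1], name))
        stack.append(name)
    # Drop repeated edges while keeping first-seen order.
    return list(dict.fromkeys(edges))


def edges_to_dot(edges):
    """Render the edges as a Graphviz digraph."""
    lines = ["digraph callgraph {"]
    for caller, callee in edges:
        lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)
```

The resulting text can be written to a file and rendered with dot. If cflow is run with options that change the output layout, the regular expression and `indent_width` would need adjusting.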

Ioannis Filippidis answered Sep 19 '22 12:09


I did some more research, and it turns out that getting line numbers for the nodes is not hard. Just append the lineno option to either of those flags, i.e. use -fdump-tree-cfg-lineno or -fdump-tree-vcg-lineno. It took me some time to check whether those numbers are reliable. In the VCG format, the label of each node contains two numbers: the line numbers of the start and end of the portion of code represented by that node.
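For the text dump produced by -fdump-tree-cfg-lineno, a rough Python sketch of pulling line numbers per basic block could look like the following. The dump layout is an assumption on my part and differs between GCC versions: the sketch expects `;; Function name` headers, block headers of the form `<bb N>`, and statement prefixes such as `[file.c:12:5]` or `[file.c : 12]`. The name `parse_cfg_dump` is made up for this example, and only explicit `goto <bb N>` targets are collected, so fall-through edges are not recovered.

```python
import re
from collections import defaultdict

# Assumed dump layout; check it against the output of your GCC version.
FUNC_RE = re.compile(r"^;; Function (\S+)")
BLOCK_RE = re.compile(r"^<bb (\d+)>")
# Matches "[file:line]", "[file:line:col]" and "[file : line]".
# File names containing ':' (e.g. Windows drive letters) are not handled.
LOC_RE = re.compile(r"\[([^\[\]:]+?)\s*:\s*(\d+)(?:\s*:\s*\d+)?\]")
GOTO_RE = re.compile(r"goto <bb (\d+)>")


def parse_cfg_dump(text):
    """Map (function, block) to source lines and to explicit goto targets."""
    lines_by_block = defaultdict(set)
    gotos_by_block = defaultdict(set)
    func = None
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        func_match = FUNC_RE.match(line)
        if func_match:
            func, block = func_match.group(1), None
            continue
        block_match = BLOCK_RE.match(line)
        if block_match:
            block = int(block_match.group(1))
            continue
        if block is None:
            continue  # declarations before the first block
        key = (func, block)
        for _path, lineno in LOC_RE.findall(line):
            lines_by_block[key].add(int(lineno))
        for target in GOTO_RE.findall(line):
            gotos_by_block[key].add(int(target))
    lines = {k: sorted(v) for k, v in lines_by_block.items()}
    gotos = {k: sorted(v) for k, v in gotos_by_block.items()}
    return lines, gotos
```

The first and last entry of each line list give a start and end line for the block, which is the kind of range needed for placing breakpoints. The dump itself is written next to the object file when compiling with something like `gcc -c -fdump-tree-cfg-lineno file.c`; the exact dump file name depends on the GCC version.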

Eloar answered Sep 17 '22 12:09