Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Converting C source to C++

How would you go about converting a reasonably large (>300K), fairly mature C codebase to C++?

The kind of C I have in mind is split into files roughly corresponding to modules (i.e. less granular than a typical OO class-based decomposition), using internal linkage in lieu private functions and data, and external linkage for public functions and data. Global variables are used extensively for communication between the modules. There is a very extensive integration test suite available, but no unit (i.e. module) level tests.

I have in mind a general strategy:

  1. Compile everything in C++'s C subset and get that working.
  2. Convert modules into huge classes, so that all the cross-references are scoped by a class name, but leaving all functions and data as static members, and get that working.
  3. Convert huge classes into instances with appropriate constructors and initialized cross-references; replace static member accesses with indirect accesses as appropriate; and get that working.
  4. Now, approach the project as an ill-factored OO application, and write unit tests where dependencies are tractable, and decompose into separate classes where they are not; the goal here would be to move from one working program to another at each transformation.

Obviously, this would be quite a bit of work. Are there any case studies / war stories out there on this kind of translation? Alternative strategies? Other useful advice?

Note 1: the program is a compiler, and probably millions of other programs rely on its behaviour not changing, so wholesale rewriting is pretty much not an option.

Note 2: the source is nearly 20 years old, and has perhaps 30% code churn (lines modified + added / previous total lines) per year. It is heavily maintained and extended, in other words. Thus, one of the goals would be to increase mantainability.

[For the sake of the question, assume that translation into C++ is mandatory, and that leaving it in C is not an option. The point of adding this condition is to weed out the "leave it in C" answers.]

like image 307
Barry Kelly Avatar asked Oct 14 '08 00:10

Barry Kelly


2 Answers

Having just started on pretty much the same thing a few months ago (on a ten-year-old commercial project, originally written with the "C++ is nothing but C with smart structs" philosophy), I would suggest using the same strategy you'd use to eat an elephant: take it one bite at a time. :-)

As much as possible, split it up into stages that can be done with minimal effects on other parts. Building a facade system, as Federico Ramponi suggested, is a good start -- once everything has a C++ facade and is communicating through it, you can change the internals of the modules with fair certainty that they can't affect anything outside them.

We already had a partial C++ interface system in place (due to previous smaller refactoring efforts), so this approach wasn't difficult in our case. Once we had everything communicating as C++ objects (which took a few weeks, working on a completely separate source-code branch and integrating all changes to the main branch as they were approved), it was very seldom that we couldn't compile a totally working version before we left for the day.

The change-over isn't complete yet -- we've paused twice for interim releases (we aim for a point-release every few weeks), but it's well on the way, and no customer has complained about any problems. Our QA people have only found one problem that I recall, too. :-)

like image 132
Head Geek Avatar answered Oct 13 '22 00:10

Head Geek


What about:

  1. Compiling everything in C++'s C subset and get that working, and
  2. Implementing a set of facades leaving the C code unaltered?

Why is "translation into C++ mandatory"? You can wrap the C code without the pain of converting it into huge classes and so on.

like image 44
Federico A. Ramponi Avatar answered Oct 13 '22 02:10

Federico A. Ramponi