In all the code I see online, programs are always broken up into many smaller files. For all of my projects for school, though, I've gotten by with just one gigantic C source file that contains all of the structs and functions I use.
What I want to learn how to do is split my program up into smaller files, which seems to be the standard professionally. (Why is this, by the way -- is it just for ease of reading?)
I've searched around, and all I can find information on is building libraries, which I don't think is what I want to do. I wish I could be more helpful, but I'm not totally sure how to implement this -- I'm only sure about the end product I want.
is it just for ease of reading?
The main reasons are:
Maintainability: In large, monolithic programs like what you describe, there's a risk that changing code in one part of the file can have unintended effects somewhere else. Back at my first job, we were tasked with speeding up code that drove a 3D graphical display. It was a single, monolithic, 5000+-line main function (not that big in the grand scheme of things, but big enough to be a headache), and every change we made broke an execution path somewhere else. This was badly written code all the way around (gotos galore, literally hundreds of separate variables with incredibly informative names like nv001x, program structure that read like old-school BASIC, micro-optimizations that didn't do anything but make the code that much harder to read, brittle as hell), but keeping it all in one file made the bad situation worse. We eventually gave up and told the customer we'd either have to rewrite the whole thing from scratch, or they'd have to buy faster hardware. They wound up buying faster hardware.
Reusability: There's no point in writing the same code over and over again. If you come up with a generally useful bit of code (like, say, an XML parsing library, or a generic container), keep it in its own separately compiled source files, and simply link it in when necessary.
Testability: Breaking functions out into their own separate modules allows you to test those functions in isolation from the rest of the code; you can verify each individual function more easily (a small sketch follows this list).
Buildability: Okay, so "buildability" isn't a real word, but rebuilding an entire system from scratch every time you change one or two lines can be time-consuming. I've worked on very large systems where complete builds could take upwards of several hours. By breaking up your code, you limit the amount of code that has to be rebuilt. Not to mention that any compiler is going to have some limits on the size of the file it can handle. That graphical driver I mentioned above? The first thing we tried to do to speed it up was to compile it with optimizations turned on (starting with -O1). The compiler ate up all available memory, then it ate all the available swap until the kernel panicked and brought down the entire system. We literally could not build that code with any optimization turned on (this was back in the days when 128 MB was a lot of very expensive memory). Had that code been broken up into multiple files (hell, just multiple functions within the same file), we wouldn't have had that problem.
Parallel Development: There isn't an "ability" word for this, but by breaking source up into multiple files and modules, you can parallelize development. I work on one file, you work on another, someone else works on a third, etc. We don't risk stepping on each other's code that way.
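To make the testability point concrete, here is a minimal sketch; the module, file, and function names are invented purely for illustration. If a small function such as clamp() lives in its own mathutil.c/mathutil.h module, you can compile it together with a tiny test driver and check it without building the rest of the program:

/* test_mathutil.c - build with: gcc -o test_mathutil test_mathutil.c mathutil.c */
#include <assert.h>
#include "mathutil.h"   /* assumed to declare: int clamp(int value, int lo, int hi); */

int main(void) {
    assert(clamp( 5, 0, 10) ==  5);   /* value inside the range is unchanged */
    assert(clamp(-3, 0, 10) ==  0);   /* value below the range is raised to lo */
    assert(clamp(42, 0, 10) == 10);   /* value above the range is lowered to hi */
    return 0;                         /* exits normally if all assertions hold */
}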
Well, that's exactly what you want: split your code into several libraries!
Let's take an example. In one file you have:
#include <stdio.h>

int something() {
    return 42;
}

int bar() {
    return something();
}

void foo(int i) {
    printf("do something with %d\n", i);
}

int main() {
    foo(bar());
    return 0;
}
you can split this up to :
mylib.h:
#ifndef MYLIB_H
#define MYLIB_H

#include <stdio.h>

int bar(void);
void foo(int i);

#endif
N.B.: the preprocessor code above is called an "include guard"; it makes sure the header's contents are only processed once, so the same header can be included from several places without causing redefinition errors.
mylib.c:
#include <mylib.h>

/* helper used only inside this file; it is not declared in mylib.h */
int something() {
    return 42;
}

int bar() {
    return something();
}

void foo(int i) {
    printf("do something with %d\n", i);
}
myprog.c:
#include <mylib.h>

int main() {
    foo(bar());
    return 0;
}
To compile it, you do:
gcc -c mylib.c -I./
gcc -o myprog myprog.c -I./ mylib.o
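As a variation (a sketch of the same build, using the same files), you can also compile myprog.c to an object file first and link the two objects; then, when you later change only mylib.c, you recompile just that one file and relink instead of recompiling everything:

gcc -c mylib.c -I./
gcc -c myprog.c -I./
gcc -o myprog myprog.o mylib.o

and later, after editing only mylib.c:

gcc -c mylib.c -I./
gcc -o myprog myprog.o mylib.o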
Now, the advantages?
is it just for ease of reading?
No, it can also save you a lot of time compiling; when you change one source file, you only recompile that file, then relink, instead of recompiling everything. But the main point is dividing a program into a set of well-separated modules that are easier to understand and maintain than a single monolithic "blob".
For starters, try to adhere to Rob Pike's rule that "data dominates": design your program around a bunch of data structures (structs, usually) with operations on them. Put all the operations that belong to a single data structure into a separate module. Mark all functions that need not be called from outside the module as static.
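A minimal sketch of what that can look like (the stack module below is just an invented example, not code from the question): the data structure and its operations live together in stack.h/stack.c, and an internal helper is kept static so it is invisible outside the module.

stack.h:

#ifndef STACK_H
#define STACK_H

#define STACK_CAPACITY 64

typedef struct {
    int items[STACK_CAPACITY];
    int count;
} Stack;

void stack_init(Stack *s);
int stack_push(Stack *s, int value);   /* 0 on success, -1 if full */
int stack_pop(Stack *s, int *out);     /* 0 on success, -1 if empty */

#endif

stack.c:

#include "stack.h"

/* internal helper: not declared in the header and marked static,
   so no other source file can call it */
static int stack_is_full(const Stack *s) {
    return s->count == STACK_CAPACITY;
}

void stack_init(Stack *s) {
    s->count = 0;
}

int stack_push(Stack *s, int value) {
    if (stack_is_full(s))
        return -1;
    s->items[s->count++] = value;
    return 0;
}

int stack_pop(Stack *s, int *out) {
    if (s->count == 0)
        return -1;
    *out = s->items[--s->count];
    return 0;
}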
Ease of reading is one point of breaking up files, but another is that when you build a project containing multiple files (header and source files), a good build system will only rebuild the files that have been modified, thereby shortening build times.

As for how to break up a monolithic file into multiple files, there are many ways to go. Speaking for myself, I would try to group functionality, so that, for example, all input handling goes in one source file, output in another, and functions used by many different functions in a third. I would do the same with structures/constants/macros, grouping related structures/etc. into separate header files. I would also mark functions used only in a single source file as static, so they can't be used from other source files by mistake.
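As a quick illustration of that last point (file and function names invented for the example), a function marked static in one source file simply cannot be linked to from another file:

util.c:

#include <stdio.h>

/* visible only inside util.c */
static void debug_print(const char *msg) {
    printf("debug: %s\n", msg);
}

void do_work(void) {
    debug_print("working");
}

main.c:

void do_work(void);    /* external linkage: this call links fine */

int main(void) {
    do_work();
    /* calling debug_print() here would fail: even with a declaration,
       the linker reports an undefined reference because the symbol
       is local to util.c */
    return 0;
}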