I've written an Apache module in C. Under certain conditions, I can get it to segfault, but I have no idea as to why. At this point, it could be my code, it could be the way I'm compiling the program, or it could be a bug in the OS library (the segfault happens during a call to dlopen()).
I've tried running through GDB and Valgrind with no success. GDB gives me a backtrace into the dlopen() call that appears meaningless. In Valgrind, the bug actually seems to disappear or at least become non-reproducible. On the other hand, I'm a total novice when it comes to these tools.
I'm a little new to production quality C programming (I started on C many years ago, but have never worked professionally with it.) What is the best way for me to go about learning the ropes of debugging programs? What other tools should I be investigating? In summary, how do you figure out how to tackle new bug challenges?
EDIT: Just to clarify, I want to thank Sydius's and dmckee's input. I had taken a look at Apache's guide and am fairly familiar with dlopen (and dlsym and dlclose). My module works for the most part (it's at about 3k lines of code and, as long as I don't activate this one section, things seem to work just fine.)
I guess this is where my original question comes from - I don't know what to do next. I know I haven't used GDB and Valgrind to their full potential. I know that I may not be compiling with the exact right flags. But I'm having trouble figuring out more. I can find beginner's guides that tell me what I already know, and man pages that tell me more than I need to know but with no guidance.
Isolate the source of the bug. Identify the cause of the bug. Determine a fix for the bug. Apply the fix and test it.
Unfortunately the GNU tools are not the best, and my experience is that the dynamic linker muddies the waters enormously. If you can get Apache to link statically with your module that will enable gdb especially to perform more reliably. I don't know how easy that is; a lot depends on the Apache build system.
It's worrisome but not shocking that you can't easily reproduce the bug with valgrind.
Regarding compiling with the right flags, both valgrind and gdb will give you much better information if you compile everything in sight with -g -O0. Don't believe the claims on the gcc man page that gcc -g -O is good enough; it isn't. Even -O will cause variables in the source code to be eliminated by the optimizer.
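To make the point about flags concrete, here is a minimal sketch of the kind of code where it matters. The function `scaled_sum` and its variables are hypothetical and have nothing to do with the module in the question; they only give gdb some locals to look at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, used only to illustrate the effect of the flags.
 * Built with "gcc -g -O0", the locals `sum` and `scaled` live in memory
 * and can be printed from gdb at any line. With optimization enabled, the
 * compiler may keep them only in registers or fold them away, and gdb
 * may then report them as "<optimized out>". */
int scaled_sum(const int *values, size_t count, int factor)
{
    assert(count == 0 || values != NULL);

    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += values[i];
    }

    int scaled = sum * factor;  /* intermediate worth inspecting in gdb */
    return scaled;
}
```

Setting a breakpoint on the `return` line and printing `sum` and `scaled` is a quick way to check whether a given build still keeps its locals visible.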
I'm sure that debugging techniques are in general language independent and that there is no such thing as "C debugging".
There are a lot of different tools that can help you find simple problems like memory leaks or plain stupid mistakes in the code; sometimes they can even catch simple memory overruns.
But for really hard-to-find problems, such as those caused by multitasking, interrupts, or DMA memory corruption, the only tools are your brain and well-written code (written with the expectation that it will have to be debugged). You can find more about preparing your code for debugging here. It seems from Sydius's post that Apache already has a good tracing mechanism in place, so just use it and add something similar to your own code base.
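As a rough illustration of code written with debugging in mind, here is a minimal tracing sketch. Everything in it is an assumption made for the example: the `TRACE` macro, the `trace_enabled` switch, and the `parse_port` function are hypothetical, and a real Apache module would more likely send these messages through the server's own logging instead of writing to stderr.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical trace switch and counter, not part of any Apache API. */
static int trace_enabled = 1;
static int trace_count = 0;

/* Prints file, line, and function before the message. stderr is normally
 * unbuffered, so the last trace line is usually visible even if the
 * process crashes right afterwards. */
#define TRACE(...) \
    do { \
        if (trace_enabled) { \
            trace_count++; \
            fprintf(stderr, "%s:%d %s(): ", __FILE__, __LINE__, __func__); \
            fprintf(stderr, __VA_ARGS__); \
            fputc('\n', stderr); \
        } \
    } while (0)

/* Hypothetical function that traces its entry, its decisions, and its
 * result. Returns the port number, or -1 if the text is not a valid port. */
int parse_port(const char *text)
{
    int port = 0;

    TRACE("enter, text=\"%s\"", text ? text : "(null)");
    if (text == NULL || sscanf(text, "%d", &port) != 1 ||
        port < 1 || port > 65535) {
        TRACE("rejecting input");
        return -1;
    }

    assert(port >= 1 && port <= 65535);  /* invariant after validation */
    TRACE("parsed port=%d", port);
    return port;
}
```

With traces like these around the section that crashes, the last line printed before the segfault tells you how far the code actually got, which is a fact and not an assumption.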
In addition, I would say that another important rule in debugging is "don't assume". Base every step on bare facts, and prove each assumption with 100% certainty before you take another step that depends on it. Debugging based on unverified assumptions will usually send you in the wrong direction.
Edit after Dave's clarification:
Your next step should be to find the smallest part of the code that causes the problem. You said that if you disable a certain section, the module loads. Make that section as small as possible: remove or mock everything in it until you find, ideally, the one line that causes the module not to load. Once you have found that line, it will be exactly the right time to start using your brain :) Just don't forget to verify 100% that this is the line.
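One way to shrink the section without editing and rebuilding for every attempt is to split it into steps and limit how many of them run. This is only a sketch under assumptions: the step functions are empty stand-ins, and the environment variable `DEBUG_STEP_LIMIT` is a name made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the pieces of the suspect section. */
static int step_load_config(void)   { return 0; }
static int step_open_library(void)  { return 0; }
static int step_register_hook(void) { return 0; }

typedef int (*step_fn)(void);

static const struct { const char *name; step_fn run; } steps[] = {
    { "load_config",   step_load_config },
    { "open_library",  step_open_library },
    { "register_hook", step_register_hook },
};
#define STEP_COUNT ((int)(sizeof steps / sizeof steps[0]))

/* Runs at most `limit` steps in order, stopping at the first failure.
 * Returns the number of steps that completed successfully. */
int run_steps(int limit)
{
    assert(limit >= 0);
    if (limit > STEP_COUNT) {
        limit = STEP_COUNT;
    }

    int done = 0;
    for (int i = 0; i < limit; i++) {
        fprintf(stderr, "running step %d (%s)\n", i, steps[i].name);
        if (steps[i].run() != 0) {
            break;
        }
        done++;
    }
    return done;
}

/* Reads the limit from the hypothetical DEBUG_STEP_LIMIT variable.
 * If it is unset, all steps run; negative values are treated as 0. */
int step_limit_from_env(void)
{
    const char *text = getenv("DEBUG_STEP_LIMIT");
    if (text == NULL) {
        return STEP_COUNT;
    }
    int limit = atoi(text);
    return limit < 0 ? 0 : limit;
}
```

Calling `run_steps(step_limit_from_env())` and raising the limit one step at a time shows which step is the first one that makes the crash appear. That step can then be split further in the same way.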