We have a bug in our application that does not occur every time, so we don't know its "logic". I could not reproduce it even once in 100 attempts today. Disclaimer: this bug exists and I have seen it. It is not a PEBKAC or anything similar. What are common tips for reproducing this kind of bug?


How do you reproduce bugs that occur sporadically?

2 Answers

Analyze the problem as a pair and read the code together. Make notes of the facts you KNOW to be true, and try to work out which logical preconditions must hold for the bug to happen. Follow the evidence like a CSI.

Most people instinctively say "add more logging", and this may be a solution. But for a lot of problems it just makes things worse, since logging can change timing dependencies enough to make the problem more or less frequent. Changing the frequency from 1 in 1,000 to 1 in 1,000,000 will not bring you closer to the true source of the problem.
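One way to limit how much logging disturbs timing is to record events in memory and format them only after the bug has shown up. The following is a minimal sketch of that idea, not something the answer above prescribes. The class name `RingTrace` and its methods are hypothetical, and it assumes CPython, where `deque.append` is thread-safe. It reduces the perturbation but does not remove it.

```python
import collections
import threading
import time


class RingTrace:
    """Keep the last `capacity` events in memory; format them only on demand."""

    def __init__(self, capacity=1000):
        # A deque with maxlen silently drops the oldest entry when full.
        self._events = collections.deque(maxlen=capacity)

    def record(self, tag, value=None):
        # Appending a tuple is cheap compared with formatting and writing a log line.
        self._events.append((time.monotonic(), threading.get_ident(), tag, value))

    def dump(self):
        # Intended to be called once the failing run has stopped,
        # so the formatting cost no longer affects timing.
        return [f"{ts:.6f} thread={tid} {tag} {value!r}"
                for ts, tid, tag, value in list(self._events)]


# Usage: only the three most recent events survive.
trace = RingTrace(capacity=3)
for i in range(5):
    trace.record("step", i)
lines = trace.dump()
```

Calling `dump()` from the place where the failure is detected gives the last few events leading up to it without paying for I/O on every call.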

Even if your logical reasoning does not solve the problem, it will probably give you a few specifics that you can investigate with logging or assertions in your code.
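As a rough illustration of turning a suspected precondition into an assertion, here is a small sketch. The helper `check`, the exception `InvariantViolation`, and the function `apply_discount` are all hypothetical names invented for this example. The point is only that a violated precondition fails loudly and carries the state you would otherwise have to guess at.

```python
class InvariantViolation(AssertionError):
    """Raised when a precondition we believed always held turns out to be false."""


def check(condition, message, **state):
    # Unlike a bare `assert`, this stays active under `python -O`
    # and attaches the surrounding state for later analysis.
    if not condition:
        details = ", ".join(f"{k}={v!r}" for k, v in sorted(state.items()))
        raise InvariantViolation(f"{message} ({details})")


def apply_discount(price, percent):
    # Hypothetical function standing in for the code under suspicion.
    check(price >= 0, "price must be non-negative", price=price)
    check(0 <= percent <= 100, "percent out of range", percent=percent)
    return price * (100 - percent) / 100


# Usage: a bad input is caught at the boundary instead of corrupting later results.
try:
    apply_discount(50, 120)
except InvariantViolation as exc:
    failure = str(exc)
```

Each check encodes one thing you believe to be true, so when the sporadic bug finally fires, the first violated belief tells you where to look.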

175

answered Oct 01 '22 23:10

krosenvold

There is no good general answer to this question, but here is what I have found:

1. It takes a talent for this kind of thing. Not all developers are well suited to it, even if they are superstars in other areas. So know your team and who has a talent for it, and hope you can give them enough candy to get them excited about helping you out, even if it isn't their area.
2. Work backwards, and treat it like a scientific investigation. Start with the bug: what you see that is wrong. Develop hypotheses about what could cause it (this is the creative, imaginative part, the art that not everyone has the talent for), and it helps a lot to know how the code works. For each hypothesis (preferably ordered by what you think is most likely, which is again pure gut feel), develop a test that tries to eliminate it as the cause, and run it. A single failure to meet a prediction doesn't mean the hypothesis is wrong. Keep testing the hypothesis until it is confirmed to be wrong (as it gets less likely you may want to move on to another hypothesis first, but don't discount this one until you have a definitive failure).
3. Gather as much data as you can during this process: extensive logging and whatever else is applicable. Do not discount a hypothesis because you lack the data; remedy the lack of data instead. Quite often the inspiration for the right hypothesis comes from examining the data: noticing something off in a stack trace, a weird entry in a log, something missing from a database that should be there, and so on.
4. Double-check every assumption. Many times I have seen an issue not get fixed quickly because some general method call was never investigated further, and the problem was simply assumed not to be there. "Oh that, that should be simple." (See point 1.)
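If a hypothesis is that the bug depends on some input or ordering you can drive from a random seed, one way to test it is to run the suspect code many times with recorded seeds and replay the first seed that fails. The sketch below assumes exactly that. `flaky_operation` is a hypothetical stand-in for the real code, and `hunt` is a helper name made up for this example. It will not capture bugs that depend on thread scheduling or wall-clock timing.

```python
import random


def flaky_operation(rng):
    # Hypothetical stand-in for the code under suspicion: it fails only
    # when two random draws happen to coincide.
    a = rng.randrange(50)
    b = rng.randrange(50)
    if a == b:
        raise RuntimeError(f"collision at {a}")
    return a + b


def hunt(operation, attempts, first_seed=0):
    """Run `operation` with consecutive seeds; return the first seed that fails."""
    for seed in range(first_seed, first_seed + attempts):
        try:
            operation(random.Random(seed))
        except Exception as exc:
            # The seed is all that is needed to replay this exact run.
            return seed, exc
    return None, None


failing_seed, error = hunt(flaky_operation, attempts=10_000)
```

Once `hunt` returns a seed, calling `flaky_operation(random.Random(failing_seed))` reproduces the same failure on every run, which turns a sporadic bug into a deterministic one for that hypothesis.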

If you run out of hypotheses, the cause is generally insufficient knowledge of the system (this is true even if you wrote every line of code yourself). Review the code to gain additional insight into the system until you come up with a new idea.

Of course, none of the above guarantees anything, but that is the approach that I have found gets results consistently.

answered Oct 01 '22 22:10

Yishai

Tags: language-agnostic, logging, debugging

guerda
