
How do you approach intermittent bugs? [closed]

Scenario

You've got several bug reports all showing the same problem. They're all cryptic with similar tales of how the problem occurred. You follow the steps but it doesn't reliably reproduce the problem. After some investigation and web searching, you suspect what might be going on and you are pretty sure you can fix it.

Problem

Unfortunately, without a reliable way to reproduce the original problem, you can't verify that your change actually fixes the issue rather than having no effect at all, or making things worse and masking the real problem. You could just not fix it until it becomes reproducible every time, but it's a big bug and leaving it unfixed would cause your users a lot of other problems.

Question

How do you go about verifying your change?

I think this is a very familiar scenario to anyone who has engineered software, so I'm sure there are a plethora of approaches and best practices to tackling bugs like this. We are currently looking at one of these problems on our project where I have spent some time determining the issue but have been unable to confirm my suspicions. A colleague is soak-testing my fix in the hopes that "a day of running without a crash" equates to "it's fixed". However, I'd prefer a more reliable approach and I figured there's a wealth of experience here on SO.

Jeff Yates asked Dec 09 '08 14:12



People also ask

How do you deal with inconsistent bugs?

If you can't reproduce a bug, first document the steps and repeat them in different environments to try to trigger it again. If the bug still does not reappear, ask the customer or client for more information about how to reproduce it.

What is your approach when defect is not reproducible?

First step: Using some sort of remote-access software, have the customer show you how to reproduce the problem on the system that has it. If this fails, then close it. Second step: Try to reproduce the problem on another system. If this fails, make an exact copy of the customer's system.


1 Answer

Bugs that are hard to reproduce are the hardest ones to solve. You need to make sure that you have found the root of the problem, even if the problem itself cannot be reproduced reliably.

The most common intermittent bugs are caused by race conditions. By eliminating the race, or by ensuring that one side always wins, you have eliminated the root of the problem even if you can't confirm it by testing the results. The only thing you can test is that the cause does not repeat itself.
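To make "eliminating the race" concrete, here is a minimal Python sketch. It is only an illustration: the `SafeCounter` class and the `hammer` helper are hypothetical names, and a shared counter stands in for whatever shared state your real bug involves. The lock removes the unsynchronized read-modify-write, and the stress helper checks that the cause (lost updates) does not show up, rather than trying to reproduce the original crash.

```python
import threading


class SafeCounter:
    """Hypothetical shared counter; the lock guards the read-modify-write."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value
        # and one of the updates would be lost.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, threads=8, increments=10_000):
    """Stress the counter from several threads and return the final value."""

    def worker():
        for _ in range(increments):
            counter.increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value


if __name__ == "__main__":
    expected = 8 * 10_000
    total = hammer(SafeCounter())
    print("lost updates:", expected - total)
```

A clean run of a stress check like this does not prove the original report is fixed. It only shows that this particular cause can no longer occur, which is the most you can claim when the bug won't reproduce on demand.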

Sometimes fixing what looks like the root cause does solve a problem, just not the one you were after, and there is no avoiding that. The best way to avoid intermittent bugs is to be careful and methodical with the system design and architecture.

Eran Galperin answered Sep 21 '22 07:09
