Overview. In this article, we'll have a look at integrating MongoDB, a very popular NoSQL open source database with a standalone Java client. MongoDB is written in C++ and has quite a number of solid features such as map-reduce, auto-sharding, replication, high availability etc.
The MongoDB Async driver provides an asynchronous API that can leverage either Netty or Java 7's AsynchronousSocketChannel for fast and non-blocking IO.
After inspecting the history of that line, my main conclusion is that there has been some incompetent programming at work.
That line is gratuitously convoluted. The general form
a? true : b
for boolean a, b
is equivalent to the simple
a || b
The surrounding negation and excessive parentheses convolute things further. Keeping in mind De Morgan's laws it is a trivial observation that this piece of code amounts to
if (!_ok && Math.random() <= 0.1)
return res;
The commit that originally introduced this logic had
if (_ok == true) {
_logger.log( Level.WARNING , "Server seen down: " + _addr, e );
} else if (Math.random() < 0.1) {
_logger.log( Level.WARNING , "Server seen down: " + _addr );
}
—another example of incompetent coding, but notice the reversed logic: here the event is logged if either _ok
or in 10% of other cases, whereas the code in 2. returns 10% of the times and logs 90% of the times. So the later commit ruined not only clarity, but correctness itself.
I think in the code you have posted we can actually see how the author intended to transform the original if-then
somehow literally into its negation required for the early return
condition. But then he messed up and inserted an effective "double negative" by reversing the inequality sign.
Coding style issues aside, stochastic logging is quite a dubious practice all by itself, especially since the log entry does not document its own peculiar behavior. The intention is, obviously, reducing restatements of the same fact: that the server is currently down. The appropriate solution is to log only changes of the server state, and not each its observation, let alone a random selection of 10% such observations. Yes, that takes just a little bit more effort, so let's see some.
I can only hope that all this evidence of incompetence, accumulated from inspecting just three lines of code, does not speak fairly of the project as a whole, and that this piece of work will be cleaned up ASAP.
https://github.com/mongodb/mongo-java-driver/commit/d51b3648a8e1bf1a7b7886b7ceb343064c9e2225#commitcomment-3315694
11 hours ago by gareth-rees:
Presumably the idea is to log only about 1/10 of the server failures (and so avoid massively spamming the log), without incurring the cost of maintaining a counter or timer. (But surely maintaining a timer would be affordable?)
Add a class member initialized to negative 1:
private int logit = -1;
In the try block, make the test:
if( !ok && (logit = (logit + 1 ) % 10) == 0 ) { //log error
This always logs the first error, then every tenth subsequent error. Logical operators "short-circuit", so logit only gets incremented on an actual error.
If you want the first and tenth of all errors, regardless of the connection, make logit class static instead of a a member.
As had been noted this should be thread safe:
private synchronized int getLogit() {
return (logit = (logit + 1 ) % 10);
}
In the try block, make the test:
if( !ok && getLogit() == 0 ) { //log error
Note: I don't think throwing out 90% of the errors is a good idea.
I have seen this kind of thing before.
There was a piece of code that could answer certain 'questions' that came from another 'black box' piece of code. In the case it could not answer them, it would forward them to another piece of 'black box' code that was really slow.
So sometimes previously unseen new 'questions' would show up, and they would show up in a batch, like 100 of them in a row.
The programmer was happy with how the program was working, but he wanted some way of maybe improving the software in the future, if possible new questions were discovered.
So, the solution was to log unknown questions, but as it turned out, there were 1000's of different ones. The logs got too big, and there was no benefit of speeding these up, since they had no obvious answers. But every once in a while, a batch of questions would show up that could be answered.
Since the logs were getting too big, and the logging was getting in the way of logging the real important things he got to this solution:
Only log a random 5%, this will clean up the logs, whilst in the long run still showing what questions/answers could be added.
So, if an unknown event occurred, in a random amount of these cases, it would be logged.
I think this is similar to what you are seeing here.
I did not like this way of working, so I removed this piece of code, and just logged these messages to a different file, so they were all present, but not clobbering the general logfile.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With