Part of the project I'm working on got updated, and at some point somebody started sending empty collections instead of null
as arguments to a method.
This led to a single bug first, which then led me to change if (null == myCollection)
with if (CollectionUtils.isEmpty(myCollection))
, which in the end led to a cascade of several bugs. This way I discovered that a lot of the code treats these collections differently:
null
(i.e. the user did not mention anything here)Hence, my question: is this good or bad design practice?
Always. It is considered a best practice to NEVER return null when returning a collection or enumerable. ALWAYS return an empty enumerable/collection.
The fundamental problem of null is that it is trying to represent the fact that it is not a value while being assigned as a value. This fundamental flaw then snowballs and manifests into problems that we see in everyday production code.
Collections. emptyList() : method used to return an empty list. Collections. emptySet() : method used to return an empty set.
In his (very good) book Effective Java, Joshua Bloch treats this question for the returned values of methods (not "in general" like your question) :
(About use of null) It is errorprone, because the programmer writing the client might forget to write the special case code to handle a null return.
(...)
It is sometimes argued that a null return value is preferable to an empty array because it avoids the expense of allocating the array. This argument fails on two counts. First, it is inadvisable to worry about performance at this level unless profiling has shown that the method in question is a real contributor to performance problems (Item 55). Second, it is possible to return the same zero-length array from every invocation that returns no items because zero-length arrays are immutable and immutable objects may be shared freely (Item 15).
(...)
In summary, there is no reason ever to return null from an array- or collection-valued method instead of returning an empty array or collection. (...)
Personnally I use this reasoning as a rule of thumb with any use of Collections in my code. Of course, there is some case where a distinction between null and empty makes sense but in my experience it's quite rare.
Nevertheless, as stated by BionicCode in the comments section, in the case of a method that returns null instead of empty to specify that something went wrong, you always have the possibility of throwing an exception instead.
I know this question already has an accepted answer. But because of the discussions I had, I decided to write my own to address this problem.
The inventor of NULL himself Mr. Tony Hoare says
I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
First of all it is very bad practice to introduce breaking changes when updating old code that was already released and tested. Refactoring in this situation is always dangerous and should be avoided. I know that it can hurt to release ugly or stupid code, but if this happened, then this is a clear sign of a lack of quality management tools (e.g. code reviews or conventions). But obviously there were no unit tests otherwise the changes would have been reverted immediately by the author of the changes, due to the failing tests. The main problem is that your team has no conventions how to handle NULL as argument or as result. The intention of the author of the update to switch from NULL arguments to empty collections is absolutely right and should be supported, but he shouldn't have done it for working code and he should have discussed with the team, so that the rest of the team can follow in order to make it more effective. Your Team definitely must come together and agree on abandoning NULL as an argument or result value or at least find a standard. Better send NULL to hell. Your only solution is to revert the changes done by the author of the update (I assume you are using version control). Then redo the update, but do it the old fashioned and nasty way using NULL like you were doing it before. Don't use NULL in new code - for a brighter future. Trying to fix the updated version will escalate the situation for sure and therefore will waste time. (I assume that we are talking about a bigger project). Roll back to the previous version if possible.
To make it short if you don't like to continue reading: yes it's very bad practice. You yourself can draw this conclusion from the situation you are in now. You now witness very unstable and unpredictable code. The irrational bugs are the proof that your code has become unpredictable. If you don't have unit tests for at least the business logic then the code is hot iron. The practice that has lead to this code can never be good. NULL has no intuitional meaning to the user of an API, when there is the possibility to get a neutral result or parameter (like this is the case with a collection). An empty collection is neutral. You can expect a collection to be empty or to contain at least one element. There is noting else you can expect from a collection. You want to indicate an error and that an operation couldn't terminate? Then prefer to throw an exception with a good name that communicates the root of the error clearly to the caller.
NULL is a historical relic. Stumbling over a NULL value meant that the programmer wrote sloppy code. He forgot to initialize a pointer by assigning a memory address to it. The compiler must know a memory address as a reference in order to create a valid pointer. If you wanted a pointer but don't waste memory for just declaring the pointer or didn't know the address to this point, NULL was a convention to have the pointer to point to nowhere, which means no memory must be allocated for the pointer's reference (except for the pointer itself). Today in modern OO-Languages with garbage collection and plenty of available memory, NULL has become irrelevant in programming. There are situations where it is used to express e.g. the absence of data like in SQL. But in OO programming you can absolutely avoid it and therefore make your application more robust.
You have the choice to apply the Null-Object Pattern. Also it is best practice to either use default parameters (if your language like C# supports this) or use overloads. E.g. if the method parameter is optional then use an overload (or a default parameter). If the parameter is mandatory but you can't provide the value, then simply don't call the method. If the parameter is a collection, then always use an empty collection whenever you have no values. The author of the method must handle this case as he must handle all possible cases or parameter states. This includes NULL. So it's the duty of the method's author to check for NULL and to decide how to handle it. Here kicks the convention in. If your team agreed on to never use NULL, this annoying and ugly NULL checks are not required anymore. Some frameworks offer a @NotNull
attribute. The author can use it to decorate the method parameters to indicate that NULL is not a valid value. The compiler will now do the NULL checks and show an error to the programmer that (mis)uses the method and simply won't compile. Alongside code reviews, this can help to prevent or identify violations and lead to more robust code.
Most libraries provide helper classes e.g. Array.Empty()
or Enumerable.Empty()
(C#) to create empty collections that provide methods like IsEmpty()
. This makes intentions semantically clear and code therefore nice to read. It's worth it to write your own helper if none exist in your standard library.
Try to integrate quality management into your team's routine. Do code reviews to make sure the code to be released is conform to your quality standards (unit tests, no NULL values, always use curly braces for statement bodies, naming, etc...)
I hope you can fix your problem. I know this is a stressful situation to clean up the mess of somebody else. This is why communication in teams is so important.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With