Is it Really Better to 'Return an Empty List Instead of null'? / Part 1

17 Dec 2014
This article aims to answer the question: Should we return an empty list or 'null' from functions?

Part I: Introduction

Popular Advice: Don't Return Null

There is a popular and widely accepted piece of advice in the world of software development that says:

You should always return an empty list instead of null!

Proponents of this rule argue that there are two considerable advantages:

  • You eliminate the risk of a null pointer error (i.e. NullReferenceException in C#, NullPointerException in Java, etc.)

  • You don't have to check for null in client code - your code becomes shorter, more readable and easier to maintain

Some people are very opinionated about the 'return an empty list instead of null' rule, as demonstrated by the following excerpt from the top-rated answer to the Stack Overflow question Is it better to return null or empty collection?:

Empty collection. Always.

This sucks:

if(myInstance.CollectionProperty != null)
{
   foreach(var item in myInstance.CollectionProperty)
      /* arrgh */
}

It is considered a best practice to NEVER return null when returning a collection or enumerable. ALWAYS return an empty enumerable/collection. It prevents the aforementioned nonsense, and ...

The commenter reveals another benefit:

... it prevents your car getting egged by co-workers and users of your classes.

This advice is also supported by some prominent and influential voices:

  • Book Clean Code; by Robert C. Martin; page 110: Don't Return Null

  • Book Effective Java 2nd edition; by Joshua Bloch; page 201: Item 43: Return empty arrays or collections, not nulls

  • Book Framework Design Guidelines 2nd edition; by Krzysztof Cwalina and Brad Abrams; page 256: DO NOT return null values from collection properties or from methods returning collections. Return an empty collection or an empty array instead.

  • Book Patterns of Enterprise Application Architecture; by Martin Fowler; page 496 (Special Case): Instead of returning null, or some odd value, return a Special Case that has the same interface as what the caller expects.

Do we have any good reason(s), then, to question this established advice and ask: Is it really better to 'return an empty list instead of null'?

[Figure: a list, an empty list, and null]

If we read through related questions on Stack Overflow and other forums, we can see that not everyone agrees. There are many different, sometimes diametrically opposed opinions. For example, the top-rated answer to the Stack Overflow question Should functions return null or an empty object? (related to objects in general, not specifically to lists) tells us exactly the opposite:

Returning null is usually the best idea ...

Functions that return lists are omnipresent. They appear in virtually all applications and all programming languages. The aim of this article series is to take an in-depth look at this important and recurring topic. What are the pros and cons of the possible approaches? Is there a one-size-fits-all approach? Are there special cases? Which approach should you apply in your code?

Let's see!

[Note] Note

This is a follow-up article to a previous article with the title Why We Should Love 'null'. While the previous article focused on the general case of returning null from a function, this article is about the more special case of functions that return a list. In this article, I am going to refer sometimes to the Why We Should Love 'null' article, so you might want to read that article first (or at least the last short section called 'Final conclusion').

[Note] Note

In this article, I am going to use the term list. But the discussion encompasses all types of iterables, enumerations and collections such as sets, lists, maps, dictionaries, arrays, strings, etc.

A Simple Example

Before analysing different cases, let us first look at a simple example of code that illustrates the rationale behind the advice 'return an empty list and not null'.

[Note] Note
In this article, I use Java in the source code examples. However, you can try out these simple examples in any other language you prefer. The principles we are going to see apply to all popular programming languages.
[Note] Note
Experienced programmers well acquainted with the subject can safely skip this example and jump to the next section.

Here is a method that returns a list of digits found in a string:

public static List<Character> getDigitsInString ( String string ) {

   // 'null' as input is not allowed!
   if ( string == null ) throw new IllegalArgumentException ( "Input cannot be null." );

   List<Character> result = new ArrayList<>();

   for ( char ch : string.toCharArray() ) {
      if ( Character.isDigit ( ch ) ) {
         result.add ( ch );
      }
   }

   return result;
}

As we can see, the above method follows the advice to never return null. If no digit is found in the input string, an empty list is returned.

To use this method, we would write client code like this:

String s = "asd123";
List<Character> digits = getDigitsInString ( s );
System.out.println ( "Number of digits found in '" + s + "': " + digits.size() );

Executing the above statements displays:

Number of digits found in 'asd123': 3
[Note] Note

If you want to try out the above example without using an IDE, then proceed as follows:

  • Ensure first that Java is properly installed on your system.

  • Create file TestList.java in any directory with the following content:

    import java.util.*;
    
    public class TestList {
    
       public static List<Character> getDigitsInString ( String string ) {
    
          // 'null' as input is not allowed!
          if ( string == null ) throw new IllegalArgumentException ( "Input cannot be null." );
    
          List<Character> result = new ArrayList<>();
    
          for ( char ch : string.toCharArray() ) {
             if ( Character.isDigit ( ch ) ) {
                result.add ( ch );
             }
          }
    
          return result;
       }
    
       public static void main ( String[] i_arguments ) {
    
          String s = "asd123";
          List<Character> digits = getDigitsInString ( s );
          System.out.println ( "Number of digits found in '" + s + "': " + digits.size() );
       }
    }
  • Open an OS terminal in the directory of file TestList.java and compile the file by typing:

    javac TestList.java
  • Run the file by typing:

    java TestList
[Note] Note

Each time the above method is called and returns with an empty list, a new empty list is created. To avoid this, we should return a shared and immutable empty list instead. This can be done with Collections.emptyList(). Moreover, it would be better to always return an immutable list. In a real world application, we would therefore replace ...

return result;

... with:

if ( result.isEmpty() ) {
   return Collections.<Character>emptyList();
} else {
   return Collections.unmodifiableList ( result );
}
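
The effect of this replacement on client code can be sketched as follows. This is only an illustration: the class name ImmutableDigits is hypothetical, and the method body is the one shown earlier with the modified return statement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class ImmutableDigits {

   public static List<Character> getDigitsInString ( String string ) {

      // 'null' as input is not allowed!
      if ( string == null ) throw new IllegalArgumentException ( "Input cannot be null." );

      List<Character> result = new ArrayList<>();
      for ( char ch : string.toCharArray() ) {
         if ( Character.isDigit ( ch ) ) {
            result.add ( ch );
         }
      }

      if ( result.isEmpty() ) {
         // shared, immutable empty list
         return Collections.<Character>emptyList();
      } else {
         // read-only view of 'result'
         return Collections.unmodifiableList ( result );
      }
   }

   public static void main ( String[] args ) {

      List<Character> digits = getDigitsInString ( "asd123" );
      System.out.println ( "Number of digits: " + digits.size() );

      try {
         // any attempt to modify the returned list is rejected
         digits.add ( '4' );
      } catch ( UnsupportedOperationException e ) {
         System.out.println ( "The returned list cannot be modified." );
      }
   }
}
```

Client code can still read the list as before, but it can no longer change it by accident.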

To understand the rationale of the "don't return null" advice, let's change method getDigitsInString so that it returns null if there are no digits in the input string:

public static List<Character> getDigitsInString ( String string ) {

   // 'null' as input is not allowed!
   if ( string == null ) throw new IllegalArgumentException ( "Input cannot be null." );

   List<Character> result = new ArrayList<>();

   for ( char ch : string.toCharArray() ) {
      if ( Character.isDigit ( ch ) ) {
         result.add ( ch );
      }
   }

   if ( result.isEmpty() ) {
      return null;
   } else {
      return result;
   }
}

Now, executing the same test code as above still displays the correct result:

Number of digits found in 'asd123': 3

But what happens if the input string doesn't contain a digit, as in the code below:

String s = "asd";
List<Character> digits = getDigitsInString ( s );
System.out.println ( "Number of digits found in '" + s + "': " + digits.size() );

A NullPointerException occurs, caused by digits.size(), because digits is null.

To avoid this, we have to check for null and our code becomes:

String s = "asd";
List<Character> digits = getDigitsInString ( s );
if ( digits != null ) {
   System.out.println ( "Number of digits found in '" + s + "': " + digits.size() );
} else {
   System.out.println ( "Number of digits found in '" + s + "': 0" );
}

We can improve a little bit:

String s = "asd";
List<Character> digits = getDigitsInString ( s );
int num_digits = digits != null ? digits.size() : 0;
System.out.println ( "Number of digits found in '" + s + "': " + num_digits );

When comparing both solutions (returning an empty list vs returning null), it becomes clear why many developers agree with the "return an empty list" advice. The code of the first solution is shorter, produces fewer aches in the typing fingers, and does what it is supposed to do without the risk of throwing the infamous NullPointerException.

So ...

What's the Problem?

Before looking at some serious flaws and pitfalls of the 'return an empty list' approach, we first need to have a look at a more general problem that plagues many programmers: the problem of standardization and reliable documentation of APIs that return lists.

To illustrate the problem, let's look at the following method signature of class java.io.File:

public File[] listFiles()

This method can be used to get the list of files contained in a directory. For example, to list the files in C:\Temp\, we could write:

File directory = new File ( "C:\\Temp\\" );
File[] files = directory.listFiles();
for ( File file : files ) {
   System.out.println ( file );
}

An important question arises immediately: What does this method return if the directory is empty? Does it return an empty array or null?

By just looking at the method signature, we can't know. In the world of Java (and also in many other environments), there is no standard rule such as 'always return an empty list' or 'always return null'. In some environments, there might be a general recommendation or a project guideline that should be applied by all programmers, but such a rule can't be checked and enforced by the language or compiler, so a programmer could accidentally violate it. For example, Microsoft officially recommends in its Guidelines for Collections to 'NOT return null values from methods returning collections'. But this doesn't mean we can count on it. A commenter on the Stack Overflow question Best explanation for languages without null puts it like this (see the second top-rated answer):

... every time I access a reference type variable in .NET, I have to consider that it might be null.

Often, it will never actually be null, because the programmer structures the code so that it can never happen. But the compiler can't verify that, and every single time you see it, you have to ask yourself "can this be null? Do I need to check for null here?"

Ideally, in the many cases where null doesn't make sense, it shouldn't be allowed.

That's tricky to achieve in .NET, where nearly everything can be null. You have to rely on the author of the code you're calling to be 100% disciplined and consistent and have clearly documented what can and cannot be null, or you have to be paranoid and check everything.

Hence, to know what listFiles() returns if the directory is empty, we can:

  • look at the documentation (if it exists and is up to date)

  • write a test routine to find out which value is returned for empty directories

  • look inside the source code of the Java Development Kit (JDK) (i.e. file src.zip in the JDK's root directory).

In this case of a standard Java class, we are lucky. The documentation exists. This is the answer, retrieved from the Java 8 online API documentation:

Returns: An array of abstract pathnames denoting the files and directories in the directory denoted by this abstract pathname. The array will be empty if the directory is empty. Returns null if this abstract pathname does not denote a directory, or if an I/O error occurs.

Throws: SecurityException - If a security manager exists and its SecurityManager.checkRead(String) method denies read access to the directory

Ok, now we know we have to test for an empty array and for null, for example:

File directory = new File ( "C:\\Temp\\" );
File[] files = directory.listFiles();
boolean files_found = files != null && files.length > 0;
[Note] Note
To keep this example code simple, I didn't write a try / catch statement to handle SecurityException and I didn't (and couldn't!) consider the weird fact that in case of an I/O error, the method returns null instead of throwing an exception.
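
The three outcomes described in the documentation quoted above can be told apart as in the following sketch. The class ListFilesCheck and the method describe are hypothetical names chosen for this illustration; only File.listFiles() itself comes from the JDK.

```java
import java.io.File;

// Hypothetical helper class, for illustration only.
class ListFilesCheck {

   // Maps the documented return values of File.listFiles() to a description.
   public static String describe ( File directory ) {

      File[] files = directory.listFiles();

      if ( files == null ) {
         // the API cannot tell us which of the two reasons applies
         return "not a directory or I/O error";
      } else if ( files.length == 0 ) {
         return "empty directory";
      } else {
         return "entries: " + files.length;
      }
   }

   public static void main ( String[] args ) {
      // same path as in the example above; adapt it to your system
      System.out.println ( describe ( new File ( "C:\\Temp\\" ) ) );
   }
}
```

Note that the null test has to come first, because files.length would throw a NullPointerException for a null array.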

Sometimes we are not so lucky. There is no documentation telling us if an empty list or null is returned. And we don't have time to write a test routine for every function that returns a list or to study the method's implementation (if we have access to the source code).

In those cases, the best we can do is to check for an empty list and for null, as we did in the example above. But this is annoying (to put it mildly) and reduces execution speed, because one of the two tests is unnecessary if the function never returns null or never returns an empty list.
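
One way to keep this double check in a single place is a small helper, sketched below. The class ListChecks and the method isNullOrEmpty are hypothetical names for this illustration; they are not part of the JDK.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical utility class, for illustration only.
class ListChecks {

   // Returns true if there is 'no data': the collection is null or has no elements.
   public static boolean isNullOrEmpty ( Collection<?> collection ) {
      // the null test comes first; '||' skips isEmpty() if the collection is null
      return collection == null || collection.isEmpty();
   }

   public static void main ( String[] args ) {

      List<Character> none = null;
      List<Character> empty = Collections.emptyList();
      List<Character> one = Arrays.asList ( '1' );

      System.out.println ( isNullOrEmpty ( none ) );
      System.out.println ( isNullOrEmpty ( empty ) );
      System.out.println ( isNullOrEmpty ( one ) );
   }
}
```

The helper removes the typing effort, but not the redundant test: both conditions are still evaluated for every non-null list.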

Anyway, this is what we should do. But what happens in practice? Some programmers actually check for an empty list and for null. Some check for an empty list only. Some check for null only. And some don't check anything.

It really gets nasty! (But please stay with us. We will find an easy solution at the end. Promise!)

Suppose that a function can potentially return an empty list or null or - why not - sometimes an empty list and sometimes null, depending on the system's state. Suppose also that nothing in the programming environment prevents programmers from doing the right or the wrong thing. Then we end up with an impressive set of possible combinations. An interesting question arises: What will be the outcome for each case if the function returns with 'no data' and we are actually using the result, for example to get the number of elements in the list?

The outcomes are shown in the table below.

The outcome for each case is denoted as follows (shown as pictures by Visualpharm, licensed under CC BY-ND, in the article's original layout):

OK: correct execution

NPE: a null pointer error occurs

Undefined: the outcome is undefined

Table 1. Different outcomes of handling a function's return value

What the client code does               Outcome if function returns
to check for 'no data'                  null         empty list
--------------------------------------  -----------  -----------
nothing                                 NPE          Undefined
if ( list.isEmpty() )                   NPE          OK
if ( list == null )                     OK           Undefined
if ( list.isEmpty() || list == null )   NPE          OK
if ( list == null || list.isEmpty() )   OK           OK

Here is an example of how to interpret the table: If a function returns null (1st column), but we just check for an empty list (2nd row), then a null pointer error will occur.
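
The rows with an explicit check can also be tried out with a small program. The following is a rough sketch, not a complete reproduction of the table: the class CheckOutcomes and its constants are hypothetical names, the 'nothing' row is left out, and a check that silently misses the 'no data' case is reported as "no data NOT detected", which corresponds to the undefined outcome.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical class and constant names, used only for this illustration.
class CheckOutcomes {

   // The four explicit checks listed in the table
   public static final Predicate<List<Character>> EMPTY_ONLY      = list -> list.isEmpty();
   public static final Predicate<List<Character>> NULL_ONLY       = list -> list == null;
   public static final Predicate<List<Character>> EMPTY_THEN_NULL = list -> list.isEmpty() || list == null;
   public static final Predicate<List<Character>> NULL_THEN_EMPTY = list -> list == null || list.isEmpty();

   // Applies a check to a 'no data' value and reports what happened.
   public static String outcome ( Predicate<List<Character>> check, List<Character> list ) {
      try {
         return check.test ( list ) ? "no data detected" : "no data NOT detected";
      } catch ( NullPointerException e ) {
         return "null pointer error";
      }
   }

   // Prints the outcome of one check for both kinds of 'no data'.
   public static void report ( String label, Predicate<List<Character>> check ) {
      List<Character> empty = Collections.emptyList();
      System.out.println ( label
         + " | null: " + outcome ( check, null )
         + " | empty list: " + outcome ( check, empty ) );
   }

   public static void main ( String[] args ) {
      report ( "list.isEmpty()", EMPTY_ONLY );
      report ( "list == null", NULL_ONLY );
      report ( "list.isEmpty() || list == null", EMPTY_THEN_NULL );
      report ( "list == null || list.isEmpty()", NULL_THEN_EMPTY );
   }
}
```

Only the last check detects 'no data' for both null and an empty list.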

[Note] Note

To be more complete, we could add one more column and two more rows, to end up with a total of no fewer than 21 different cases!

We could add another column for the case of a function that throws an exception to signal 'no data'.

Moreover, we could add two more rows for the case of using the non-short-circuiting or operator (|) instead of the short-circuiting or operator (||). They both exist in C#, Java and other programming languages, but | is rarely used for this purpose in practice.

However, the table above covers already all cases that are relevant to our discussion.
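
The difference between the two or operators mentioned in this note can be seen in a short sketch. The class OrOperators and its method names are hypothetical and serve only as an illustration.

```java
import java.util.List;

// Hypothetical class name, used only for this illustration.
class OrOperators {

   // '||' stops after the first operand if it is true
   public static boolean shortCircuit ( List<?> list ) {
      return list == null || list.isEmpty();
   }

   // '|' always evaluates both operands, so isEmpty() is called even if list is null
   public static boolean nonShortCircuit ( List<?> list ) {
      return list == null | list.isEmpty();
   }

   public static void main ( String[] args ) {

      System.out.println ( "||: " + shortCircuit ( null ) );

      try {
         System.out.println ( "|: " + nonShortCircuit ( null ) );
      } catch ( NullPointerException e ) {
         System.out.println ( "|: null pointer error" );
      }
   }
}
```

With |, the order of the two operands no longer matters: a null list leads to a null pointer error either way.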

There are some interesting points to assimilate from the above table:

  • The only approach that works in all cases is to check for an empty list and for null. But this approach has a number of serious inconveniences: There is a time penalty. It is cumbersome to write. We have to be careful to first check for null and then for isEmpty(). And we have to use the right or operator (i.e. || and not |). We can alleviate this pain by using an existing or a self-made utility function that does the check. For example, to check a string, C# has String.IsNullOrEmpty() and in Java, we could use com.google.common.base.Strings.isNullOrEmpty() (from Google Guava) or org.apache.commons.lang3.StringUtils.isEmpty() (from Apache Commons).

  • If we lived in an ideal world of consistency where

    • all functions return an empty list and all client code checks for an empty list

    or

    • all functions return null and all client code checks for null

    then the code would be correct.

    But we don't live in a perfect world. Moreover, writing correct code (i.e. no bugs) doesn't necessarily mean writing the best code, because other factors such as performance, memory usage and maintainability are also important.

  • The most relevant point, however, is this:

    • If the function returns null and the client code does the wrong thing, then the outcome is always a null pointer error (see first column).

    • If the function returns an empty list and the client code does the wrong thing, then the outcome is always undefined (see second column).

    In this context 'undefined' means that the outcome is unpredictable and can vary from totally harmless to extremely harmful. We will look at examples later.

    This is a crucial point!

    We all want to write high-quality and bug-free code in the shortest time possible. So what we have to ask now is this:

    Which approach is generally better? An approach that leads to a null pointer error in case of a bug or an approach that leads to an undefined outcome?

    This decisive question is the subject of the next part in this article series. We will compare the two approaches by looking at some typical source code examples and we will consider software reliability, time and space requirements, as well as API differences. Then, we will look at empty lists in real life. Do they exist? How are they used? The outcome might surprise you.


License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
