Background
My position at work allows me a bit of freedom in how I code and, more importantly, some influence over how others code. I was recently having a conversation with a colleague about what I think makes a good API, from a high level. The context of our discussion was developing a C#-based API, but this really applies to any object-oriented API.
I had two key points that I wanted to address, and while they're not the only important things, I believe they're often overlooked. The first is how people will use your API: how they will call methods and use the results. The second is how people will implement your API should they want to extend your work and implement their own classes. Here's what I was trying to drive home.
Usage
As a programmer, when you use an API, you want it to be simple. If you're using preexisting concrete classes, you want the methods to be easy to call and you want the results of those methods to be useful. How do you achieve this when making an API? Well, my guidelines are:
- Make your inputs to the method as generic as possible
- Make your return values as information-packed as possible
Simple right? If your inputs are generic enough, you can pass all sorts of things in. For example, if your function takes in a ReadOnlyCollection<string>, this function wouldn't necessarily be as easy to use as one that takes only an IEnumerable<string>. If it's not obvious, it's because IEnumerable<string> is a far more generic type. With an IEnumerable<string>, I can pass in an array, a list, a queue, a stack, or any collection. I can pass in anything that implements IEnumerable<string>! Conversely, if I require a ReadOnlyCollection<string>, all of my callers who may have these other various types of collections need to do some conversion to make it a ReadOnlyCollection<string>.
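As a rough sketch of the point about generic inputs, the snippet below shows one method that accepts an IEnumerable<string> being called with several collection types, with no conversion at the call site. The names NameFormatter, JoinNames, and GenericInputDemo are made up for illustration and are not part of any real API.

```csharp
using System;
using System.Collections.Generic;

public static class NameFormatter
{
    // Hypothetical helper: it only needs to enumerate its input,
    // so it asks for nothing more specific than IEnumerable<string>.
    public static string JoinNames(IEnumerable<string> names)
    {
        return string.Join(", ", names);
    }
}

public static class GenericInputDemo
{
    public static void Run()
    {
        string[] array = { "a", "b" };
        var list = new List<string>(array);
        var queue = new Queue<string>(array);
        var stack = new Stack<string>(array);

        // Every call compiles as-is; none of the callers has to convert.
        Console.WriteLine(NameFormatter.JoinNames(array));
        Console.WriteLine(NameFormatter.JoinNames(list));
        Console.WriteLine(NameFormatter.JoinNames(queue));
        Console.WriteLine(NameFormatter.JoinNames(stack)); // a stack enumerates last-in, first-out
    }
}
```

Had JoinNames required a ReadOnlyCollection<string>, each of these callers would first have to wrap or copy its data into one.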
To the second point, you want as much information as you can get when calling a function. It's almost the exact same argument as for parameters, but it works the opposite way. Consider if I have a function that returns an IEnumerable<string>. That means that for anyone that calls my function, all they'll have access to is something they can enumerate over to get string values. Okay, that's not too bad... But what if realistically everyone who calls your method really needs a list of strings? What if the common way to use your method is to get the IEnumerable result of your function, create a list out of it, and then add a few more items? Your API has basically created the additional step of requiring callers to create a list out of your return value. So... Why not just return a list? This is a lot more obvious if you look at your concrete implementation and notice that you likely do use something like a list (or some other concrete collection) when doing work inside the function. Why return something that's harder to use?
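To make the return-value argument concrete, here is a small sketch that contrasts the two return types from the caller's side. TagSource, its two methods, and ReturnTypeDemo are hypothetical names invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TagSource
{
    // Builds a list internally, but only promises callers an enumerable.
    public static IEnumerable<string> GetTagsAsEnumerable()
    {
        var tags = new List<string> { "api", "design" };
        return tags;
    }

    // Same work inside, but the caller is handed the list directly.
    public static List<string> GetTagsAsList()
    {
        var tags = new List<string> { "api", "design" };
        return tags;
    }
}

public static class ReturnTypeDemo
{
    public static List<string> WithEnumerable()
    {
        // Extra step: copy the sequence into a list before adding to it.
        List<string> tags = TagSource.GetTagsAsEnumerable().ToList();
        tags.Add("csharp");
        return tags;
    }

    public static List<string> WithList()
    {
        // No conversion needed; the result is already a list.
        List<string> tags = TagSource.GetTagsAsList();
        tags.Add("csharp");
        return tags;
    }
}
```

Both callers end up with the same three items; the first one just had to do a copy that the API could have saved it.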
Implementation
The flip side to all of this is how other developers will implement the interfaces (or extend the classes) you provide in your API. And guess what? All of the arguments I just made for simplifying the life of the caller are essentially inverted for people implementing interfaces in your API.
If my interface calls for an IEnumerable<string> to be passed in, then the only thing I can do is enumerate over it. Maybe in my own implementation this works fine... but what if someone else implementing your interface would benefit greatly from having a list? What if they can make an awesome optimization by knowing how many items are in the collection or by checking to see if the 100th item is a particular value? Well, they can only enumerate, so this becomes difficult.
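Here is a sketch of that implementer's dilemma, using the "100th item" check from above. The interfaces and classes are hypothetical, and IReadOnlyList<string> is just one possible richer input type I picked for the comparison; the "target" value is arbitrary.

```csharp
using System.Collections.Generic;

// Hypothetical interface that only hands implementers an enumerable.
public interface IEnumerableMatcher
{
    bool Matches(IEnumerable<string> items);
}

// Hypothetical alternative that exposes a count and an indexer.
public interface IListMatcher
{
    bool Matches(IReadOnlyList<string> items);
}

public sealed class EnumeratingMatcher : IEnumerableMatcher
{
    public bool Matches(IEnumerable<string> items)
    {
        // The only option is to walk the sequence until the 100th item.
        int index = 0;
        foreach (string item in items)
        {
            if (index == 99)
            {
                return item == "target";
            }
            index++;
        }
        return false; // fewer than 100 items
    }
}

public sealed class IndexingMatcher : IListMatcher
{
    public bool Matches(IReadOnlyList<string> items)
    {
        // Count and the indexer make the same check a one-liner.
        return items.Count >= 100 && items[99] == "target";
    }
}
```

The trade-off is that IListMatcher is less friendly to callers, who can no longer pass in a queue, a stack, or a lazily generated sequence without converting it first.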
As for return types, before I argued that for the caller, returning as much information as possible is great. Consider this example: suppose that in my API I created a custom collection class that has all sorts of awesome metadata. Just to make up something completely random, let's pretend I have a class for collections of integers and I have all these fancy properties on it that tell me the Mean/Median/Mode. The caller would say that's awesome! Sweet! So much information returned to me just by calling this simple function! However, the implementer of your interface is now thinking, "Oh crap... First you restrict my inputs to something super basic and then I have to somehow return that fancy object?! How the heck am I going to do that?!"
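A sketch of what that burden might look like in practice. IntegerStatistics, INumberSource, and ParsingNumberSource are all invented for illustration; the point is only how much work lands on the implementer when the interface pairs a basic input with a rich return type.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical "fancy" return type carrying Mean/Median/Mode metadata.
public sealed class IntegerStatistics
{
    public IntegerStatistics(IReadOnlyList<int> values, double mean, double median, int mode)
    {
        Values = values;
        Mean = mean;
        Median = median;
        Mode = mode;
    }

    public IReadOnlyList<int> Values { get; }
    public double Mean { get; }
    public double Median { get; }
    public int Mode { get; }
}

// Hypothetical interface: very basic input, very rich output.
public interface INumberSource
{
    IntegerStatistics Load(IEnumerable<string> rawValues);
}

public sealed class ParsingNumberSource : INumberSource
{
    // Assumes at least one value and that every string parses as an integer.
    public IntegerStatistics Load(IEnumerable<string> rawValues)
    {
        List<int> sorted = rawValues.Select(s => int.Parse(s)).OrderBy(v => v).ToList();

        double mean = sorted.Average();

        int mid = sorted.Count / 2;
        double median = sorted.Count % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;

        // Most frequent value; ties go to the smallest value.
        int mode = sorted.GroupBy(v => v)
                         .OrderByDescending(g => g.Count())
                         .ThenBy(g => g.Key)
                         .First()
                         .Key;

        return new IntegerStatistics(sorted, mean, median, mode);
    }
}
```

Every implementer of INumberSource has to repeat (or share) all of that statistics code just to satisfy the return type, which is exactly the kind of burden the caller never sees.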
Summary
To summarize what I wrote here, I think a good guideline for APIs comes down to:
- Make inputs generic enough to ease the life of the caller while still providing just enough information to the implementer of the method.
- Make return values as information-packed as possible without placing the burden of creating complex classes (and adding dependencies) on the implementer of the method.
Simple right? If your API is designed such that others will not be extending it (and it's really only people calling your methods), then you can completely bias your design in favour of the caller!