Introduction
LINQ is a powerful technique for operating on collections in .NET, but when applied to dictionaries its behavior is not entirely intuitive. I have implemented a number of C# extension methods intended to make standard LINQ operations on dictionaries more convenient and intuitive.
This article was previously published on my blog, cureos.blogspot.com.
LINQ and dictionaries
LINQ is a very powerful technique for operating on collections of objects in .NET. If, for example, you have a collection of integers, it is a simple task to pick the even numbers from the collection using LINQ:
int[] ints = new[] { 1, 3, 6, 8, 9, 10 };
IEnumerable<int> evens = ints.Where(i => i % 2 == 0);
Similarly for dictionaries:
Dictionary<int, int> intDict = new Dictionary<int, int> { { 2, 3 }, { 3, 5 }, { 6, 7}};
Dictionary<int, int> evenKeys = intDict.Where(kv => kv.Key % 2 == 0);
What? Compilation error?!?
In fact, the evenKeys assignment using intDict.Where() is not correct. From the LINQ point of view, Dictionary<,> and other classes implementing the IDictionary<TKey, TValue> interface are regarded as implementations of the IEnumerable<KeyValuePair<TKey, TValue>> interface, so Where() returns a sequence of key-value pairs rather than a dictionary. Thus (ignoring for a moment that we could also have used the var keyword), the correct evenKeys assignment should read:
IEnumerable<KeyValuePair<int, int>> evenKeys = intDict.Where(kv => kv.Key % 2 == 0);
Now, my guess is that in the normal case one would rather have the above assignment return a Dictionary. Fortunately, LINQ also provides a number of ToDictionary method overloads. So for the evenKeys assignment to return a Dictionary we simply type:
Dictionary<int, int> evenKeys = intDict.Where(kv => kv.Key % 2 == 0).ToDictionary();
What?! Compilation error again?
Yes, because the ToDictionary method also operates on IEnumerable<T> objects. You need to tell the compiler how to build the Dictionary from this arbitrary type T, here by supplying a key selector and a value selector. For correctness, our evenKeys assignment has to be expressed as follows:
Dictionary<int, int> evenKeys =
    intDict.Where(kv => kv.Key % 2 == 0).ToDictionary(kv => kv.Key, kv => kv.Value);
For dictionaries, this explicitness may appear quite counter-intuitive, and there are several forum questions on the Internet indicating that this API design has indeed caused confusion.
After giving this issue some thought, I believe I have come up with an approach to bypass this hurdle of confusion. I have implemented the following extension method in a static utility class:
public static Dictionary<TKey, TValue> ToDictionary<TKey, TValue>(
    this IEnumerable<KeyValuePair<TKey, TValue>> source)
{
    return source.ToDictionary(kv => kv.Key, kv => kv.Value);
}
This overload of ToDictionary takes any object that implements the IEnumerable<KeyValuePair<TKey, TValue>> interface, which is a common return type when invoking LINQ operations on dictionaries, and returns a Dictionary<TKey, TValue> object using the same keys and values as the source sequence. With this extension method defined, I can now enter:
Dictionary<int, int> evenKeys = intDict.Where(kv => kv.Key % 2 == 0).ToDictionary();
without getting the annoying compilation error.
Now this is all well and good, but I then decided to take the issue one step further. Wouldn't it be good if, for example, the Where() extension method, when operating on a Dictionary, by default returned a Dictionary with the same keys and values?
Well, it can of course be done! And here is the solution:
public static Dictionary<TKey, TValue> Where<TKey, TValue>(
    this IDictionary<TKey, TValue> source, Func<KeyValuePair<TKey, TValue>, bool> predicate)
{
    return Enumerable.Where(source, predicate).ToDictionary(kv => kv.Key, kv => kv.Value);
}
When this method is defined, it effectively hides the general Where(IEnumerable<T>, Func<T, bool>) extension method whenever the compile-time type of the source also implements the IDictionary<TKey, TValue> interface. Having the above Where() extension method defined now even makes it possible for us to apply our first dictionary LINQ attempt without compilation error:
Dictionary<int, int> evenKeys = intDict.Where(kv => kv.Key % 2 == 0);
Unfortunately, this method will always return a Dictionary object, regardless of whether intDict is a SortedDictionary, a SortedList or another object implementing the IDictionary<TKey, TValue> interface. At this point, I have no viable solution for returning the same dictionary type that was used as input to the specialized Where method. The quest for a solution will continue nonetheless.
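The limitation is easy to observe in a small, self-contained sketch. The class names DictionaryWhereSketch and WhereDemo are made up for this illustration; the Where overload itself is the one shown above. Assuming the extension class is visible to the calling code, filtering a SortedDictionary yields a plain Dictionary:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class for the Where overload from the article.
public static class DictionaryWhereSketch
{
    public static Dictionary<TKey, TValue> Where<TKey, TValue>(
        this IDictionary<TKey, TValue> source, Func<KeyValuePair<TKey, TValue>, bool> predicate)
    {
        // Filter with the general LINQ method, then rebuild a Dictionary.
        return Enumerable.Where(source, predicate).ToDictionary(kv => kv.Key, kv => kv.Value);
    }
}

// Hypothetical demo class, for illustration only.
public static class WhereDemo
{
    public static Dictionary<int, int> FilterSorted()
    {
        var sorted = new SortedDictionary<int, int> { { 2, 3 }, { 3, 5 }, { 6, 7 } };

        // Binds to the dictionary overload above. The result is statically
        // typed as Dictionary<int, int>; the SortedDictionary type is not preserved.
        return sorted.Where(kv => kv.Key % 2 == 0);
    }
}
```

Since Dictionary<TKey, TValue> makes no guarantee about enumeration order, the sort order of the input should be treated as lost in the result.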
I have also collected a few more similar LINQ extension method overloads in a GitHub project that I have chosen to call ... dictionarylinq! In this project you may find dictionary overloads for the Except, Intersect and Union extension methods. Due to the signatures of these methods, it has actually been possible to implement a specialized overload that returns the true input dictionary type, together with a designated overload with better performance for the Dictionary<TKey, TValue> class.
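To illustrate why the signatures matter, here is a minimal sketch of how a type-preserving Except overload might look. It is not copied from dictionarylinq: the class names, the type parameter names and the new() constraint are assumptions made for this illustration. The idea is that the second argument carries the key and value types, so the compiler can infer TKey and TValue from it while TDictionary is inferred from the first argument. Where() only takes a lambda as its second argument, which is presumably why the same approach does not carry over.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the actual dictionarylinq implementation.
public static class DictionaryExceptSketch
{
    public static TDictionary Except<TDictionary, TKey, TValue>(
        this TDictionary first, IEnumerable<KeyValuePair<TKey, TValue>> second)
        where TDictionary : IDictionary<TKey, TValue>, new()
    {
        // Create an empty dictionary of the same concrete type as the input.
        var result = new TDictionary();

        // Use the general set difference on key-value pairs; a pair is removed
        // only if both its key and its value match a pair in 'second'.
        foreach (var kv in Enumerable.Except<KeyValuePair<TKey, TValue>>(first, second))
        {
            result.Add(kv.Key, kv.Value);
        }
        return result;
    }
}

// Hypothetical demo class, for illustration only.
public static class ExceptDemo
{
    public static SortedDictionary<int, string> Run()
    {
        var first = new SortedDictionary<int, string> { { 1, "a" }, { 2, "b" }, { 3, "c" } };
        var second = new Dictionary<int, string> { { 2, "b" } };

        // The result keeps the SortedDictionary type of 'first'.
        return first.Except(second);
    }
}
```

The new() constraint restricts this sketch to dictionary types with a public parameterless constructor, and any custom comparer held by the input dictionary is not carried over to the result.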
Also included are method overloads for the Select method when the return type of the Func<TSource, TResult> object is a KeyValuePair<TKey, TValue>. It would be relatively easy to add more method overloads to dictionarylinq, but for now this is what there is.
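As a rough sketch of what such a Select overload could look like, consider the following. The signature, the class names and the type parameter names TKeyOut and TValueOut are assumptions for illustration and may differ from what dictionarylinq actually contains.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the actual dictionarylinq implementation.
public static class DictionarySelectSketch
{
    public static Dictionary<TKeyOut, TValueOut> Select<TKey, TValue, TKeyOut, TValueOut>(
        this IDictionary<TKey, TValue> source,
        Func<KeyValuePair<TKey, TValue>, KeyValuePair<TKeyOut, TValueOut>> selector)
    {
        // Project each pair with the general LINQ method, then rebuild a Dictionary.
        // ToDictionary throws an ArgumentException if the selector maps two
        // entries to the same output key.
        return Enumerable.Select(source, selector).ToDictionary(kv => kv.Key, kv => kv.Value);
    }
}

// Hypothetical demo class, for illustration only.
public static class SelectDemo
{
    public static Dictionary<string, int> Run()
    {
        var intDict = new Dictionary<int, int> { { 2, 3 }, { 3, 5 }, { 6, 7 } };

        // The lambda returns a KeyValuePair, so the result is a Dictionary<string, int>
        // with string keys and doubled values.
        return intDict.Select(kv => new KeyValuePair<string, int>(kv.Key.ToString(), kv.Value * 2));
    }
}
```

Because the selector may change the key type, the sketch returns a plain Dictionary instead of trying to preserve the input dictionary type.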
When the extension methods from the dictionarylinq utility class are included in your application, these methods will effectively hide the general extension methods in the System.Linq.Enumerable class. If you do want to fall back on the general methods in a certain scenario, either make an explicit cast to the IEnumerable<> interface:
IEnumerable<KeyValuePair<int, int>> output =
((IEnumerable<KeyValuePair<int, int>>)input).Where(kv => kv.Key > 3);
or invoke the static method explicitly:
IEnumerable<KeyValuePair<int, int>> output = Enumerable.Where(input, kv => kv.Key > 3);
Please also note that the dictionarylinq class library in the Visual Studio 2010 solution is a Portable Class Library. This means that the library uses the "least common denominator" of the .NET Framework 4, Silverlight and Windows Phone 7 base class libraries, and can easily be configured to support Windows Store applications. The library can be built once and consumed on all of these platforms without rebuilding. In Visual Studio 2012, Portable Class Libraries are supported by default. If you are running Visual Studio 2010 and have not installed the Portable Library Tools, simply create a new designated class library project and include the source code from the dictionarylinq class library, or include the source code directly in your own class library or application.
I hope that this effort can be useful to others as well. For questions and comments on this work, please do not hesitate to comment below. If you are a GitHub user, please feel free to report code issues via the project's Issues tab.
Good luck with Dictionary and LINQ!
History
Nov 26, 2012: Initial version, adapted from http://cureos.blogspot.com/2011/10/linq-and-dictionaries.html.