Saturday, April 9, 2011

The story of IIterable

I like the .NET Framework. To me, C# is a much better programming language than Java or C++, and it offers good performance at the same time. However, some of the fundamental design decisions in the .NET framework bother me. Which brings me to the subject of this post: I'm remaking some of the .NET collections library, built on top of a replacement for IEnumerable<T> called IIterable<T>.

You see, I'm a bit of a performance nut. So it bothers me that the IEnumerator<T> interface requires two interface calls to retrieve each element of a collection: MoveNext() and Current. Therefore, I invented a replacement and named it Iterator<T>:
   public delegate T Iterator<out T>(ref bool ended);

It had a corresponding interface called IIterable<T>, which is analogous to IEnumerable<T>:
   public interface IIterable<out T>
{
Iterator<T> GetIterator();
}

Microbenchmarks show that this iterator has (at most) half the overhead of IEnumerator, as you would expect. More on that later.

When an iterator is called, it returns the next value in the sequence, and sets ended to true if there are no more elements. It should be noted that Iterator<T> was originally defined as follows:
   public delegate bool Iterator<out T>(out T current);

This version was supposed to behave like IEnumerator.MoveNext(), returning true on success and false if there were no more elements in the sequence. It also happened to return the next value in the sequence, eliminating the need for IEnumerator.Current. Code that used the original Iterator<T> might look like this:

T current;
for (var moveNext = list.GetIterator(); moveNext(out current);)
{
...
}

This was very similar to code that uses IEnumerator directly (instead of using a foreach loop):
   for (var e = list.GetEnumerator(); e.MoveNext();)
{
T current = e.Current;
...
}

Unfortunately, the .NET Framework did not allow this signature because the CLR does not support true "out" arguments--"out" arguments are the same as "ref" arguments but with a special attribute on them, so at the CIL level they still permit the caller to supply an input value. That makes them incompatible with the <out T> part of the declaration: since they technically accept an input value, they must be invariant, not covariant. Thus I had to change the definition as written above:
   public delegate T Iterator<out T>(ref bool ended);

That's unfortunate, because using the iterator is much clumsier this way. for-loops like the one above must be written like this instead:
   bool ended = false;
for (var it = list.GetIterator();;)
{
T current = it(ref ended);
if (ended)
break;
...
}

This clumsiness is avoided using an extension method:
   public static bool MoveNext<T>(this Iterator<T> it, out T value)
{
bool ended = false;
value = it(ref ended);
return !ended;
}

Unfortunately, there is a performance penalty for calling MoveNext(), which eliminates most of the performance gain you get from using Iterator in the first place.

Anyway, these "Iterators" are returned from IIterable<T>.GetIterator. There's more to it than this--I actually created a whole series of collection interfaces to fix other things I don't like about the design of .NET collection interfaces--but for this post I'm just going to focus on IIterable<T> and Iterator<T>.

I spent the last two days working on an implementation of LINQ-to-Iterators, and then discovered that there is a problem. A rather serious problem.

Iterator<T> and IIterable<T> and various other interfaces derived from it are covariant: Iterator<out T> and IIterable<out T>. This makes it possible to write pointless code such as this:
   IIterable<object> Combine(IIterable<string> a, IIterable<Uri> b)
{
return a.Concat<object>(b);
}

That is, IIterable<string> and IIterable<Uri> is implicitly convertible to IIterable<object>. It requires C# 4.0, but in theory you can do this with any version of the .NET Framework since 2.0. With a helper method written in C# 4.0, it's even possible to write code that does the same thing in C# 2.0.

However, as I began to implement LINQ-to-Iterable I realized that there is one major problem. In practice, any collection that implements IIterable<T> will also implement IEnumerable<T>, for fairly obvious reasons: Microsoft and third-party libraries expect IEnumerable<T>, and the foreach loop expects a GetEnumerator method.

But if a class implements both IIterable<T> and IEnumerable<T>, that class would suddenly have weird problems with LINQ. Why? Well, IEnumerable<T> supports LINQ, and once I finish LINQ-to-Iterable, IIterable<T> will support LINQ too, and that's precisely the problem. Let's consider what happens if I try to run a LINQ query on my Deque<T> collection that implements IIterable<T> and IEnumerable<T>:
   Deque<int> list = ...;
var odds = Iterable.CountForever(3, 2);
var primes = from p in list
where (p > 2 && (p & 1) == 1) || p == 2
where !odds.TakeWhile(n => n * n <= p).Any(n => p % n == 0)
select p;

The compiler complains because it doesn't know which LINQ implementation to use: "Multiple implementations of the query pattern were found for source type 'Loyc.Runtime.Deque<int>'. Ambiguous call to 'Where'." The problem disappears if I remove "using System.Linq" from the top of my source file, or if I do the query against "(IIterable<int>)list" instead of just "list". However, it's an inconvenience at best. At worst, if the user doesn't understand why the error occurred and how to solve it, it's a showstopper.

I solved this problem by changing the definition of IIterable as follows:
   public interface IIterable<out T> : IEnumerable<T>

This forces all implementations of IIterable to also implement IEnumerable, but it solves the problem because the compiler no longer considers the choice ambiguous: if a class implements IIterable<T>, the compiler will choose LINQ-to-iterable without complaint. The reason is that IIterable is now considered more specific, and the compiler prefers more specific method signatures.

The variance dilemma

Unfortunately, this causes a new problem: it refuses to compile for any .NET Framework version before 4.0! This is because Microsoft made a very strange decision: even though CLR version 2.0 supports generic variance, IEnumerable<T> is defined as invariant before version 4.0.

Interesting, is it not, how the interaction of seemingly unrelated issues--the way extension method resolution works in C# 3.0 (2006) and the decision Microsoft made in 2005 to mark IEnumerable<T> as invariant--combine to screw up my LINQ implementation?

I could only think of two ways to work around this problem.

(1) Restrict covariance to .NET 4.0, i.e. in .NET 2.0, make IIterable<T> invariant.

If I choose this option, not only do we lose covariance in .NET 2.0 and .NET 3.5 (which is sad because it's a cool feature), but it also forces me to release at least two DLLs, one for .NET 2.0 (which relies on LinqBridge for some features), and one for .NET 4.0. Probably a .NET 3.0 version is needed too, for those that don't want a dependency on LinqBridge.

If this problem didn't exist then you could have at least used the same DLL for .NET 3 and 4. Technically you could reference the .NET 3 DLL in .NET 4 and lose variance, but this would have the side effect of breaking interoperability with any program that uses the .NET 4 DLL instead.

(2) Don't derive IIterable<T> from IEnumerable<T>.

This option allows IIterable<T> to remain covariant, but causes conflicts with LINQ-to-Enumerable as I already described. I fear that a lot of developers, when confronted with the "Ambiguous call to 'Where'" error, will immediately say "screw this" and refuse to use my collection classes.

Therefore, my decision is to keep the following definition of IIterable<T>:
public interface IIterable<out T> : IEnumerable<T> { ... }

Blah, blah, blah

I have a couple of comments about the design of Iterator<T>. Firstly, unlike IEnumerator, Iterator<T> does not implement IDisposable (indeed, it can't, since it's a delegate). My main rationale for this is that if your collection's iterator (as opposed to the collection itself) needs to implement IDisposable, you may as well continue using IEnumerator<T>. Certainly if Microsoft had chosen to use my "fast iterator" implementation when it first gave birth to .NET, it should have included IDisposable:
   interface IEnumerator<out T> : IDisposable
{
bool MoveNext(out T current);
}

Of course, back in the original .NET Framework of 2002, they weren't especially concerned with performance yet. And they hadn't invented generics yet. Plus, to make this interface variance-ready, they would have had to formally define "out" parameters as equivalent to return values. And while they were at it they could have maybe implemented return value inheritance covariance. But I digress...

Also, the C# language makes Iterator<T> much easier to implement as a delegate. For example, the implementation of Iterator.CountForever is very simple:
   public static Iterator<int> CountForever(int start, int step)
{
start -= step;
return (ref bool ended) => start += step;
}

Although it would be possible to write a helper class that converts a lambda function to the hypothetical "IIterator<T>" interface, this would incur an extra interface invocation above the cost of invoking the lambda, not to mention an extra memory allocation. Therefore, IIterator<T> would generally not be any faster than IEnumerator<T> (without special compiler support to eliminate the overhead).

Secondly, you may have noticed that "ref bool ended" is "ref" instead of "out". My reasoning, again, is that if you're using Iterator instead of IEnumerator it's because you want to squeeze out every last drop of performance. Therefore, to save time, Iterators are not required to clear "ended" on every iteration; they only need to set it once, at the end.

One more thing. After I release this library, if you want to implement IIterable<T>, you can make your job easier by using IterableBase<T> or ListSourceBase<T> as your base class, which provides implementations of GetEnumerator(). If your collection class already has GetEnumerator, and you want to add IIterable<T>, you could add this method:
   public Iterator<T> GetIterator()
{
return GetEnumerator().AsIterator();
}

But that's not necessarily a good idea, because the iterator itself will be slightly slower than the IEnumerator it was derived from. Only if a multi-step LINQ-to-Iterable query is built on top of this would there be any performance benefit. (I'll get around to some benchmarks later).

The performance of Iterator

I wrote some simple loops that call IEnumerator<T> (MoveNext and Current) on the one hand, and Iterator<T> on the other. The collections being enumerated just count numbers:
   public static IEnumerator<int> Counter1(int limit)
{
for (int i = 0; i < limit; i++)
yield return i;
}
public static Iterator<int> Counter2(int limit)
{
int i = -1;
return (ref bool ended) =>
{
if (++i < limit)
return i;
ended = true;
return 0;
};
}

On my machine, in a Release build, it basically takes 7.1 seconds to count to a billion if I use IEnumerator<int>, 3.552 if I invoke Iterator<int> directly, and 6.1 seconds if I call the Iterator<int>.MoveNext extension method. The raw Iterator is thus about twice as fast as IEnumerator, although my test is imperfect since it ignores loop overhead and the content of Counter1 and Counter2. (It should also be noted that results vary from run to run, and seemingly irrelevant changes to the benchmark code also change the results. Sometimes my benchmark reports that IEnumerator requires 250% as much time, and sometimes only 170%, depending on the weather and the test PC.)

Conclusion

With a performance difference of only 3.5 seconds in a billion iterations, it's fair to ask whether IIterable is really worth it. Well, in most circumstances it probably isn't. But LINQ queries tend to be composed of many simple components, which means that the time spent invoking MoveNext, Current, and your lambdas may be a large fraction of the total runtime. In the future I'll try to quantify the difference.

If you do find that your LINQ-to-objects queries are slow, it often means you're doing it wrong: maybe you're using an O(N^2) algorithm when you should have picked a O(N log N) algorithm. But I believe in "the pit of success": developers shouldn't have to work much harder to write fast code. Microoptimizations, like writing out a LINQ query longhand with "foreach" or "for" loops, and calling through a List<T> reference instead of IList<T>, should virtually never be necessary. Instead, the low-level code they use as their foundation should be as fast as possible.

It's a hobby of mine to explore how the most basic components of the .NET Framework can be adjusted for better performance, in order to slide high-level code closer to that "performance pit of success". I don't really expect you to go out of your way to use IIterable<T>, because most of the time the performance difference is quite small. However, those who want to finally write a video game, compiler, simulation or other computationally intensive program in a truly high-level language like C# with LINQ--instead of juggling pointers in C++ and writing "for (int i = 0; i < (int)collection.size(); i++)" on the chalkboard another thousand times--will appreciate libraries like the one I'm developing.

Microsoft, if you're reading this: yes, I would consider a job offer. Put me with the cool kids that are designing C# 5.0. I'll tack on support for IIterable via yield return, and needless to say I have many other ideas for improvement. Also, if you pay me to write Loyc and rename it C# 6.0, I promise not to complain.

Thanks for reading!

No comments: