Nested Iterators, part 1

C# 2.0 introduced a powerful feature called an iterator, a method which returns an IEnumerable<T> or IEnumerator<T> using the new yield keyword.

Using an iterator, you can quickly and easily create a method which returns lazily a sequence of values.  However, lazily returning a sequence of sequences (IEnumerable<IEnumerable<T>>) is not so simple.

The obvious approach is to yield return a List<T>:

IEnumerable<IEnumerable<int>> SemiLazy() {
    for(int i = 0; i < 10; i++) {
        List<int> numbers = new List<int>();
        for(int j = 0; j < 10; j++)
            numbers.Add(i * 10 + j);
            
        yield return numbers;
    }
}

(This can be shortened to a single LINQ statement, but that’s beyond the point of this post: Enumerable.Range(0, 10).Select(i => Enumerable.Range(10 * i, 10)) )

This approach is very simple, but isn’t very lazy; each subsequence will be computed in its entirety, whether it’s consumed or not.

This approach also has a subtle catch: the iterator must return a different List<T> instance every time.

If you “optimize” it to re-use the instance, you’ll break callers which don’t use the subsequences immediately:

IEnumerable<IEnumerable<int>> Wrong() {
    List<int> numbers = new List<int>();
    for(int i = 0; i < 10; i++) {
        numbers.Clear();
        for(int j = 0; j < 10; j++)
            numbers.Add(i * 10 + j);
            
        yield return numbers;
    }
}

Calling SemiLazy().ToArray()[0].First() will return 0 (the first element in the first subsequence); calling Wrong().ToArray()[0].First() will return 90 (since all subsequences refer to the same instance).

Next: How can we achieve full laziness?

0 comments:

Post a Comment