Iterators in C#

1. Iterators

Looking at the implementation of the IntListEnumerator class presented earlier in this chapter, it’s apparent that a basic state machine is implemented that keeps track of the element in the collection that should be returned by the Current property, and the element counter that forms part of the state machine is advanced by the MoveNext method. The pattern of implementing an IEnumerator-implementing object that enumerates over an array is well-established, and with a little assistance, the compiler would be able to produce the enumerator. This is the idea behind iterators, a new feature in C# 2.0 that significantly simplifies the process of writing enumerators.

You can implement an iterator using yield return statements, which return the values of a collection in an ordered manner. The introduction of yield as a new keyword may initially raise compatibility concerns for programmers with a large C# 1.0 code base that includes yield as a variable name, but this doesn’t cause any compatibility problems, as yield is a keyword only in the context of a yield return statement; you can still use yield without the return as a variable name in C# 2.0.

If you rewrite the IntList collection presented earlier to use iterators, you get the following:

public class IntList : IEnumerable {

int[] values = new int[l0]; int count = 0;

//other member variables

public IEnumerator GetEnumerator()

{

for (int index = 0; index < count; ++index)

{

yield return values[index];

}

}

Building the enumerator using iterators considerably cuts down the code the developer needs to write. Under the covers, the compiler will generate a nested class that’s functionally identical to IntListEnumerator presented earlier in this chapter, with the exception of the Reset method. Reset isn’t an overly important method, as it’s always possible to get an enumerator for a collection in its initial state by simply creating a new enumerator object. The code generated by an iterator will throw a NotSupportedException exception if Reset is called.

Iterators have no runtime support and are purely a C# compiler-provided feature.

By making it so easy to provide enumeration implementations, it’s more likely the developer will provide more enumeration offerings than the typical full-collection forward-only enumerator. Providing a bidirectional enumerator that will support enumeration over a subset of a collection is simple to implement using iterators:

public class IntList : IEnumerable {

//other methods and members

public IEnumerable BidirectionalSubrange(bool forward, int start, int end)

{

if (start < 0 || end >= count)

{

throw new IndexOutOfRangeException(“Start must be zero or greater and” +

” end must be less than the size of the collection”);

}

int step = forward == true? 1: -1;

for( int index = start; index != end; index += step )

{

yield return values[index];

}

}

}

You can use this method in C# by calling the BidirectionalSubrange method inside the foreach statement:

IntList il = new IntList();

il.Add(1);

il.Add(2);

il.Add(3);

il.Add(4);

foreach (int i in il.BidirectionalSubrange(false, 1, 3))

Console.WriteLine(i);

Notice that BidirectionalSubrange returned an IEnumerable reference, as opposed to the IEnumerator reference returned by the GetEnumerator method.

It’s possible to use multiple yield return statements within the one iterator. So, for example, you could enumerate the names of a person in the following way:

public class Person {

string firstName;

string middleName;

string lastName;

public Person(string firstName, string middleName, string lastName)

{

this.firstName = firstName;

this.middleName = middleName;

this.lastName = lastName;

}

public IEnumerable Names()

{

yield return firstName;

yield return middleName;

yield return lastName;

}

}

2. Complex Enumeration Patterns

Although the previous example was a little contrived, the ability to nest yield return statements makes writing enumerations over complex collections quite easy. Consider a hash table that allows multiple values to be stored against each key; this collection is useful for modeling many real-world relationships, such as keying each line item in an invoice from the invoice ID. Without iterators, it’d be quite difficult to build the state machine to provide an enumerator that returns a key, then each member of its value collection, followed by the next key, then each member of its value collection, and so on. Writing this enumerator with iterators is quite simple:

public class MutliMap {

Hashtable map = new Hashtable();

public void Add(object key, object value)

{

if (map[key] == null) map[key] = new ArrayList();

((ArrayList)map[key]).Add(value);

}

public IEnumerator KeysAndValues()

{

foreach (object key in map.Keys)

{

yield return key; if (map[key] != null)

{

ArrayList values = (ArrayList)map[key]; foreach (object value in values)

{

yield return value;

}

}

}

}

}

Notice the use of two separate yield return statements (highlighted with italics in the sample)—one for the key and another for each value.

To leave an iterator block, you can use the yield break statement. A yield break statement behaves in much the same way as a return statement, and if there’s a finally block within the iterator, it will be executed as a result of yield break. From the perspective of the client code, a yield break will appear as if the enumeration is complete, and subsequent calls to MoveNext will return false.

3. Generic Enumeration

You can also use iterators to implement generic enumeration. The only change required to implement a generic iterator is to change the return type of the function from IEnumerator and IEnumerable to IEnumerator<T> and IEnumerable<T>. The actual iterator code remains unchanged. Converting the earlier IntList class into a generic equivalent, you get the following:

// Note: This class is not thread-safe

public class GenericList<T> : IEnumerable<T>

{

T[] values = new T[10];

int allocated = values.Length;

int count = 0;

int revision = 0;

public void Add(T value)

{

// reallocate if necessary… if (count + 1 == allocated)

{

T[] newValues = new T[allocated * 2];

for (int index = 0; index < count; index++)

{

newValues[index] = values[index];

}

allocated *= 2;

}

values[count] = value;

count++;

revision++;

}

public int Count {

get

{

return (count);

}

}

void CheckIndex(int index)

{

if (index >= count)

throw new ArgumentOutOfRangeException(“Index value out of range”);

}

public T this[int index]

{

get

{

CheckIndex(index);

return (values[index]);

}

set

{

CheckIndex(index);

values[index] = value;

revision++;

}

}

public IEnumerator<T> GetEnumerator()

{

for (int index = 0; index < count; ++index)

{

yield return values[index];

}

}

public IEnumerable<T> BidirectionalSubrange(bool forward, int start, int end)

{

if (start < 0 || end >= count)

{

throw new IndexOutOfRangeException(“Start must be zero or greater and end must” + ” be less than the size of the collection”);

}

if (forward)

{

for (int index = start; index < end; ++index)

{

yield return values[index];

}

}

else

{

for (int index = end; index >= start; –index)

{

yield return values[index];

}

}

}

internal int Revision {

get

{

return (revision);

}

}

}

IEnumerator<T> doesn’t contain a Reset method, so the earlier discussion about Reset throwing a NotSupportedException exception doesn’t apply.

4. Design Guidelines

Iterators make writing enumerators over collection and collection-like classes much easier. By offloading the implementation of the state machine needed to track the position in the collection that a particular enumeration is up to, iterators dramatically simplify the C# code needed to implement even the most complex enumerations.

As with all code simplification features, particularly those that are new, there’s a tendency to initially overuse the feature. Although this isn’t overly harmful in the case of iterators, you should strive to achieve a balance between offering a rich set of enumeration options and avoiding bogging a class down with an excess of enumerators that can make changing the internal implementation of the collection laborious.

Source: Gunnerson Eric, Wienholt Nick (2005), A Programmer’s Introduction to C# 2.0, Apress; 3rd edition.

Leave a Reply

Your email address will not be published. Required fields are marked *