Enumerators in C#

1. Enumerators and foreach

This section covers how to write enumerators by manually adding all the code that’s required to implement the enumeration interfaces. C# 2.0 introduces new functionality called iterators that makes writing enumerators much simpler. Many of the concepts of iterators rely on an understanding of how to implement enumeration, so you should read this section as background information rather than as a coding tutorial. We’ll cover iterators later in this chapter.

If an object can be treated as an array, it’s often convenient to iterate through the object using the foreach statement. To understand what’s required to enable foreach, it helps to know what’s happening behind the scenes.

When the compiler sees the following foreach block:

foreach (string s in myCollection)
{
    Console.WriteLine("String is {0}", s);
}

it transforms the code into the following:

IEnumerator enumerator = ((IEnumerable) myCollection).GetEnumerator();
while (enumerator.MoveNext())
{
    // Current is a property, not a method
    string s = (string) enumerator.Current;
    Console.WriteLine("String is {0}", s);
}

The first step of the process is to cast the collection class to IEnumerable. If that succeeds, the class supports enumeration, and an IEnumerator interface reference to perform the enumeration is returned. The MoveNext() and Current members of the class are then called to perform the iteration.
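As a rough check of the expansion described above, the following sketch runs both forms over the same collection. The ExpansionDemo class name and the use of ArrayList as the collection are assumptions made for illustration; any type that implements IEnumerable would do.

```csharp
using System;
using System.Collections;

class ExpansionDemo
{
    public static void Main()
    {
        ArrayList myCollection = new ArrayList();
        myCollection.Add("alpha");
        myCollection.Add("beta");

        // The foreach form.
        foreach (string s in myCollection)
        {
            Console.WriteLine("String is {0}", s);
        }

        // The hand-written form that the text describes.
        IEnumerator enumerator = ((IEnumerable) myCollection).GetEnumerator();
        while (enumerator.MoveNext())
        {
            string s = (string) enumerator.Current;
            Console.WriteLine("String is {0}", s);
        }
    }
}
```

Both loops visit the same elements in the same order, so each string is printed twice.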

The IEnumerator interface can be implemented directly by the container class, or it can be implemented by a separate private class. Private implementation is preferable, since it simplifies the collection class and allows multiple users to iterate over the same instance at the same time.
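The benefit of a separate enumerator object can be sketched with nested loops over a single instance. The NestedIterationDemo class and its CountPairs method are hypothetical names, and ArrayList stands in for any collection that hands out a fresh enumerator from each GetEnumerator() call.

```csharp
using System;
using System.Collections;

class NestedIterationDemo
{
    // Each foreach asks the collection for its own enumerator, so the
    // inner loop does not disturb the position of the outer loop.
    public static int CountPairs(IEnumerable items)
    {
        int pairs = 0;
        foreach (object outer in items)
        {
            foreach (object inner in items)
            {
                pairs++;
            }
        }
        return pairs;
    }

    public static void Main()
    {
        ArrayList list = new ArrayList();
        list.Add(1);
        list.Add(55);
        list.Add(43);
        Console.WriteLine("Pairs visited = {0}", CountPairs(list));
    }
}
```

If the collection implemented IEnumerator itself, both loops would share one position and the nested iteration would not work as intended.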

The following example shows an integer collection class that enables foreach usage (note that this isn’t intended to be a full implementation of such a class):

using System;
using System.Collections;

// Note: This class is not thread-safe
public class IntList: IEnumerable
{
    int[] values = new int[10];
    int allocated = 10;     // must match the initial size of values
    int count = 0;
    int revision = 0;

    public void Add(int value)
    {
        // reallocate if necessary…
        if (count == allocated)
        {
            int[] newValues = new int[allocated * 2];
            for (int index = 0; index < count; index++)
            {
                newValues[index] = values[index];
            }
            values = newValues;     // switch to the larger array
            allocated *= 2;
        }
        values[count] = value;
        count++;
        revision++;
    }

    public int Count
    {
        get
        {
            return(count);
        }
    }

    void CheckIndex(int index)
    {
        // reject negative indexes as well as indexes past the end
        if (index < 0 || index >= count)
            throw new ArgumentOutOfRangeException("index", "Index value out of range");
    }

    public int this[int index]
    {
        get
        {
            CheckIndex(index);
            return(values[index]);
        }
        set
        {
            CheckIndex(index);
            values[index] = value;
            revision++;
        }
    }

    public IEnumerator GetEnumerator()
    {
        return(new IntListEnumerator(this));
    }

    internal int Revision
    {
        get
        {
            return(revision);
        }
    }
}

class IntListEnumerator: IEnumerator
{
    IntList intList;
    int revision;
    int index;

    internal IntListEnumerator(IntList intList)
    {
        this.intList = intList;
        Reset();
    }

    public bool MoveNext()
    {
        index++;
        if (index >= intList.Count)
            return(false);
        else
            return(true);
    }

    public object Current
    {
        get
        {
            if (revision != intList.Revision)
                throw new InvalidOperationException("Collection modified while enumerating.");
            return(intList[index]);
        }
    }

    public void Reset()
    {
        index = -1;
        revision = intList.Revision;
    }
}

class Test
{
    public static void Main()
    {
        IntList list = new IntList();
        list.Add(1);
        list.Add(55);
        list.Add(43);

        foreach (int value in list)
        {
            Console.WriteLine("Value = {0}", value);
        }

        foreach (int value in list)
        {
            Console.WriteLine("Value = {0}", value);
            // changing the list here makes the next access to Current
            // throw an InvalidOperationException
            list.Add(124);
        }
    }
}

The collection class itself needs only a couple of modifications. It implements IEnumerable and therefore has a GetEnumerator() method that returns an IEnumerator reference to an instance of the enumerator class that points to the current list.

The IntListEnumerator implements the enumeration on the IntList that it’s passed using the IEnumerator interface and therefore implements the members of that interface.

Having a collection change as it’s being iterated over is a bad thing, so these classes detect that condition (as illustrated in the second foreach in Main()). The IntList class has a revision number that it updates when the list contents change. The current revision number for the list is stored when the enumeration is started and then checked in the Current property to ensure that the list is unchanged.
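The same revision-checking idea can be observed in the framework’s own collections. The sketch below uses System.Collections.ArrayList as a stand-in for IntList (an assumption made to keep the example self-contained); ArrayList performs its check in MoveNext() rather than in Current, but the effect on a foreach loop is similar. The ModificationDemo class and ModifyWhileEnumerating method are hypothetical names.

```csharp
using System;
using System.Collections;

class ModificationDemo
{
    // Returns true if changing the list during enumeration was detected.
    public static bool ModifyWhileEnumerating()
    {
        ArrayList list = new ArrayList();
        list.Add(1);
        list.Add(55);
        try
        {
            foreach (int value in list)
            {
                Console.WriteLine("Value = {0}", value);
                list.Add(124);  // invalidates the active enumerator
            }
        }
        catch (InvalidOperationException)
        {
            return true;
        }
        return false;
    }

    public static void Main()
    {
        Console.WriteLine("Detected = {0}", ModifyWhileEnumerating());
    }
}
```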

2. Improving the Enumerator

The enumerator in the previous section has two deficiencies.

The first is that the enumerator isn’t compile-time type-safe but only runtime type-safe. If you write the following code:

IntList intList = new IntList();
intList.Add(55);
//…
foreach (string s in intList)
{
}

the error can’t be identified at compile time, but an exception will be generated when the code is executed. The reason that this can’t be identified at compile time is that IEnumerator.Current is of type object, and converting from object to string (like converting from object to int in the earlier example) is a legal operation as far as the compiler is concerned.

A second problem with Current being of type object is that returning a value type (such as int) requires that the value type be boxed. This is wasteful, since IntListEnumerator.Current boxes the int only to have it immediately unboxed after the property is accessed.

To address this situation, the C# compiler implements a pattern-matching approach instead of a strict interface-based approach when dealing with enumerators. Instead of requiring the collection class to implement IEnumerable, it only has to have a public GetEnumerator() method. This method doesn’t have to return IEnumerator but can return a real class instance for the enumerator. This enumerator, in turn, needs a MoveNext() method and a Current property (the foreach statement itself doesn’t call Reset()), but the type of Current doesn’t have to be object.

With this modification, a strongly typed collection class can now get compile-time type checking, and classes that store value types can avoid boxing overhead. The modifications to the classes are fairly simple. First, remove the interface names. Modify IntList.GetEnumerator() as follows:

public IntListEnumerator GetEnumerator()
{
    return(new IntListEnumerator(this));
}

Second, the modification to IntListEnumerator.Current is also minimal:

public int Current
{
    get
    {
        if (revision != intList.Revision)
            throw new InvalidOperationException("Collection modified while enumerating.");
        return(intList[index]);
    }
}

That was easy.

Unfortunately, there’s a problem. The standard method of enabling enumeration is to implement IEnumerable and IEnumerator, so any language that looks for those isn’t going to be able to iterate over the IntList collection.

The solution is to put the interface names back on both classes and add explicit implementations of the interface members. This means adding an explicit implementation of IEnumerable.GetEnumerator() to IntList:

IEnumerator IEnumerable.GetEnumerator()
{
    return(GetEnumerator());
}

It also means adding an explicit implementation of IEnumerator.Current to IntListEnumerator:

object IEnumerator.Current
{
    get
    {
        return(Current);
    }
}

This now enables the standard method of iteration, and you can use the resulting class either with a compiler that supports the strongly typed pattern-matching approach or with a compiler that supports IEnumerable/IEnumerator.
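Putting the pieces of this section together gives something like the following condensed sketch. It is not the book’s full listing: the indexer is read-only, Array.Copy replaces the copy loop, and IntListEnumerator is declared public here because a public GetEnumerator() method can’t return a less accessible type.

```csharp
using System;
using System.Collections;

public class IntList : IEnumerable
{
    int[] values = new int[10];
    int count = 0;
    int revision = 0;

    public void Add(int value)
    {
        if (count == values.Length)
        {
            int[] newValues = new int[values.Length * 2];
            Array.Copy(values, newValues, count);
            values = newValues;
        }
        values[count] = value;
        count++;
        revision++;
    }

    public int Count { get { return count; } }
    internal int Revision { get { return revision; } }

    public int this[int index]
    {
        get
        {
            if (index < 0 || index >= count)
                throw new ArgumentOutOfRangeException("index");
            return values[index];
        }
    }

    // Strongly typed version, found by the compiler's pattern matching.
    public IntListEnumerator GetEnumerator()
    {
        return new IntListEnumerator(this);
    }

    // Interface version, for callers that only know IEnumerable.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public class IntListEnumerator : IEnumerator
{
    IntList intList;
    int revision;
    int index;

    internal IntListEnumerator(IntList intList)
    {
        this.intList = intList;
        Reset();
    }

    public bool MoveNext()
    {
        index++;
        return index < intList.Count;
    }

    // Typed Current: no boxing when used through the pattern.
    public int Current
    {
        get
        {
            if (revision != intList.Revision)
                throw new InvalidOperationException("Collection modified while enumerating.");
            return intList[index];
        }
    }

    // Interface Current: boxes the int for IEnumerator callers.
    object IEnumerator.Current { get { return Current; } }

    public void Reset()
    {
        index = -1;
        revision = intList.Revision;
    }
}

class Test
{
    public static void Main()
    {
        IntList list = new IntList();
        list.Add(1);
        list.Add(55);
        foreach (int value in list)              // uses the typed pattern
            Console.WriteLine("Value = {0}", value);
        foreach (object o in (IEnumerable) list) // uses the interfaces
            Console.WriteLine("Value = {0}", o);
    }
}
```

With this version, writing foreach (string s in list) is rejected at compile time, because the compiler now sees that Current is an int.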

3. Disposable Enumerators

Sometimes an enumerator holds a valuable resource, such as a database connection. The resources will be released when the enumerator is finalized, but it’d be useful if the resource could be released when the enumerator was no longer needed. Because of this, the expansion of a foreach isn’t quite as simple as implied previously.

The C# compiler does this by relying on the IDisposable interface, in a similar manner to the using statement. It’s a bit more complicated in this case, however. For the using statement, it’s easy for the compiler to determine whether the class implements IDisposable, but that’s not true in this case. The compiler must handle three cases when it expands:

foreach (Resource r in container) …

3.1. GetEnumerator() Returns IEnumerator

In this case, the compiler must determine dynamically whether the class implements IDisposable. The foreach expands to the following:

IEnumerator e = container.GetEnumerator();
try {
    while (e.MoveNext()) {
        Resource r = (Resource) e.Current;
        …;
    }
}
finally {
    IDisposable d = e as IDisposable;
    if (d != null) d.Dispose();
}

3.2. GetEnumerator() Returns a Class That Implements IDisposable

If the compiler can statically know that a class implements IDisposable, the compiler will call Dispose() without the dynamic test:

IEnumerator e = container.GetEnumerator();
try {
    while (e.MoveNext()) {
        Resource r = (Resource) e.Current;
        …;
    }
}
finally {
    ((IDisposable) e).Dispose();
}

3.3. GetEnumerator() Returns a Class That Doesn’t Implement IDisposable

In this case, the normal expansion is used:

IEnumerator e = container.GetEnumerator();
while (e.MoveNext()) {
    Resource r = (Resource) e.Current;
    …;
}
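To watch the cleanup happen, here is a small sketch in which GetEnumerator() returns IEnumerator, so the compiler has to use the dynamic test from the first case. ResourceEnumerator, ResourceContainer, and the DisposeCount counter are hypothetical names; the enumerator simply yields 0, 1, and 2 in place of a real resource.

```csharp
using System;
using System.Collections;

class ResourceEnumerator : IEnumerator, IDisposable
{
    public static int DisposeCount = 0;
    int index = -1;

    public bool MoveNext()
    {
        index++;
        return index < 3;
    }

    public object Current { get { return index; } }

    public void Reset() { index = -1; }

    public void Dispose()
    {
        DisposeCount++;
        Console.WriteLine("Enumerator disposed");
    }
}

class ResourceContainer : IEnumerable
{
    public IEnumerator GetEnumerator()
    {
        return new ResourceEnumerator();
    }
}

class DisposeDemo
{
    public static void Main()
    {
        foreach (int value in new ResourceContainer())
        {
            Console.WriteLine("Value = {0}", value);
            if (value == 1)
                break;  // leaving early still runs the finally block
        }
    }
}
```

Dispose() runs once even though the loop exits early, because the call sits in the finally block of the expansion.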

4. Design Guidelines

You should use indexers only in situations where the abstraction makes sense. This usually depends on whether the object is a container for some other object.

VB .NET views what C# terms an indexer as a default property, and a class can have more than one indexed property in a VB .NET program. Since C# views an indexer as an indication that an object is composed of the indexed type, the VB .NET view doesn’t map into the C# perspective (how can an object be an array of two different types?). C# therefore allows direct access only to the default indexed property.

C# gives its indexer the name Item, which is fine from the C# perspective, because the name is never used. Languages such as VB .NET, however, do see the name, and it may therefore be helpful to set the name to something other than Item. You can do this by placing the IndexerNameAttribute on the indexer. (You can find this attribute in the System.Runtime.CompilerServices namespace.)
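A minimal sketch of renaming an indexer follows. The Roster class and the name Member are hypothetical; only IndexerNameAttribute and its namespace come from the text.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Roster
{
    string[] names = { "Ann", "Bob" };

    // Other languages see this indexed property as Member instead of Item.
    [IndexerName("Member")]
    public string this[int index]
    {
        get { return names[index]; }
        set { names[index] = value; }
    }
}

class IndexerNameDemo
{
    public static void Main()
    {
        Roster roster = new Roster();
        // C# code still uses the indexer syntax; the name is never written.
        Console.WriteLine(roster[1]);
    }
}
```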

Finally, you can use iterators (covered next) to implement enumerable types. Iterators offload the tedious work of enumerator implementation to the compiler, reducing the chance of bugs and making types easier to code.
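As a brief preview, and only as a sketch, an enumerable type written with an iterator might look like this. SmallIntList is a hypothetical class with fixed contents; the compiler generates the enumerator class from the yield return statement.

```csharp
using System;
using System.Collections;

public class SmallIntList : IEnumerable
{
    int[] values = { 1, 55, 43 };

    // The compiler builds the MoveNext()/Current machinery from this body.
    public IEnumerator GetEnumerator()
    {
        for (int index = 0; index < values.Length; index++)
        {
            yield return values[index];
        }
    }
}

class IteratorDemo
{
    public static void Main()
    {
        foreach (int value in new SmallIntList())
        {
            Console.WriteLine("Value = {0}", value);
        }
    }
}
```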

Source: Gunnerson Eric, Wienholt Nick (2005), A Programmer’s Introduction to C# 2.0, Apress; 3rd edition.
