The information in the preceding chapters is sufficient for writing objects that will function in the .NET runtime, but those objects won’t feel like they were written to operate well in the .NET Framework. This chapter details how to make user-defined objects operate more like the objects in the .NET runtime and the .NET Framework.
1. Things All Objects Will Do
Overriding the ToString() function from the object class gives a nice representation of the values in an object. If this isn’t done, object.ToString() will merely return the name of the class.
The Equals() function on object is called by the .NET Framework classes to determine whether two objects are equal.
A class may also overload operator==() and operator!=(), which allows the user to use the built-in operators with instances of the object, rather than calling Equals().
ToString()
Here’s an example of what happens by default:
using System;
public class Employee
{
public Employee(int id, string name)
{
this.id = id;
this.name = name;
}
int id; string name;
}
class Test {
public static void Main()
{
Employee herb = new Employee(555, "Herb");
Console.WriteLine("Employee: {0}", herb);
}
}
The preceding code results in the following:
Employee: Employee
By overriding ToString(), the representation can be much more useful:
using System;
public class Employee
{
public Employee(int id, string name)
{
this.id = id;
this.name = name;
}
public override string ToString()
{
return(String.Format("{0}({1})", name, id));
}
int id; string name;
}
class Test {
public static void Main()
{
Employee herb = new Employee(555, "Herb");
Console.WriteLine("Employee: {0}", herb);
}
}
This gives you a far better result:
Employee: Herb(555)
When Console.WriteLine() needs to convert an object to a string representation, it will call the ToString() virtual function, which will forward to an object’s specific implementation. If you desire more control over formatting, such as implementing a floating-point class with different formats, you can implement the IFormattable interface. Chapter 34 covers IFormattable.
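As a rough illustration of the extra control IFormattable offers, the following sketch gives Employee two format codes. The codes "N" and "I" are invented for this example and aren’t standard .NET format specifiers; the interface method ToString(string, IFormatProvider) is the only part that IFormattable itself prescribes.

```csharp
using System;

public class Employee : IFormattable
{
    int id;
    string name;

    public Employee(int id, string name)
    {
        this.id = id;
        this.name = name;
    }

    // Default representation, used when no format string is supplied.
    public override string ToString()
    {
        return ToString(null, null);
    }

    // "N" and "I" are hypothetical format codes chosen for this sketch.
    public string ToString(string format, IFormatProvider formatProvider)
    {
        switch (format)
        {
            case "N": return name;
            case "I": return id.ToString(formatProvider);
            default: return String.Format(formatProvider, "{0}({1})", name, id);
        }
    }
}

class Test
{
    public static void Main()
    {
        Employee herb = new Employee(555, "Herb");
        // {0} uses the default format; {0:N} and {0:I} pass a format code.
        Console.WriteLine("{0} / {0:N} / {0:I}", herb);
    }
}
```

Because the formatting methods check for IFormattable before falling back to ToString(), the format code after the colon reaches the class without any extra work by the caller.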
2. Equals()
Equals() determines whether two objects have the same contents. This function is called by the collection classes (such as Array or Hashtable) to determine whether two objects are equal. Extending the employee example, you can write the following:
using System;
public class Employee
{
public Employee(int id, string name)
{
this.id = id;
this.name = name;
}
public override string ToString()
{
return(name + "(" + id + ")");
}
public override bool Equals(object obj)
{
return(this == (Employee) obj);
}
public override int GetHashCode()
{
return(id.GetHashCode() ^ name.GetHashCode());
}
public static bool operator==(Employee emp1, Employee emp2)
{
if (emp1.id != emp2.id)
return(false);
if (emp1.name != emp2.name)
return(false);
return(true);
}
public static bool operator!=(Employee emp1, Employee emp2)
{
return(!(emp1 == emp2));
}
int id;
string name;
}
class Test {
public static void Main()
{
Employee herb = new Employee(555, "Herb");
Employee herbClone = new Employee(555, "Herb");
Console.WriteLine("Equal: {0}", herb.Equals(herbClone));
Console.WriteLine("Equal: {0}", herb == herbClone);
}
}
This produces the following output:
Equal: True
Equal: True
In this case, operator==() and operator!=() have also been overloaded, which allows the operator syntax to be used in the last line of Main(). These operators must be overloaded in pairs; they can’t be overloaded separately.
Note that in this example, the implementation of Equals() forwards to the operator implementation. For this example, either one could forward to the other, but for structs, forwarding the other way costs an extra boxing operation. Because Equals() takes an object parameter, a value type must always be boxed to call Equals(), but boxing isn’t required to call the strongly typed comparison operators. If the operators forwarded to Equals(), they’d always have to box.
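The forwarding direction can be sketched for a value type as follows. The Point struct is a hypothetical example that doesn’t appear elsewhere in this chapter; it assumes two int fields and simply shows Equals() unboxing its argument once and handing off to the strongly typed operator.

```csharp
using System;

public struct Point
{
    int x;
    int y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Strongly typed comparison: neither operand is boxed.
    public static bool operator ==(Point p1, Point p2)
    {
        return p1.x == p2.x && p1.y == p2.y;
    }

    public static bool operator !=(Point p1, Point p2)
    {
        return !(p1 == p2);
    }

    // obj arrives boxed; unbox it once and forward to the operator.
    public override bool Equals(object obj)
    {
        if (!(obj is Point))
            return false;
        return this == (Point) obj;
    }

    public override int GetHashCode()
    {
        return x ^ y;
    }
}

class Test
{
    public static void Main()
    {
        Point a = new Point(3, 4);
        Point b = new Point(3, 4);
        Console.WriteLine("Operator: {0}", a == b);     // compares a and b unboxed
        Console.WriteLine("Equals: {0}", a.Equals(b));  // b is boxed for the call
    }
}
```

Both lines print True; the difference is only in the cost of getting there.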
3. Hashes and GetHashCode()
The .NET Framework includes the Hashtable class, which is useful for doing fast lookups of objects by a key. A hash table works by using a hash function, which produces an integer “key” for a specific instance of a class. This key is a condensed version of the contents of the instance. While instances can have the same hash code, it’s fairly unlikely to happen.
A hash table uses this key as a way of drastically limiting the number of objects that must be searched to find a specific object in a collection of objects. It does this by first getting the hash value of the object, which will eliminate all objects with a different hash code, leaving only those with the same hash code to be searched. Since the number of instances with that hash code is small, searches can be much quicker.
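The narrowing step can be sketched with a toy table. TinyTable is a made-up class for illustration only and isn’t how Hashtable is implemented; it assumes a fixed number of buckets and does no duplicate checking or resizing.

```csharp
using System;
using System.Collections;

// A toy table for illustration only; not how Hashtable is implemented.
public class TinyTable
{
    ArrayList[] buckets = new ArrayList[8];

    int BucketFor(object key)
    {
        // Mask off the sign bit so the index is never negative.
        return (key.GetHashCode() & 0x7FFFFFFF) % buckets.Length;
    }

    public void Add(object key, object value)
    {
        int index = BucketFor(key);
        if (buckets[index] == null)
            buckets[index] = new ArrayList();
        buckets[index].Add(new DictionaryEntry(key, value));
    }

    public object Find(object key)
    {
        ArrayList bucket = buckets[BucketFor(key)];
        if (bucket == null)
            return null;
        // Only the entries in this one bucket are compared with Equals().
        foreach (DictionaryEntry entry in bucket)
        {
            if (entry.Key.Equals(key))
                return entry.Value;
        }
        return null;
    }
}

class Test
{
    public static void Main()
    {
        TinyTable table = new TinyTable();
        table.Add("herb", "414 Evergreen Terrace");
        table.Add("george", "2335 Elm Street");
        Console.WriteLine(table.Find("george"));
        Console.WriteLine(table.Find("frank") == null);
    }
}
```

The hash code picks the bucket, and Equals() settles which entry in that bucket, if any, is the one being looked for.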
That’s the basic idea—for a more detailed explanation, please refer to a good data structures and algorithms book. Hashes are a tremendously useful construct. The Hashtable class stores objects, so it’s easy to use them to store any type.
The GetHashCode() function should be overridden in user-written classes because the values returned by GetHashCode() are required to be related to the value returned by Equals(). Two objects that are the same by Equals() must always return the same hash code.
The default implementation of GetHashCode() doesn’t work this way, and therefore it must be overridden to work correctly. If not overridden, the hash code will be identical only for the same instance of an object, and a search for an object that’s equal but not the same instance will fail.
If there’s a unique field in an object, it’s probably a good choice for the hash code:
using System;
using System.Collections;
public class Employee
{
public Employee(int id, string name)
{
this.id = id;
this.name = name;
}
public override string ToString()
{
return(String.Format("{0}({1})", name, id));
}
public override bool Equals(object obj)
{
Employee emp2 = (Employee) obj;
if (id != emp2.id) return(false);
if (name != emp2.name) return(false);
return(true);
}
public static bool operator==(Employee emp1, Employee emp2)
{
return(emp1.Equals(emp2));
}
public static bool operator!=(Employee emp1, Employee emp2)
{
return(!emp1.Equals(emp2));
}
public override int GetHashCode()
{
return(id);
}
int id;
string name;
}
class Test {
public static void Main()
{
Employee herb = new Employee(555, "Herb");
Employee george = new Employee(123, "George");
Employee frank = new Employee(111, "Frank");
Hashtable employees = new Hashtable();
employees.Add(herb, "414 Evergreen Terrace");
employees.Add(george, "2335 Elm Street");
employees.Add(frank, "18 Pine Bluff Road");
Employee herbClone = new Employee(555, "Herb");
string address = (string) employees[herbClone];
Console.WriteLine("{0} lives at {1}", herbClone, address);
}
}
In the Employee class, the id member is unique, so it’s used for the hash code. In the Main() function, several employees are created, and they’re then used as the key values to store the addresses of the employees.
If there isn’t a unique value, the hash code should be created out of the values contained in the object. If the Employee class didn’t have a unique identifier but did have fields for name and address, the hash function could use those. The following shows a hash function that could be used:
using System;
using System.Collections;
public class Employee {
public Employee(string name, string address)
{
this.name = name;
this.address = address;
}
public override int GetHashCode()
{
return(name.GetHashCode() ^ address.GetHashCode());
}
string name; string address;
}
This implementation of GetHashCode() simply XORs the hash codes of the elements together and returns the result.
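One property of XOR worth knowing is that it’s symmetric, so two employees whose name and address values are swapped get the same hash code. That’s still a legal hash function, just a weaker one. The sketch below shows an order-sensitive alternative; the constants 17 and 31 are a conventional but arbitrary choice, not something the .NET Framework requires, and the Equals() shown is an assumed companion for the two-field class.

```csharp
using System;

public class Employee
{
    string name;
    string address;

    public Employee(string name, string address)
    {
        this.name = name;
        this.address = address;
    }

    public override bool Equals(object obj)
    {
        Employee other = obj as Employee;
        if (other == null)
            return false;
        return name == other.name && address == other.address;
    }

    public override int GetHashCode()
    {
        // unchecked: let the arithmetic wrap instead of throwing on overflow.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + name.GetHashCode();
            hash = hash * 31 + address.GetHashCode();
            return hash;
        }
    }
}

class Test
{
    public static void Main()
    {
        Employee a = new Employee("Herb", "414 Evergreen Terrace");
        Employee b = new Employee("Herb", "414 Evergreen Terrace");
        Console.WriteLine("Equal: {0}", a.Equals(b));
        Console.WriteLine("Same hash: {0}", a.GetHashCode() == b.GetHashCode());
    }
}
```

Whichever combining scheme is used, the one hard requirement is the one this program checks: objects that are equal must produce the same hash code.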
4. Design Guidelines
Any class that overrides Equals() should also override GetHashCode(). In fact, the C# compiler will issue a warning if a class overrides one without the other. The warning exists to head off strange and difficult-to-debug behavior when the class is used in a Hashtable.
The Hashtable class depends on the fact that all instances that are equal have the same hash value. The default implementation of GetHashCode(), however, returns a value that’s unique on a per-instance basis. If this implementation isn’t overridden, it’s easy to put objects in a hash table but not be able to retrieve them.
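The failure is easy to reproduce. To keep the result deterministic, this sketch doesn’t rely on the runtime’s default hash codes; instead the hypothetical BadKey class simulates per-instance values with a counter, which breaks the same rule in the same way.

```csharp
using System;
using System.Collections;

public class BadKey
{
    static int nextHash = 0;
    int id;
    int hash;

    public BadKey(int id)
    {
        this.id = id;
        // Simulates the default behavior: each instance gets its own hash code.
        this.hash = nextHash++;
    }

    public override bool Equals(object obj)
    {
        BadKey other = obj as BadKey;
        return other != null && id == other.id;
    }

    // Breaks the rule: equal objects return different hash codes.
    public override int GetHashCode()
    {
        return hash;
    }
}

class Test
{
    public static void Main()
    {
        Hashtable table = new Hashtable();
        table.Add(new BadKey(555), "414 Evergreen Terrace");

        // Equal by Equals(), but a different instance with a different hash code.
        BadKey clone = new BadKey(555);
        Console.WriteLine("Found: {0}", table.ContainsKey(clone));
    }
}
```

The program prints Found: False even though the two keys are equal, because the table never gets as far as calling Equals() on the stored key.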
4.1. Value Type Guidelines
The System.ValueType class contains a version of Equals() that works for all value types. However, this version falls back to reflection when bitwise equality isn’t valid for the type (which will be the case if the value type has references to reference types), and it’s therefore slow. For this reason, it’s recommended that an implementation of Equals() be written for all value types.
4.2. Reference Type Guidelines
For most reference types, users will expect that == will mean reference comparison, and in this case == shouldn’t be overloaded, even if the object implements Equals().
If the type has value semantics (something like a String or a BigNum), operator==() should be overloaded and Equals() should be overridden. If a class overloads + or -, that’s a pretty good indication it should also overload == and override Equals().
A subtler area of concern is how Equals() operates when inheritance hierarchies come into play. Consider the following example:
using System;
class Base
{
int val;
public Base(int val)
{
this.val = val;
}
public override bool Equals(object o2)
{
Base b2 = (Base) o2;
return(val == b2.val);
}
public override int GetHashCode()
{
return(val.GetHashCode());
}
}
class Derived: Base {
int val2;
public Derived(int val, int val2) : base(val)
{
this.val2 = val2;
}
}
class Test {
public static void Main()
{
Base b1 = new Base(12);
Base b2 = new Base(12);
Derived d1 = new Derived(12, 15);
Derived d2 = new Derived(12, 25);
Console.WriteLine("b1 equals b2: {0}", b1.Equals(b2));
Console.WriteLine("d1 equals d2: {0}", d1.Equals(d2));
Console.WriteLine("b1 equals d1: {0}", b1.Equals(d1));
}
}
This code generates the following results:
b1 equals b2: True
d1 equals d2: True
b1 equals d1: True
The Base class implements Equals(), and it works as expected for objects of type Base. Classes derived directly from object (or from classes that don’t override Equals()) will work fine since they will use object.Equals(), which compares references.
But any class derived from Base will inherit the implementation of Equals() from Base and will therefore generate the wrong results. Because of this, any class that derives from a class that overrides Equals() should also override Equals(). You can guard against this situation by adding a check to make sure the object is the expected type:
public override bool Equals(object o2)
{
if (o2.GetType() != typeof(Base) || GetType() != typeof(Base))
return(false);
Base b2 = (Base) o2;
return(val == b2.val);
}
This gives the following output:
b1 equals b2: True
d1 equals d2: False
b1 equals d1: False
This is correct and prevents a derived class from accidentally using the base class Equals(). It’s now obvious that Derived needs its own version of Equals(). The code for Derived.Equals() will use Base.Equals() to check whether the base objects are equal and then compare the derived fields. The code for Derived.Equals() looks like this:
public override bool Equals(object o2)
{
if (o2.GetType() != typeof(Derived) || GetType() != typeof(Derived)) return(false);
Derived d2 = (Derived) o2;
return(base.Equals(o2) && val2 == d2.val2);
}
Adding this code generates the following output:
b1 equals b2: True
d1 equals d2: False
b1 equals d1: False
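That result is misleading. Here d1 and d2 happen to differ in val2, so False is the right answer, but Derived.Equals() would return False even for two Derived objects with identical fields. What’s causing the problem is that the type check in the base class will always return false when called from Derived.Equals(), because GetType() returns Derived rather than Base.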
Since it doesn’t work to check for one exact type, the next best thing is to check that both objects have the same run-time type. The code for Base.Equals() becomes the following:
public override bool Equals(object o2)
{
if (o2 == null || GetType() != o2.GetType()) return false;
Base b2 = (Base) o2;
return(val == b2.val);
}
And the code for Derived.Equals() uses the same check and also calls base.Equals(). This code also checks for null to prevent an exception when comparing to a null reference.
The following list summarizes this discussion:
- Reference types should make sure both objects have the same type in Equals().
- If the type is derived from a type that overrides Equals(), base.Equals() should be called to check whether the base portion of the type is equal.
- If the type is derived from a type that doesn’t override Equals(), base.Equals() shouldn’t be called since it’d be object.Equals(), which implements reference comparison.
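Putting these guidelines together, the finished pair of classes might look like the following sketch. Base.Equals() is the version shown earlier; Derived.Equals() and Derived.GetHashCode() are one plausible way to complete the example, and the extra object d3 is added here only to show two Derived instances that really are equal.

```csharp
using System;

class Base
{
    int val;

    public Base(int val)
    {
        this.val = val;
    }

    public override bool Equals(object o2)
    {
        // Reject null and any object whose run-time type differs from this one.
        if (o2 == null || GetType() != o2.GetType())
            return false;
        Base b2 = (Base) o2;
        return val == b2.val;
    }

    public override int GetHashCode()
    {
        return val.GetHashCode();
    }
}

class Derived : Base
{
    int val2;

    public Derived(int val, int val2) : base(val)
    {
        this.val2 = val2;
    }

    public override bool Equals(object o2)
    {
        if (o2 == null || GetType() != o2.GetType())
            return false;
        Derived d2 = (Derived) o2;
        // Base overrides Equals(), so let it compare its own fields.
        return base.Equals(o2) && val2 == d2.val2;
    }

    public override int GetHashCode()
    {
        return base.GetHashCode() ^ val2.GetHashCode();
    }
}

class Test
{
    public static void Main()
    {
        Base b1 = new Base(12);
        Derived d1 = new Derived(12, 15);
        Derived d2 = new Derived(12, 25);
        Derived d3 = new Derived(12, 15);
        Console.WriteLine("d1 equals d2: {0}", d1.Equals(d2));
        Console.WriteLine("d1 equals d3: {0}", d1.Equals(d3));
        Console.WriteLine("b1 equals d1: {0}", b1.Equals(d1));
    }
}
```

This prints False, True, and False in that order: d1 and d2 differ in val2, d1 and d3 match in both fields, and a Base is never equal to a Derived.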
Source: Gunnerson Eric, Wienholt Nick (2005), A Programmer’s Introduction to C# 2.0, Apress; 3rd edition.