Creating hash codes

In order for a hashtable to perform well, there should be few collisions when inserting objects into the hashtable. A good way to accomplish this is to have an implementation of GetHashCode that returns uniformly-distributed hash codes (especially for non-uniformly-distributed input). Bob Jenkins wrote a Dr Dobb’s article that covers the basics of a good hash function, as well as providing his own implementation that is a “generally good, fast hash function”.

His code is in the public domain; since it’s useful for implementing Object.GetHashCode for custom equatable types, I ported it to C#. It’s available as HashCodeUtility.cs.

The main method, CombineHashCodes, should be used to combine a series of integers into a good hash code. These integers could either be the values of your type’s fields (e.g., X and Y coordinates), or the results of calling GetHashCode on the non-integral fields that comprise the type. A sample use would be:

struct Point
{
    public override bool Equals(object obj) { /* ... */ }

    public override int GetHashCode()
    {
        return HashCodeUtility.CombineHashCodes(m_x, m_y);
    }

    readonly int m_x;
    readonly int m_y;
}

CombineHashCodes takes a “params int[]” array to allow any number of ints to be hashed. Because GetHashCode should run efficiently, there are also overloads for the common cases of one to four ints, to avoid loops and the allocation of unnecessary arrays.

Posted by Bradley Grainger on February 09, 2010