Persistent Hash Codes

What does "hello".GetHashCode() return?

The answer is: it depends.

Under the Microsoft .NET 4 CLR, that method returns 0xFFF561E1 in a 32-bit application, but 0xEC7BF82A in a 64-bit build. Other runtimes, such as Mono and the Compact Framework, might return other values.

As per the documentation, “The behavior of GetHashCode is dependent on its implementation, which might change from one version of the common language runtime to another. … The value returned by GetHashCode is platform-dependent. It differs on the 32-bit and 64-bit versions of the .NET Framework.”

In certain scenarios (e.g., saving a hash code to a file, or creating a hash code on a 32-bit client and using it on a 64-bit server), it’s useful to have an algorithm that will always generate the same hash code for the same input.

Based on Paul Hsieh’s SuperFastHash, Bob Jenkins’ hash function, and Thomas Wang’s Integer Hash Function, we developed HashCodeUtility.GetPersistentHashCode. (The C# methods are straightforward ports of the corresponding C functions.)

These methods perform a thorough mixing of their input (so the return value works well as a key in a hashtable), have good performance, and will always return the same result no matter what version of the CLR they’re running on.
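To make the idea of a persistent hash code concrete, here is a minimal sketch. It is not the code in HashCodeUtility.cs, and it does not use any of the three algorithms named above; it substitutes the much simpler 32-bit FNV-1a hash purely for illustration. The class name PersistentHashSketch and the method name GetStableHashCode are hypothetical. The point is only that every step (the UTF-8 encoding, the constants, the wrapping 32-bit arithmetic) is fixed by the code itself rather than by the runtime.

```csharp
using System;
using System.Text;

// Hypothetical illustration only; not the HashCodeUtility implementation.
public static class PersistentHashSketch
{
    // Standard 32-bit FNV-1a parameters.
    private const uint FnvOffsetBasis = 2166136261;
    private const uint FnvPrime = 16777619;

    // Returns the same value for the same string on any runtime or bitness,
    // because nothing here depends on String.GetHashCode.
    public static int GetStableHashCode(string value)
    {
        if (value == null)
            throw new ArgumentNullException("value");

        // Hash a fixed byte encoding of the text, not its in-memory layout.
        byte[] bytes = Encoding.UTF8.GetBytes(value);

        unchecked
        {
            uint hash = FnvOffsetBasis;
            foreach (byte b in bytes)
            {
                hash ^= b;        // FNV-1a: XOR the byte in first...
                hash *= FnvPrime; // ...then multiply, wrapping at 32 bits.
            }
            return (int) hash;
        }
    }
}
```

With this sketch, GetStableHashCode("a") is unchecked((int) 0xE40C292C) in both 32-bit and 64-bit processes, which matches the published FNV-1a test vector for that input. A value like this could be written to a file or sent from a client to a server and compared later.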

The source is available on GitHub: HashCodeUtility.cs, HashCodeUtilityTests.cs.

For information about the CombineHashCodes method in those files, see my earlier post, Creating hash codes.

Posted by Bradley Grainger on February 23, 2012