The Visitor Pattern allows new functionality to be added to a class hierarchy without modifying the hierarchy. It accomplishes this by having one (virtual) “accept” method that can call back many different visitor implementations.
The Visitor Pattern is a way of implementing double dispatch in languages (like C#) that don’t support it natively; as a result, many have argued (1, 2, 3) that the existence of this pattern is an indication of a missing language feature.
C# 4 addresses this limitation with the addition of the dynamic keyword, which allows method calls to be dispatched dynamically at runtime based on the runtime types of the objects involved. This can be used to implement the visitor pattern more simply.
Consider a typical pedagogical class hierarchy:
We implement the Visitor Pattern by defining an IAnimalVisitor interface, adding an abstract Accept(IAnimalVisitor) method to Animal, then implementing it on (usually only) the concrete derived types.
I’m going to assume that iterating through the collection of Animal objects is simple; in this example, they’re just stored in a List. The code to visit them (in the “classic” way) is as follows:
By using “dynamic” in C# 4, we can write the code as follows. At runtime, the
system will pick the best matching overload from the available methods and
call it. There’s no need for the object to provide a visitor interface so that
it can call us back; indeed, this lets us process class hierarchies (XNode
,
Expression
, etc.) that don’t implement the visitor pattern. (In this
example, Animal
is no longer required to expose the Accept(IAnimalVisitor)
method.)
Another advantage is that we can visit at a level in the hierarchy not supported by the visitor interface:
In order to see what the relative performance of these methods is like, I wrote three other visitors that use different strategies to accomplish the same goal: the “Reflection” visitor uses reflection and MethodInfo to invoke the correct method based on the type of object being visited; the “Type Checking” visitor uses hard-coded type tests to dispatch to the correct method; the “Delegate” visitor uses a hard-coded list of delegates to dispatch to the desired method. I also wrote a “Null” visitor, which does nothing but enumerate the list of animals (to establish a baseline).
In order to get nice big numbers, all timing was performed on a Pentium 4 3.4GHz computer with .NET 4 RC1. The number in the chart is the time (in milliseconds) that it took to invoke VisitAll one million times. (This data is fairly unscientific; I claim no statistical validity.)
For each method, I visited a different collection of objects: “Animals” was a mix of all four concrete types; “Mammals” had only types derived from Mammal, and “Lions” (obviously) contained only Lion objects.
Method/Collection | Animals | Mammals | Lions |
---|---|---|---|
Null | 326 | 338 | 388 |
Type Checking | 502 | 484 | 484 |
Classic | 661 | 625 | 546 |
Delegate | 1467 | 1509 | 1543 |
Dynamic 2 | 3526 | 1471 | 1362 |
Dynamic | 3589 | 1483 | 1290 |
Reflection | 30952 | 32180 | 31559 |
One obvious result is that using MethodInfo.Invoke is extremely slow; I had to truncate the bars in the chart to make them fit. (And this is after improvements that must have been made in .NET 4; the same code compiled for .NET 3.5 took 108s–more than three times longer–to run, while the other methods stayed the same.)
Using “dynamic” objects incurs a substantial startup cost: the first call to Visit took 106ms, while the second took 0.007ms (which is barely measurable). Reflection was quicker to initialize (6ms) but more costly(0.037ms) for each call. (These costs are not included in the table above.)
I used the three collections (Animals, Mammals, Lions) to test if the DLR optimises for situations where the runtime type is the same as the last time this method was called; the results seem to indicate that it does do this type of optimisation: the more homogenous the collection, the faster the method ran. In a typical visitor situation, the type of the visited object will be changing a lot, so be aware that there is a penalty for this type of input.
What happens if we add a new type (e.g., Eagle, derived from Bird, derived from Animal) to the hierarchy? Obviously all the visitors need to be updated to handle this new type and process it appropriately. The Classic, Dynamic, and Reflection visitors only need to have a new “Visit” method added; the Type Checking and Delegate visitors also need additional dispatching code. However, if the new Visit method isn’t added, the visitors will fail in the following ways (* means the exception is dependent on the implementation of the code; the listed exception is based on my sample code shown earlier):
The ‘dynamic’ keyword in C# 4 allows us to eliminate a lot of boilerplate code that was necessary because C# 3.0 couldn’t dispatch methods differently at runtime based on the types of their arguments, with an insignificant (for most applications) performance penalty (30ms on a collection of 10,000 objects). (Note that I haven’t profiled the impact on memory of using this technique; it may or may not be significant.)
A major drawback is the loss of compile-time type checking and early detection
of errors if the class hierarchy is modified. By using dynamic
, we also lose
refactoring support, “find usages”, and IntelliSense (if the dynamic object is
used for more than the simple cast shown above). To some degree, the stability
of the class hierarchy and the sophistication of accompanying test suites
mitigate these problems, but they are still significant problems to introduce.
Using dynamic
in the test suites themselves seems like a good way to improve
productivity, but the overhead of creating the additional tests necessary to
detect what would otherwise be compile-time errors may proscribe its use as a
replacement for the visitor pattern in production code.
The full source code file for the examples in this post is available at http://gist.github.com/324732.
Posted by Bradley Grainger on March 08, 2010