The "Improving .NET Application Performance and Scalability" talk at PDC
previewed the new lock contention mode in the Visual Studio 2010 profiler.
I haven’t seen a detailed list of capabilities, but starting at 57:15 in the
video you can see that it appears to identify synchronization objects that
cause threads to block, reports which locks have the highest contention, and
records the total amount of time spent waiting. This sounds like a great way
to identify why a concurrent app may be experiencing sub-linear speed-up.

There don’t appear to be any tools available right now that collect this
information, so I wrote a TimedMonitor class to help identify the locks in my
code that could be causing problems. To use it, change the objects you lock on
to instances of the TimedMonitor class and call Lock() in a using statement
instead of using the lock keyword. The Lock() method acquires the lock (using
Monitor.Enter) and returns an IDisposable struct whose Dispose method releases
it. (Returning a struct incurs no heap allocation, and the Dispose method gets
inlined by the JIT, making this approach efficient.)
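
To make the pattern concrete, here is a minimal sketch of the struct-returning Lock() idea with no timing at all. It is not the TimedMonitor source; the names SketchMonitor, Releaser and Counter are made up for illustration. Note that the method returns the concrete struct type: if it were declared to return IDisposable, the struct would be boxed and the allocation would come back.

```csharp
using System;
using System.Threading;

// Hypothetical sketch of the pattern; not the actual TimedMonitor code.
public sealed class SketchMonitor
{
    private readonly object _sync = new object();

    // Acquires the lock and returns a struct whose Dispose releases it.
    public Releaser Lock()
    {
        Monitor.Enter(_sync);
        return new Releaser(_sync);
    }

    // A value type, so returning it does not allocate on the heap.
    public struct Releaser : IDisposable
    {
        private readonly object _sync;

        internal Releaser(object sync) { _sync = sync; }

        public void Dispose() { Monitor.Exit(_sync); }
    }
}

public static class Counter
{
    private static readonly SketchMonitor s_lock = new SketchMonitor();
    private static int s_count;

    public static int Increment()
    {
        // Similar in effect to: lock (someObject) { return ++s_count; }
        using (s_lock.Lock())
        {
            return ++s_count;
        }
    }
}
```

The using statement guarantees that Dispose (and therefore Monitor.Exit) runs even if the body throws, which is the same guarantee the lock keyword gives.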

The full code for TimedMonitor is at the end of this post. It uses a
conditional compilation symbol named “ENABLE_TIMING”. If this symbol is not
defined, the timing code isn’t compiled into the assembly, the JIT compiler
completely inlines the Lock method, and the performance of TimedMonitor is on
par with the C# lock keyword. If it is defined, the timing code is executed
and TimedMonitor.Lock() is about 30x slower than lock on my Core 2 Duo (which
is fine in most cases, where this overhead is negligible compared to the work
done inside the lock).
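
As a rough illustration of how the timing can be compiled in or out, here is a sketch that measures only the time spent inside Monitor.Enter. It is an assumption about the general shape, not the real TimedMonitor: the class name TimedLockSketch, the TotalWait property and the use of Stopwatch.GetTimestamp are all my own choices for this example. Define ENABLE_TIMING (for example with the compiler's /define option) to include the timing code.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical sketch; the real TimedMonitor may be structured differently.
public sealed class TimedLockSketch
{
    private readonly object _sync = new object();

#if ENABLE_TIMING
    private long _waitTicks; // total Stopwatch ticks spent waiting in Monitor.Enter
#endif

    // Total time callers have spent waiting to acquire the lock.
    // Always zero when ENABLE_TIMING is not defined.
    public TimeSpan TotalWait
    {
        get
        {
#if ENABLE_TIMING
            lock (_sync)
            {
                return TimeSpan.FromSeconds((double)_waitTicks / Stopwatch.Frequency);
            }
#else
            return TimeSpan.Zero;
#endif
        }
    }

    public Releaser Lock()
    {
#if ENABLE_TIMING
        long start = Stopwatch.GetTimestamp();
#endif
        Monitor.Enter(_sync);
#if ENABLE_TIMING
        // We own the lock at this point, so a plain update is safe.
        _waitTicks += Stopwatch.GetTimestamp() - start;
#endif
        return new Releaser(_sync);
    }

    public struct Releaser : IDisposable
    {
        private readonly object _sync;

        internal Releaser(object sync) { _sync = sync; }

        public void Dispose() { Monitor.Exit(_sync); }
    }
}
```

With the symbol undefined, Lock() reduces to Monitor.Enter plus a struct return, which is what gives the JIT a chance to inline it.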

One limitation to note: this code does not account for the case where
thread A calls Monitor.Pulse, and thread B (which had called Monitor.Wait)
wakes up and immediately has to block waiting for thread A to release the lock
(so that thread B can re-acquire it before it returns from Monitor.Wait). The
time B spends blocked will not be reported by TimedMonitor.
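
The blind spot is easier to see with plain Monitor calls. The sketch below is only a demonstration of the scenario, with made-up names (PulseWaitDemo, Run); it does not use TimedMonitor. Thread A pulses and then keeps the lock for a while, so thread B cannot return from Monitor.Wait until A leaves the lock. The method returns how long B was held up after the pulse.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical demo of the Pulse/Wait re-acquisition delay.
public static class PulseWaitDemo
{
    public static TimeSpan Run(int holdMilliseconds)
    {
        object sync = new object();
        bool waiting = false, ready = false;
        long pulsedAt = 0, wokeAt = 0;

        var threadB = new Thread(() =>
        {
            lock (sync)
            {
                waiting = true;
                Monitor.Pulse(sync);      // tell A that B is about to wait
                while (!ready)
                {
                    // Releases sync, then must re-acquire it before returning.
                    Monitor.Wait(sync);
                }
                wokeAt = Stopwatch.GetTimestamp();
            }
        });
        threadB.Start();

        // The calling thread plays the role of thread A.
        lock (sync)
        {
            while (!waiting)
            {
                Monitor.Wait(sync);
            }
            ready = true;
            pulsedAt = Stopwatch.GetTimestamp();
            Monitor.Pulse(sync);              // B becomes eligible to run...
            Thread.Sleep(holdMilliseconds);   // ...but A still owns the lock
        }

        threadB.Join();
        return TimeSpan.FromSeconds((double)(wokeAt - pulsedAt) / Stopwatch.Frequency);
    }
}
```

Calling PulseWaitDemo.Run(200) should report a delay of roughly the hold time. A wrapper that only times Monitor.Enter never sees this delay, because B is blocked inside Monitor.Wait rather than in a call to Lock().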

Update: The original code P/Invoked to QueryPerformanceCounter; I’ve since updated it to just use a Stopwatch (which simplifies the code and adds little additional overhead).