Concurrency Pre-Con Highlights

I attended the “Concurrent, Multi-Core Programming on .NET and Windows” pre-conference talk at PDC. It was primarily a summary of the current state of concurrent programming on Windows (native and managed), with a preview of future advances in the platform.

David Callahan, a distinguished engineer at Microsoft, presented first, summarizing why the “free lunch” is over and what the Parallel Computing Platform team at Microsoft is doing to address that problem. Their goal is to let developers easily express latent parallelism in their code, so that when manycore systems are available it can be turned into actual parallelism, giving software a different kind of “free lunch”. One aspect of this long-term goal is to “eliminate multi-threading”; that is, to remove explicit thread management by providing better abstractions on which to build concurrent applications. One takeaway was that even though we currently only have dual- or quad-core systems, we should overdecompose problems now so that our code scales in the future.

Stephen Toub and Joe Duffy presented sessions on the mechanisms that exist for concurrency right now (with quite a number of live demos of threads, the thread pool, APM, BackgroundWorker, etc.), best practices for using those mechanisms (lock hierarchies, granularity, etc.), how you can write very low-level code (lock-free algorithms, memory barriers), and why you usually shouldn’t, instead relying on the types built into the Framework (with more coming in .NET 4.0).
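
As a quick refresher on what those existing mechanisms look like, here’s a minimal sketch (mine, not code from the talk) of queuing work to the ThreadPool and waiting for it to finish:

```csharp
using System;
using System.Threading;

class ThreadPoolDemo
{
    static void Main()
    {
        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            // Queue a work item to the process-wide thread pool; the callback
            // runs on a pool thread, not on the main thread.
            ThreadPool.QueueUserWorkItem(state =>
            {
                Console.WriteLine("Working on thread {0}", Thread.CurrentThread.ManagedThreadId);
                ((ManualResetEvent) state).Set();
            }, done);

            // Block until the background work signals completion.
            done.WaitOne();
        }
    }
}
```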

Some interesting points from these sessions were:

  • The object header that is used by Monitor.Enter to get a “thin lock” is also used by the CLR’s implementation of Object.GetHashCode to persist the hash code; don’t call GetHashCode on an object you’re using for locking, because once the hash code occupies the header the thin lock has to be inflated into a full sync block. (Note that this only applies to types that don’t override GetHashCode, but just use the default implementation.)
  • Interlocked operations are ultimately not scalable (due to inter-CPU communication); the more work that can be done in isolation by a thread, the better.
  • A good way to measure improvements to multi-threaded code is to count the number of CAS (compare-and-swap) instructions being executed, and try to drive that number down.
  • Avoid Thread.Get/SetData; although it sounds like it’s using thread-local storage, it actually takes a lock on a global table. (A [ThreadStatic] field is a cheaper alternative; see the sketch after this list.)
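
To make that last point concrete, here’s a minimal sketch (mine, not code from the talk) contrasting Thread.SetData/GetData with a [ThreadStatic] field, which gives each thread its own copy of a static field without going through a shared slot table:

```csharp
using System;
using System.Threading;

static class ThreadLocalDemo
{
    // Slower: every GetData/SetData call goes through a LocalDataStoreSlot,
    // which (per the talk) involves locking a global table.
    static readonly LocalDataStoreSlot Slot = Thread.AllocateDataSlot();

    // Cheaper alternative: each thread sees its own copy of this field.
    [ThreadStatic]
    static int counter;

    static void Main()
    {
        Thread.SetData(Slot, 42);
        Console.WriteLine(Thread.GetData(Slot)); // 42, read back via the slot

        counter++;                               // per-thread, no shared lock
        Console.WriteLine(counter);              // 1 on this thread
    }
}
```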

Lastly, they covered improvements that will be coming in .NET 4.0:

  • The Parallel Extensions are becoming part of the core framework; they’ll even be integrated into mscorlib.dll, etc., instead of being supplied in a separate assembly (as in the CTPs).
  • The TPL’s task scheduler will be baked into the standard ThreadPool, so there’s one master scheduler that controls all the background work for the process.
  • PLINQ will, of course, be supported and, like all these new features, available to all .NET languages; no new language extensions or compiler changes are required.
  • New data structures (ConcurrentQueue, ConcurrentDictionary, etc.) will be supplied in a new System.Collections.Concurrent namespace, along with new synchronization primitives (SpinLock, ManualResetEventSlim, SemaphoreSlim, etc.) and other helper classes (blocking collections, CountdownEvent, etc.).
  • Additionally, VC++ 10 is getting its own native concurrency libraries and task scheduler.

More details on all of these items will be available at various “Deep Dive” talks later in the week.
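
To give a flavor of the new surface area, here’s a rough sketch (mine, based on the CTP bits; the exact shape may well change before .NET 4.0 ships) of a PLINQ query and one of the new concurrent collections:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

class ParallelExtensionsSketch
{
    static void Main()
    {
        // PLINQ: AsParallel() opts an ordinary LINQ query into parallel execution.
        int[] numbers = Enumerable.Range(1, 1000).ToArray();
        long sumOfSquares = numbers.AsParallel()
            .Select(n => (long) n * n)
            .Sum();
        Console.WriteLine(sumOfSquares);

        // A thread-safe queue from System.Collections.Concurrent;
        // no explicit lock is needed around Enqueue/TryDequeue.
        ConcurrentQueue<int> queue = new ConcurrentQueue<int>();
        queue.Enqueue(1);
        int item;
        if (queue.TryDequeue(out item))
            Console.WriteLine(item);
    }
}
```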

Posted by Bradley Grainger on October 26, 2008