The Devil is in the Details

Recently, while investigating differences in the performance of the Windows and Mac code bases in Logos 4, I came across a section of the Mac code that was spending an inordinate amount of time in System.IO.BufferedStream.ReadByte().

Mono’s BufferedStream.ReadByte() method was implemented this way:

public override int ReadByte ()
{
    CheckObjectDisposedException ();
    byte[] b = new byte[1];
    if (Read(b, 0, 1) == 1) {
        return b[0];
    } else {
        return -1;
    }
}

Similarly, BufferedStream.WriteByte() was implemented as a trivial call to Write with a new 1-byte array. Interestingly enough, MSDN’s documentation for Stream.ReadByte() explicitly warns against this kind of implementation.

Notes to Implementers:
The default implementation on Stream creates a new single-byte array and then calls Read. While this is formally correct, it is inefficient. Any stream with an internal buffer should override this method and provide a much more efficient version that reads the buffer directly, avoiding the extra array allocation on every call.

Every call incurs a heap allocation and a 1-byte memory copy, and leaves a short-lived array behind for the garbage collector, which is clearly sub-optimal. I don’t wish to defame the code of the Mono team; they are a bunch of smart and helpful coders. I believe they were actually following the well-known coding discipline of “Code first, optimize later”. Really, who reads a BufferedStream one byte at a time anyway? Most of the time when you’re doing buffered I/O, you want to pull across as much data as you can, as quickly as possible. Except that in our case, we really do only want one byte.

In a completely non-scientific test case (Zed Shaw, please don’t kill me [1]), reading a 1.5 GB file one byte at a time with the existing BufferedStream.ReadByte took over 12 minutes on my Late 2008 13.3” MacBook. Using FileStream.ReadByte took about 45 seconds.
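For anyone who wants to repeat this kind of rough comparison, a minimal timing sketch might look like the following. The names ReadByteTiming, CountBytes and Time are hypothetical, invented here for illustration; this is not the harness used for the numbers above, and a single Stopwatch run is no more scientific than the original test.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper for a rough, single-run comparison of ReadByte costs.
public static class ReadByteTiming
{
    // Reads the stream one byte at a time and returns how many bytes were read.
    public static long CountBytes(Stream s)
    {
        long n = 0;
        while (s.ReadByte() != -1)
        {
            n++;
        }
        return n;
    }

    // Opens a stream via the supplied factory, drains it with CountBytes,
    // and returns the wall-clock time the whole operation took.
    public static TimeSpan Time(Func<Stream> open, out long bytes)
    {
        var sw = Stopwatch.StartNew();
        using (var s = open())
        {
            bytes = CountBytes(s);
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Calling Time once with a factory that returns new BufferedStream(File.OpenRead(path)) and once with one that returns File.OpenRead(path), for some test file path of your choosing, gives the two figures to compare.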

We submitted a patch to Mono that fixes ReadByte and WriteByte. Now our code runs comparably to the Windows version for this feature. The fix wasn’t anything that a first-year Computer Science student couldn’t have written. The moral: Just because the code you are using has been implemented for a while does not mean that it has been optimized.
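To illustrate the general idea of reading the buffer directly, here is a minimal sketch. It is not the patch we submitted to Mono: ByteBufferedReader is a hypothetical read-only wrapper made up for this example, and it ignores seeking, writing and thread safety. The point is only that ReadByte hands back a byte from an internal array and touches the underlying stream when that array runs dry, so no per-call allocation is needed.

```csharp
using System;
using System.IO;

// Hypothetical read-only wrapper; a sketch of the technique, not Mono's code.
public sealed class ByteBufferedReader : IDisposable
{
    private readonly Stream _inner;
    private readonly byte[] _buffer;
    private int _pos; // index of the next unread byte in _buffer
    private int _len; // number of valid bytes currently in _buffer

    public ByteBufferedReader(Stream inner, int bufferSize = 4096)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(nameof(bufferSize));
        _inner = inner;
        _buffer = new byte[bufferSize];
    }

    // Returns the next byte (0-255), or -1 at end of stream.
    public int ReadByte()
    {
        if (_pos == _len)
        {
            // Buffer exhausted: refill it with one bulk read.
            _len = _inner.Read(_buffer, 0, _buffer.Length);
            _pos = 0;
            if (_len <= 0)
            {
                _len = 0;
                return -1;
            }
        }
        // Common case: serve the byte straight from the buffer.
        return _buffer[_pos++];
    }

    public void Dispose()
    {
        _inner.Dispose();
    }
}
```

A WriteByte counterpart would follow the same pattern in reverse: store the byte in the buffer and flush to the underlying stream only when the buffer fills.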

[1] - Zed Shaw has a very telling blog entry about programmers and their sophomoric approach to statistical testing. While some of the language may offend some of our readers, the points are well taken. If you’re not offended by some coarse language, you can find the post here: http://zedshaw.com/essays/programmer_stats.html

Posted by Tom Philpot on January 12, 2010