
Cache Circuitry and Operation

The hard disk's cache is important because of the sheer difference in speed between the hard disk itself and the hard disk interface. Finding a piece of data on the hard disk involves random positioning, and incurs a penalty of milliseconds as the hard disk actuator is moved and the disk rotates around on the spindle. In today's PCs, a millisecond is an eternity. On a typical IDE/ATA hard disk, transferring a 4,096-byte block of data from the disk's internal cache is over 100 times faster than actually finding it and reading it from the platters. That is why hard disks have internal buffers. :^) If a seek isn't required (say, for reading a long string of consecutive sectors from the disk) the difference in speed isn't nearly as great, but the buffer is still much faster.
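To get a feel for where a ratio of that size comes from, here is a rough back-of-the-envelope sketch in Python. The seek time, spindle speed and transfer rates are illustrative assumptions, not measurements of any particular drive, and command overhead is ignored entirely.

```python
# Rough access-time comparison for one 4,096-byte block.
# Every drive figure below is an illustrative assumption.
SEEK_MS = 9.0                # assumed average seek time, in milliseconds
RPM = 7200                   # assumed spindle speed
INTERFACE_MB_PER_S = 66.0    # assumed interface burst rate (decimal MB/s)
MEDIA_MB_PER_S = 30.0        # assumed sustained rate off the platters
BLOCK_BYTES = 4096


def rotational_latency_ms(rpm):
    """Average rotational latency: half of one revolution, in ms."""
    return 0.5 * 60_000.0 / rpm


def transfer_ms(num_bytes, mb_per_s):
    """Time to move num_bytes at the given rate, in ms."""
    return num_bytes / (mb_per_s * 1_000_000) * 1000.0


# From the cache: only the interface transfer is needed.
cache_ms = transfer_ms(BLOCK_BYTES, INTERFACE_MB_PER_S)

# From the platters: seek, wait for the sector, then read it.
platter_ms = (SEEK_MS
              + rotational_latency_ms(RPM)
              + transfer_ms(BLOCK_BYTES, MEDIA_MB_PER_S))

ratio = platter_ms / cache_ms
print(f"cache: {cache_ms:.3f} ms, platters: {platter_ms:.3f} ms, "
      f"ratio: {ratio:.0f}x")
```

With these assumed figures the mechanical delays (seek plus rotational latency) dominate completely, so the ratio comes out well above 100. Different assumptions will shift the exact number but not the overall picture.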

Tip: While different in operation and technology from the system cache, the hard disk buffer is similar in concept and role. You may find the section discussing the system cache helpful if you want to understand more about caching in general.

Note: This section discusses caching in general, with an emphasis on data reads from the hard disk. Most of what is said here applies to writes as well, but caching writes raises additional issues of its own.

General concepts behind the operation of an internal hard disk cache.

Image © Quantum Corporation
Image used with permission.

The basic principle behind the operation of a simple cache is straightforward. Reading data from the hard disk is generally done in blocks of various sizes, not just one 512-byte sector at a time. The cache is broken into "segments", or pieces, each of which can contain one block of data. When a request is made for data from the hard disk, the cache circuitry is first queried to see if the data is present in any of the segments of the cache. If it is present, it is supplied to the logic board without any need to access the hard disk's platters. If the data is not in the cache, it is read from the hard disk, supplied to the controller, and then placed into the cache in case it is asked for again. Since the cache is limited in size, there are only so many pieces of data that can be held before the segments must be recycled. Typically the oldest piece of data is replaced with the newest one. This is called circular, first-in, first-out (FIFO) or wrap-around caching.
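The lookup-then-recycle behavior described above can be sketched as a toy model in Python. This is only an illustration of the FIFO idea, not how any real drive's firmware is written; the class name, the `read_from_platters` callback and the block addresses are all hypothetical.

```python
from collections import OrderedDict


class FifoSegmentCache:
    """Toy segmented read cache with first-in, first-out replacement."""

    def __init__(self, num_segments):
        self.num_segments = num_segments
        self.segments = OrderedDict()  # block address -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_addr, read_from_platters):
        # First check whether any segment already holds this block.
        if block_addr in self.segments:
            self.hits += 1
            # FIFO: a hit does not change the replacement order.
            return self.segments[block_addr]

        # Miss: go to the platters, then keep a copy for next time.
        self.misses += 1
        data = read_from_platters(block_addr)
        if len(self.segments) >= self.num_segments:
            self.segments.popitem(last=False)  # recycle the oldest segment
        self.segments[block_addr] = data
        return data


def fake_platter_read(block_addr):
    """Stand-in for the slow mechanical read (hypothetical)."""
    return f"data@{block_addr}"


cache = FifoSegmentCache(num_segments=2)
for addr in [10, 20, 10, 30, 10]:
    cache.read(addr, fake_platter_read)

print(cache.hits, cache.misses)     # 1 4
print(list(cache.segments.keys()))  # [30, 10]
```

Note that the final read of block 10 is a miss: block 10 was the oldest entry when block 30 arrived, so it was recycled even though it had just been used. That is the defining trait of FIFO replacement as opposed to schemes that track recent use.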

In an effort to improve performance, most hard disk manufacturers today have implemented enhancements to their cache management circuitry, particularly on high-end SCSI drives:

  • Adaptive Segmentation: Conventional caches are chopped into a number of equal-sized segments. Since requests can be made for data blocks of different sizes, this can lead to some of the cache's storage in some segments being "left over" and hence wasted (in exactly the same way that slack results in waste in the FAT file system). Many newer drives dynamically resize the segments based on how much space is required for each access, to ensure greater utilization. They can also change the number of segments. This is more complex to handle than fixed-size segments, and it can result in waste itself if the space isn't managed properly.
  • Pre-Fetch: The drive's cache logic, based on analyzing access and usage patterns of the drive, attempts to load into part of the cache data that has not been requested yet but that it anticipates will be requested soon. Usually, this means loading additional data beyond that which was just read from the disk, since it is statistically more likely to be requested next. When done correctly, this will improve performance to some degree.
  • User Control: High-end drives have implemented a set of commands that allows the user detailed control of the drive cache's operation. This includes letting the user enable or disable caching, set the size of segments, turn on or off adaptive segmentation and pre-fetch, and so on.
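As a rough illustration of the pre-fetch idea from the list above, the following Python sketch models a cache that loads a few extra sectors after every miss. It is a simplification under stated assumptions: the `read_ahead` parameter and the sector numbers are hypothetical, the cache is unbounded, and real drives decide what to pre-fetch with far more elaborate logic.

```python
class ReadAheadCache:
    """Toy read cache that pre-fetches the sectors following each miss."""

    def __init__(self, read_ahead):
        self.read_ahead = read_ahead  # hypothetical number of extra sectors
        self.cached = set()           # unbounded, to keep the sketch short
        self.platter_accesses = 0

    def read(self, sector):
        if sector in self.cached:
            return "hit"
        # Miss: one trip to the platters fetches the requested sector
        # plus the next `read_ahead` sectors that follow it.
        self.platter_accesses += 1
        self.cached.update(range(sector, sector + 1 + self.read_ahead))
        return "miss"


def count_platter_accesses(sectors, read_ahead):
    """Replay a list of sector requests and count trips to the platters."""
    cache = ReadAheadCache(read_ahead)
    for sector in sectors:
        cache.read(sector)
    return cache.platter_accesses


sequential = list(range(100, 116))                    # 16 consecutive sectors
scattered = [5, 900, 42, 7000, 311, 64, 1500, 2048]   # 8 unrelated sectors

print(count_platter_accesses(sequential, read_ahead=0))  # 16
print(count_platter_accesses(sequential, read_ahead=7))  # 2
print(count_platter_accesses(scattered, read_ahead=7))   # 8
```

Sequential reads benefit greatly because the sectors requested next are already waiting in the cache, while the scattered pattern gains nothing from the same setting.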

While the internal buffer obviously improves performance, its limitations should be fairly obvious as well. For starters, it helps very little if you are doing a lot of random accesses to data in different parts of the disk, because if the disk has not loaded a piece of data recently, it won't be in the cache. The buffer is also of little help if you are reading a large amount of data from the disk, because normally it is pretty small: if you are copying a 10 MiB file, for example, on a typical disk with a 512 KiB buffer, at most 5% of the file could be in the buffer; the rest must be read from the disk itself.
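The 5% figure follows directly from the two sizes in the example. A minimal Python check, with a helper name that is purely illustrative:

```python
KIB = 1024
MIB = 1024 * KIB


def max_fraction_in_buffer(file_bytes, buffer_bytes):
    """Upper bound on the share of a file the buffer can hold at once."""
    return min(1.0, buffer_bytes / file_bytes)


fraction = max_fraction_in_buffer(10 * MIB, 512 * KIB)
print(f"{fraction:.0%}")  # 5%
```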

Due to these limitations, the cache doesn't have as much of an impact on overall system performance as you might think. How much it helps depends on its size to some extent, but at least as much on the intelligence of its circuitry, just like the hard disk's logic overall. And just like the logic overall, it's hard to determine in many cases exactly what the cache logic on a given drive is like.


The PC Guide (http://www.PCGuide.com)
Site Version: 2.2.0 - Version Date: April 17, 2001
© Copyright 1997-2004 Charles M. Kozierok. All Rights Reserved.
