Alarms and Warnings
While the software that comes with RAID controllers will let you check the status of
the array at any time, there are situations where the administrator of the array needs to
know that something has happened, now. Finding out about important bad news
"the next time you run the management utility" just isn't good enough, and
anyone who manages RAID arrays is typically too busy to keep checking for problems all day
long--especially since they occur rarely anyway.
For this reason, controllers are usually programmed to generate alarms and warning
messages when certain problems occur with the controller or the array. On better
controllers these take the form of an audible alarm: loud beeping coming from the
controller card that will certainly make you sit up and take notice, believe me. :^)
Audible notification greatly increases the chances that trouble will be addressed
immediately. There are cases where this feature can be the difference between a hardware
problem being an inconvenience, and being a disaster.
The conditions that will trigger a warning vary from one controller to another, but the
most common ones include these:
- Array Failure: An array connected to the controller has failed due to a
hardware fault. This would occur due to failure of enough drives to compromise the array,
so one failure will do it for RAID 0, two for RAID 1, 3, 4 or 5, and so on. For a multiple
RAID level, the failure of a component "sub-array" will normally trigger an
alert even if the "super-array" continues working. So in RAID 0+1, one RAID 0
array may fail while its mirror carries on, but the failure of the RAID 0 sub-array will
generate an alert. This error condition is sometimes described as the array being
"offline" (which of course it would be).
- Degraded Mode Operation: An array connected to the controller is
running in a degraded state due to a hardware fault. This warning situation will occur in
a redundant RAID level where one or more drives have failed, but not enough to take the
array offline. The array will continue to run but performance
will be degraded until the fault is corrected and the failed drive is rebuilt. This
alert is arguably the most important one of all: it's hard not to notice an
outright array failure, but an array that is still up and running in a degraded
state is easy to miss.
- Rebuild Completion: If an automatic rebuild of a degraded array is in
progress, the controller may signal when the rebuild is complete. This signals that the
array is no longer in degraded mode, which can be important to know if, for example, a
drive failed and a hot spare was rebuilt in its stead in your
absence (you'll know that the failure occurred and that you now need to replace the failed
drive).
- Controller Hardware Fault: The controller has detected some sort of
internal fault or problem. For example, some controllers monitor their own temperature and
may issue a warning if acceptable limits are exceeded.
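To make the difference between the first two conditions concrete, here is a minimal sketch in Python of how an array's state follows from its RAID level and the number of failed drives. It is only an illustration: the names `ArrayState`, `classify_array` and `raid01_alerts` are made up for this example and do not come from any real controller's firmware or management software. The failure thresholds are the ones given in the list above, and the RAID 0+1 function assumes exactly two RAID 0 sub-arrays.

```python
from enum import Enum
from typing import List


class ArrayState(Enum):
    OPTIMAL = "optimal"
    DEGRADED = "degraded"
    OFFLINE = "offline"


# Drive failures needed to take an array offline, as given in the list above.
# (Hypothetical table for illustration; real controllers track this internally.)
FAILURES_TO_OFFLINE = {0: 1, 1: 2, 3: 2, 4: 2, 5: 2}


def classify_array(raid_level: int, failed_drives: int) -> ArrayState:
    """Return the state of a single-level array with this many failed drives."""
    if failed_drives < 0:
        raise ValueError("failed_drives cannot be negative")
    if raid_level not in FAILURES_TO_OFFLINE:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    if failed_drives >= FAILURES_TO_OFFLINE[raid_level]:
        return ArrayState.OFFLINE
    if failed_drives > 0:
        return ArrayState.DEGRADED
    return ArrayState.OPTIMAL


def raid01_alerts(failed_in_a: int, failed_in_b: int) -> List[str]:
    """Alerts for a RAID 0+1 array built from two RAID 0 sub-arrays, A and B.

    A failed sub-array raises an alert even while its mirror keeps the
    super-array running; the super-array is offline only when both fail.
    """
    states = [classify_array(0, failed_in_a), classify_array(0, failed_in_b)]
    alerts = []
    for name, state in zip("AB", states):
        if state is ArrayState.OFFLINE:
            alerts.append(f"sub-array {name} offline")
    if len(alerts) == 2:
        alerts.append("RAID 0+1 array offline")
    return alerts
```

With these definitions, `classify_array(5, 1)` reports a degraded array while `classify_array(0, 1)` reports an offline one, and `raid01_alerts(1, 0)` still produces an alert for sub-array A even though the mirrored array as a whole carries on.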
In addition to audible alerts, notification of important conditions can usually be sent
over a local area network to an administrator. Controllers that support remote management
will of course allow remote notification as well. Modern controllers also usually support
the SMART feature and will report SMART warnings generated by hard disks that include it.
Warning: Some RAID
controllers will let you disable the audible warning feature if you find it too
"annoying". Doing this is like pulling all the batteries out of your smoke
detectors so they won't "irritate you" while you're trying to sleep...
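The idea of sending one alert through several paths (audible alarm, network message and so on) can be sketched in a few lines of Python. This is a rough illustration under my own assumptions, not how any particular controller does it: `Alert`, `Channel` and `dispatch` are hypothetical names, and each notification path is modeled as a plain function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Alert:
    condition: str  # e.g. "array_failure", "degraded", "rebuild_complete"
    detail: str     # free-form description for the administrator


# A channel is any function that delivers an alert (hypothetical interface).
Channel = Callable[[Alert], None]


def dispatch(alert: Alert, channels: List[Channel]) -> int:
    """Send the alert to every channel and return how many accepted it."""
    delivered = 0
    for channel in channels:
        try:
            channel(alert)
            delivered += 1
        except Exception:
            # One broken path (say, the network is down) must not stop
            # the remaining channels from being tried.
            continue
    return delivered
```

The point of trying every channel independently is the same as the point of the warning above: if the network notification fails and the audible alarm has been switched off, nothing is left to tell you the array is in trouble.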