Serial Storage Wire » Reliability Archives


By Paul Vogt, Director of Product Marketing,
Adaptec

The flexibility of Serial Attached SCSI (SAS) allows it to serve as a universal connectivity standard for a wide range of business needs. In addition to replacing parallel SCSI, SAS can reduce the cost of servers and storage networks, enable tiered solutions that weren't possible before, and provide investment protection for SATA installations. While SAS performance and scalability attract most of the media attention, it may be the flexibility SAS brings to storage subsystems that creates the most value for your business.

SAS as a parallel SCSI replacement
SAS maintains support for the proven SCSI command set while offering much better performance, scalability, and availability. Stability and reliability have made parallel SCSI the standard in storage connectivity for twenty-five years. SAS offers compatibility with this command set while overcoming the physical limitations of parallel SCSI. With SAS, it is possible to take advantage of dual-port drives, redundant connections, failover support, and scalability up to 128 attached devices and over 16,000 addressable devices. The 3 Gb/s available on each port can be aggregated into higher-bandwidth wide-port connections.
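The wide-port aggregation mentioned above is simple arithmetic, and a minimal sketch may make it concrete. The function name and the lane counts are assumptions chosen for illustration. The result is the raw aggregate link rate, not the usable payload throughput, which is lower once encoding and protocol overhead are accounted for.

```python
# Illustrative arithmetic only: raw aggregate link rate of a SAS wide port.
# The lane counts below are assumptions; 3.0 Gb/s is the per-port rate
# quoted in the text.

LINK_RATE_GBPS = 3.0


def wide_port_bandwidth_gbps(lanes: int, link_rate_gbps: float = LINK_RATE_GBPS) -> float:
    """Return the raw aggregate rate of a wide port built from `lanes` links."""
    if lanes < 1:
        raise ValueError("a port needs at least one lane")
    return lanes * link_rate_gbps


if __name__ == "__main__":
    for lanes in (1, 2, 4):
        rate = wide_port_bandwidth_gbps(lanes)
        print(f"{lanes} lane(s): {rate:.1f} Gb/s raw")
```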

Author: Paul Griffith
Adaptec, Inc.

Why Choose SAS?
The most important factor in choosing a storage subsystem for enterprise system development is reliability. Maintaining user access to valuable data reduces total IT costs and increases user productivity. Serial Attached SCSI, a serial bus architecture, has emerged to deliver higher levels of reliability than parallel SCSI for mission-critical, transactional applications that must be online around the clock with no data loss.

To ensure continuous data access when a disk drive fails, enterprise computing has long used multiple initiators to give multiple hosts, multiple host bus adapters, or both access to the same disk drives. This approach doesn't work well in parallel bus configurations, because it produces single points of failure that can block access to a device and ultimately to critical data.

Serial bus architectures overcome this reliability shortcoming by supporting a network of dedicated point-to-point device connections, which eliminates the single point of failure. These connections also provide full bandwidth to each storage device, boosting system performance. By contrast, multi-drop parallel bus architectures share total bandwidth among devices.

SAS: Reliable from the Start

Author: Harry Mason
LSI Logic

An often overlooked aspect of system reliability is software quality. Software quality improves as the software is used in real-world applications and undergoes revisions, until failure rates become quite low. In short, accumulated software run time is a prerequisite for delivering highly reliable enterprise-class storage solutions.

Serial Attached SCSI (SAS) was architected from the beginning to leverage more than two decades of legacy SCSI software. As a result, the middleware applications that have supported SCSI over this period can be reused without sacrificing any enterprise-proven functionality.

Reliability and Availability

Author: Martin Czekalski
Maxtor Corporation

Reliability, availability, and scalability are the cornerstones of online transaction processing (OLTP) and of enterprise-class computing systems and storage subsystems. In this issue of Serial Storage Wire, we cover the first two as they apply to Serial Attached SCSI (SAS); scalability will be covered in the next issue.

Reliability
The terms reliability and availability are often misunderstood and used interchangeably, when in fact they refer to two different attributes of a system or its components. A component's reliability is typically specified as Mean Time Between Failures (MTBF), which represents the mean number of component hours accumulated between failure events when a large sample of components is run in operation. Another important aspect of reliability is the ability to detect errors when they occur and to take appropriate action so that they do not compromise the integrity of a system or its data (error detection and containment).
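The MTBF definition above can be sketched as a short calculation: total component hours accumulated across the sample, divided by the number of failures observed. The function name and the sample figures (1,000 units, 1,000 hours each, 2 failures) are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements of any real drive.

```python
# Minimal sketch of an MTBF estimate from a population test.
# All figures used in the example are hypothetical.


def mtbf_hours(units: int, hours_per_unit: float, failures: int) -> float:
    """Estimate MTBF as accumulated component hours per observed failure."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return units * hours_per_unit / failures


if __name__ == "__main__":
    # 1,000 units run for 1,000 hours each with 2 failures observed
    # gives 1,000,000 component hours / 2 = 500,000 hours.
    print(f"Estimated MTBF: {mtbf_hours(1000, 1000, 2):,.0f} hours")
```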

Availability
Availability, on the other hand, is the percentage of time a system or component is available for use. Calculating availability requires additional factors beyond MTBF, such as Mean Time to Repair (MTTR) and the overall system architecture. In many cases, individual system components can fail without affecting the availability of the system (e.g., a RAID system can usually remain available even if a single disk drive fails).
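One common way to combine these factors is the steady-state formula availability = MTBF / (MTBF + MTTR). The sketch below applies it, then shows how redundancy raises availability when any one of several components can keep the system running. The article does not give a formula, so this is an illustration under stated assumptions: the function names and input figures are hypothetical, and the redundancy calculation assumes component failures are independent, which real systems only approximate.

```python
# Minimal sketch: steady-state availability and the effect of redundancy.
# Input figures are hypothetical; failures are assumed to be independent.


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time a component is usable: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(component_availability: float, copies: int) -> float:
    """Availability when the system stays up if at least one copy is up."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    return 1.0 - (1.0 - component_availability) ** copies


if __name__ == "__main__":
    single = availability(mtbf_hours=999.0, mttr_hours=1.0)
    paired = redundant_availability(single, copies=2)
    print(f"Single component: {single:.4%}")
    print(f"Redundant pair:   {paired:.6%}")
```

The second function reflects the RAID remark above: the system is unavailable only when every redundant copy is down at the same time.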