Questions from SAS 201 Webinar Answered
In an effort to provide ongoing educational content to the industry, the SCSI Trade Association (STA) tackled the basics of Serial Attached SCSI (SAS) in a webinar titled “SAS 201: An Introduction to Enterprise Features,” now available on the STA YouTube channel.
Immediately following their presentations, our experts Rick Kutcipal of Broadcom and Tim Symons of Microchip Technology held a Q&A session. In this blog, we’ve captured the questions asked and answers given to provide you with insight on the recent evolutions in SAS enterprise features including 24G SAS and what they mean to system designers.
Q1. Do you think 24G SAS will only be applicable to SSDs, and HDDs will remain at 12Gb/s?
A1. Rick: At this time, hard disk drives (HDDs) can’t take advantage of the bandwidth that’s available in 24G SAS. Right now, the technology itself is focused on the backbone and then on solid-state drive (SSD) connectivity. Currently, that’s the way we see it shaping up.
Tim: If we go back about eight years or so, someone asked me the same type of question when we went from 3Gb/s SAS to 6Gb/s SAS, and the answer was “the platters don’t get data off that quickly.” Well, look where we are now.
Q2. How does SAS-4 deal with the higher value AC block capacitor specified in U.3?
A2. Tim: This is really getting into the details. U.3 allows you to interconnect SAS devices and PCI devices in a similar backplane environment. All SAS devices are AC coupled so you’ve got a capacitor that sits between the transmitter and receiver. The value is different between different technologies. However, what we did for SAS, and it’s common for a lot of receivers, we changed the blocking AC capacitor values – de-rated them. This does not have a very significant effect on the signal. Consequently, we’re able to accommodate multiple technologies changing AC capacitor value without having significant change in the error correction. So, if you have a look at a U.3 specification, you’ll see a slightly different capacitor value than is specified in the SAS environment. However, that has been endorsed by SAS and does not have any impact on it.
Q3. To achieve 18″ trace on backplane + 6m cables, what budget was assigned to the host adapter, and the media?
A3. Tim: In SAS, we don’t assign particular budgets to particular parts of the subsystem. In the back of the SAS specification you’ll find an example in the Appendix. They call out, “What if we had 4.0 dB loss in the disk drive and 2.5-3.0 dB loss on the host before we got to the cable?” But those are just examples; they’re not requirements.
Essentially, the channel end-to-end from transmitter to receiver is a 30 dB loss channel and how you use that is really up to you. Sometimes, when the disk drive is very close to your host, you may actually choose to use that budget in perhaps a lower cost material, and you’ll have a 30 dB loss channel in a 12-inch connection. SAS is very flexible in that nature, so we don’t assign specific budget to any specific portion of the channel.
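As a rough illustration of the budgeting arithmetic Tim describes, the sketch below subtracts segment losses from the 30 dB end-to-end figure. The drive and host values are the Appendix example numbers quoted above (using the upper 3.0 dB host figure); the function name and the dictionary layout are hypothetical and are not taken from the SAS specification.

```python
# Illustrative only: the 30 dB total and the example segment losses come from
# the answer above; names and structure are assumptions for this sketch.
TOTAL_CHANNEL_LOSS_DB = 30.0

def remaining_budget_db(segment_losses_db, total_db=TOTAL_CHANNEL_LOSS_DB):
    """Return the loss budget left after the listed segments are accounted for."""
    used = sum(segment_losses_db.values())
    if used > total_db:
        raise ValueError(f"segments use {used} dB, more than the {total_db} dB channel")
    return total_db - used

# Example split from the specification's Appendix, as quoted in the answer.
example_segments = {"disk_drive": 4.0, "host": 3.0}
left_for_cable_and_backplane = remaining_budget_db(example_segments)
print(left_for_cable_and_backplane)  # 23.0
```

Because SAS does not mandate any particular split, the same function could just as well be called with a single short, lossy segment that consumes most of the 30 dB on its own.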
Q4. How do you see 24G SAS and x4 NVMe Gen 5 drives co-existing?
A4. Tim: Speaking from the 24G side of an array, disk drives themselves can have multiple links on them. The x4 nature of an NVMe drive just gives you more bandwidth on a Gen5 system. Gen5 is 32 Gbps per lane on NVMe, whereas in SAS, we’re looking at 24 Gbps per link. It’s quite reasonable technically to add x4 links to that. So, the technology and the bandwidth are pretty similar between the two.
Rick: The one thing that Tim just went through was some of the investments and improvements that we have made in 24G SAS to account for these higher data rates, so there will be a difference in the reliability of those particular devices. In general, they will coexist, probably targeting slightly different market spaces.
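To put the nominal figures Tim quotes side by side, here is a small sketch that multiplies the per-lane rate by the lane count. It uses only the nominal 32 Gbps and 24 Gbps numbers from the answer and deliberately ignores encoding and protocol overhead, so the results are raw line rates rather than usable throughput. The helper name is hypothetical.

```python
# Nominal per-lane rates as quoted in the answer; encoding and protocol
# overhead are ignored, so these are not usable-throughput numbers.
def aggregate_gbps(per_lane_gbps, lanes):
    """Raw aggregate rate for a link made of identical lanes."""
    return per_lane_gbps * lanes

nvme_gen5_x4 = aggregate_gbps(32, 4)
sas_24g_x4 = aggregate_gbps(24, 4)

print(nvme_gen5_x4)               # 128
print(sas_24g_x4)                 # 96
print(sas_24g_x4 / nvme_gen5_x4)  # 0.75
```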
Q5. In large Gen 2 or Gen 3 SAS domains, the end devices can suffer with response time issues due to retiming delays introduced by the expanders in the path. Is Gen 4 looking at ways to reduce these delays?
A5. Tim: That’s a great question about fairness enhancements. So, the observation is that as you add an expander and daisy chain to other expanders, when a transaction says, “I finished this transaction,” the first expander tells all its attached devices, “Hey, you’ve got some available bandwidth.” What can happen in a very heavily loaded congested system is that the device closest to the host gets serviced first. So, what we did in SAS-4 and SPL-4 specifically, and beyond, was add fairness enhancements such that you don’t just say a device is waiting for available bandwidth or an available transaction.
Each request comes with an age. That ensures that it doesn’t matter where you are in that infrastructure. You will get a fair crack at getting bandwidth as it frees up. So, that is a change between Gen 3 and Gen 4, and it becomes more important as you go to higher performance because you’re attaching more devices and you’re sharing that bandwidth between more devices. As a result, we’re seeing it become more impactful at that rate.
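The idea of attaching an age to each request can be sketched with a toy arbiter. This is not the arbitration algorithm defined in SPL-4; it only contrasts a "closest device wins" policy with an "oldest request wins" policy. The `Request` fields and both function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Request:
    device: str
    hops_from_host: int  # how many expanders sit between the device and the host
    age: int             # how long the request has been waiting, arbitrary units

def pick_nearest_first(pending):
    """Unfair policy: the device closest to the host is always serviced first."""
    return min(pending, key=lambda r: r.hops_from_host)

def pick_oldest_first(pending):
    """Age-based policy: the longest-waiting request wins, wherever it sits."""
    return max(pending, key=lambda r: r.age)

pending = [
    Request("near", hops_from_host=1, age=2),
    Request("mid", hops_from_host=2, age=5),
    Request("far", hops_from_host=3, age=40),
]

print(pick_nearest_first(pending).device)  # near
print(pick_oldest_first(pending).device)   # far
```

Under the nearest-first policy, the "far" device can keep losing as long as closer devices have work queued; with ages, its request eventually becomes the oldest and is serviced.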
Q6. Could you explain a little bit more about the need for Forward Error Correction?
A6. Tim: At 12Gb/s SAS and previous generations, 6Gb/s and earlier, the noise characteristics of a transmitter to a receiver and also transmissions through a PCB were more affected by crosstalk and reflections, whereas as we go up to the higher frequency range of 24G, we get more disruption to the channel. The real need for forward error correction was that, without it, we would have had to go down to one-third the length of cables and one-third the length of PCB traces. We’d also have had to require quite exotic materials, Megtron 10 and beyond. And we’d probably have had to change our interconnect to all the disk drives as well to support 24G SAS. It would have been quite a disruption.
We needed a technology that could give us the data integrity and data delivery of valid uncorrupted data. That’s why we turned to forward error correction. It has been proven in other technologies, such as Ethernet, which has had it for quite a while. So, we weren’t reinventing the wheel, but what we were doing was taking a concept successfully being used in other technologies and applying it to SAS. As a result, we were able to keep the latency low, channel costs down, and continue to support the same ecosystem requirements of six-meter cables and backplanes.
Q7. What is the outlook for HDDs, given the ongoing acceptance of SSDs in the enterprise?
A7. Rick: In my section of the presentation, I did talk a lot about HDDs, and SSDs are gaining quite a bit of market share. However, they represent different needs. In my examples, I showed warm tiers and cold tiers of storage and how important dollar-per-gigabyte is in those particular applications. And that’s all serviced by HDDs today. For the foreseeable future, a lot of those innovations we talked about during the presentation are optimizing for capacity. And so right now there still is a sizeable gap between the dollar-per-gigabyte on equivalent SSDs as compared to HDDs. Will it always be that way? Probably not. But for the foreseeable future, HDDs are going to play a very important role in enterprise storage.
Tim: When comparing HDDs and SSDs, we talked about warm storage, cold storage, and intermediate and hot storage. For rotating media, that’s one performance level. For SSDs, it’s a slightly different performance level, and this is why we’re seeing NVMe work hand-in-hand with SAS. They don’t replace each other because they have different performance characteristics. Disk drives are still by a long way, the best cost-per-gigabyte of storage or cost-per-terabyte storage. In large cold storage systems, that’s required.
Q8. In terms of scalability, how large of a topology is possible?
A8. Rick: For SAS, it’s some unrealistic number like 64K. More practical cases are limited by the route table in the expanders, where we’re seeing it at just north of a thousand connected devices.
Tim: You may break it into segments so you have total accessibility to tens of thousands of drives, but you really only want hundreds to up to a thousand per regional zone just to get your bandwidth.
Q9. Can you comment on the performance implications of Shingled Magnetic Recording?
A9. Rick: During the presentation, I talked about Shingled Magnetic Recording (SMR) and what we’re doing in T10 to support it with the Zoned Block Commands, etc. And in the press, SMR has gotten significant feedback on performance. The important part is you have to understand the type of SMR that’s being used. In the enterprise, it’s all host-managed SMR. That means the OS or the application manages the zone allocations and the streaming of data to make sure that you’re dealing with the shingles, the overlapping tracks, correctly. In drive-managed SMR, this is all managed in the drive and that can have performance implications, but that technology is not used in the enterprise.
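What "the host manages the zones" means in practice can be sketched with a toy model of a single sequential-write zone. This is a simplified illustration, not the Zoned Block Commands interface: the `Zone` class, its method names, and the error messages are all hypothetical. The point it shows is that the host must write at the zone's write pointer and reset the zone before rewriting it.

```python
class Zone:
    """Toy model of one sequential-write zone on a host-managed SMR drive."""

    def __init__(self, start_lba, length):
        self.start_lba = start_lba
        self.length = length
        self.write_pointer = start_lba  # next block the host is allowed to write

    def write(self, lba, num_blocks):
        # The host, not the drive, is responsible for writing sequentially.
        if lba != self.write_pointer:
            raise ValueError("write must start at the zone's write pointer")
        if lba + num_blocks > self.start_lba + self.length:
            raise ValueError("write would run past the end of the zone")
        self.write_pointer += num_blocks

    def reset(self):
        # Rewriting shingled tracks means starting the zone over from its beginning.
        self.write_pointer = self.start_lba

zone = Zone(start_lba=0, length=100)
zone.write(0, 10)          # sequential write is accepted
print(zone.write_pointer)  # 10
try:
    zone.write(50, 1)      # out-of-order write is rejected
except ValueError as err:
    print("rejected:", err)
```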