Tuesday, September 11, 2007

I/O goes virtual

Serag GadelRab, author of the IEEE Micro article on 10 Gbit Ethernet that I blogged about last week, was right about one thing: virtualization is becoming an increasingly important differentiator in both processors and I/O chips.

This week AMD rolled out the Rapid Virtualization Indexing feature built into its Barcelona chip, which AMD claims boosts virtualization performance by as much as 25 percent when applications support it. Intel is right on top of this trend too.

I learned from Intel yesterday that its I/O Acceleration Technology (IOAT) effort is now coming under the umbrella of a larger group of initiatives called Virtualization Technology for Connectivity (VT-c, to keep the acronyms going). Although Intel has rolled out a few new IOAT features so far this year, it sounds like much of the juice in the future will come from its work supporting virtualization in hardware.

Specifically, at the Intel Developer Forum the company will be talking about Virtual Machine Device Queues. These VMDQs (to stir the alphabet soup) will reportedly help boost throughput on virtualized traffic from 4 to 10 Gbits/s, provided they are supported by software vendors. VMware is on board with this particular advance from Intel, which it simply calls NetQueue.

The VMwares of the world are no doubt being stretched every which way as the major processor and Ethernet companies push them to support the special hardware hooks being baked into their parts. And that is to say nothing of the PCI I/O virtualization work, which is nearing completion and is already being put into silicon.

And coming soon to a server near you, 10GBase-T products from Intel. Stay tuned!

4 comments:


Anonymous said...

Hmmmm... if you call a rose by any other name, is it still a rose? If you call creating multiple work queues with independent interrupt vectors by another name such as VMDQ, is it not still the same fundamental concept? If you add multiple DMA engines and allow their assignment to single or multiple queues, is that something radically different? And is tying any of this into an IOMMU really so different from what other interconnects that combine these technologies do today, other than being integrated into a chipset or, perhaps eventually, into a processor?

Multi-queue technology isn't novel. It has been used in a wide range of I/O solutions for many years. Tying in independent interrupt vectors isn't new either. The combination was demonstrated many years ago by a major server company and an IHV in one of the first 10 Gbps Ethernet proofs of concept, using what would become the basis for PCI MSI-X.

Further, this approach was the basis for InfiniBand and iWARP, both of which were designed from the start to support virtualization. What people purport as novel was in fact conceived and baked into InfiniBand back in the days of Future I/O by the original architects. Everything can be found there, including the basis for dynamic guest-OS migration, transparent fail-over, load balancing, QoS controls, and so on. All of this experience has been brought forward into other technologies, including the PCI-SIG's IOV work.

So, what exactly is new from any of these companies? They are putting these concepts into their products and spinning them as best they can to differentiate from one another, but the core principles are basically the same. They may have a nuance here and there that needs to be covered by a bit of software - gotta have that value-add somewhere in a commodity world - but they are all moving down the same path originally pioneered by others looking to support scalable SMP solutions alongside strong virtualization.

Call it by whatever name you choose, but the basic concepts are all the same: separate resource sets that can be dynamically assigned to a guest OS, with a bit of central management and paravirtualization / configuration hooks to deal with device-specific and implementation-specific functionality. It has taken a couple of years to get everyone on the same page, but they are all moving in the same direction. I'm sure each will claim some novel feature here and there, but these are variations on themes, demonstrated technologies, and shipping products that have been around for years.
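The "separate resource sets dynamically assigned to a guest OS" model the commenter describes can be sketched in a few lines. Here is a toy Python model - all class and attribute names are my own invention, not any vendor's interface - of a NIC that steers incoming frames into per-guest queue sets, each with its own independent interrupt vector:

```python
# Toy model of the multi-queue scheme discussed above: each guest OS gets
# its own receive-queue set with a dedicated interrupt vector, and the
# "NIC" steers frames into the right queue by destination MAC address.
# All names here are illustrative, not any vendor's API.
from collections import deque

class QueueSet:
    """One receive queue plus its dedicated interrupt vector."""
    def __init__(self, vector):
        self.vector = vector          # MSI-X-style vector number
        self.frames = deque()
        self.interrupts_raised = 0

    def deliver(self, frame):
        self.frames.append(frame)
        self.interrupts_raised += 1   # signals only this queue's owner

class MultiQueueNic:
    def __init__(self):
        self._by_mac = {}             # dest MAC -> QueueSet
        self._next_vector = 0

    def assign_guest(self, mac):
        """Dynamically hand a queue set (and its vector) to a guest."""
        qs = QueueSet(self._next_vector)
        self._next_vector += 1
        self._by_mac[mac] = qs
        return qs

    def receive(self, dest_mac, payload):
        qs = self._by_mac.get(dest_mac)
        if qs is not None:            # steered in "hardware", no shared copy
            qs.deliver(payload)

nic = MultiQueueNic()
guest_a = nic.assign_guest("aa:aa")
guest_b = nic.assign_guest("bb:bb")
nic.receive("aa:aa", "frame-1")
nic.receive("bb:bb", "frame-2")
nic.receive("aa:aa", "frame-3")
```

The point of the sketch is the structure, not the mechanism: each guest's traffic and interrupt signaling stay on a private resource set, which is the common core beneath the various brand names.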

A rose is a rose is a rose. No matter what marketing spins, that is true for all of these supposedly novel virtualization techniques. No offense to those bringing out their own flavors of the technology, but they are at best executing variations on ideas and themes developed and shipped by others for 10+ years now.

RapidIO Executive Director said...

Well, it would be interesting to know who this latest 'Anonymous Guy/Gal' is. It sounds like somebody I might know from my years between working on PCI and InfiniBand, because I agree: it's all about getting behind the hype and marketing of IOAT and PCI IOV, which looks clear to me as an I/O guy. I see lots of examples of these basic ideas being used by many technologies, old and new, in processors and DSPs. Not to mention that my association's current technology is ccNUMA SMP, has multiple-OS capability and protected domains/resources, and is scalable in all dimensions. The interesting part of developing standard ways to do this in an old technology like PCI is that you have to do it without breaking legacy code and hardware. So in that sense it is 'new', but at what cost in performance and code size? Too often these patched-on technologies create huge issues, and were it not for the 'assumed leverage' of the available knowledge base and ecosystem of parts, they would not be considered.
The question is: when do you break free of legacy and go with a new bottom-up technology focused on the problem you're solving, versus driving the old car with a solar panel on the roof and the trunk full of batteries?
Tom Cox

Anonymous said...

Is it any surprise that virtualization providers basically virtualize the truth when spinning their product and technology messages? They claim there is no OS modification required to run their solutions, when what they really mean is that they don't modify the OS any more than a standard I/O driver does. These solutions basically work by substituting an Ethernet or SCSI or whatever device driver delivered by the virtualization provider. These virtualized device drivers then provide the needed "hook" to intercept I/O device hardware interactions and supply the secret-sauce value-add. To be honest, these components, combined with a bit of memory trapping for PCI configuration operations, constitute a paravirtualized solution.
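The driver-substitution mechanism described above can be illustrated with a toy model. Everything here - the class names, the hypercall-style method - is hypothetical, meant only to show where the interposition point sits:

```python
# Minimal sketch of driver substitution: the guest loads what looks like an
# ordinary Ethernet driver, but its transmit path calls into the hypervisor
# instead of touching hardware. The hypervisor can then interpose -- log,
# replay, or redirect -- before forwarding. All names are hypothetical.
class Hypervisor:
    def __init__(self):
        self.wire = []       # stands in for the physical NIC
        self.journal = []    # interposition: record every I/O for replay

    def vm_exit_tx(self, guest_id, frame):
        self.journal.append((guest_id, frame))   # value-add hook point
        self.wire.append(frame)                  # then do the real send

class ParavirtEthDriver:
    """What the guest OS sees as its plain 'Ethernet device driver'."""
    def __init__(self, hv, guest_id):
        self._hv = hv
        self._guest_id = guest_id

    def transmit(self, frame):
        # No MMIO here: a hypercall replaces the hardware doorbell write.
        self._hv.vm_exit_tx(self._guest_id, frame)

hv = Hypervisor()
eth0 = ParavirtEthDriver(hv, guest_id=1)
eth0.transmit(b"hello")
eth0.transmit(b"world")
```

Because every operation passes through the journal, the hypervisor is free to replay or migrate I/O state, which is exactly the capability features like guest migration are built on.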

By using paravirtualization, a virtualization provider can interpose and take any variety of actions on behalf of the OS. VMware technology such as guest-OS migration, reflection of I/O operations to multiple hardware platforms, and so on is all founded upon the ability to intercept any operation and replay or migrate its state and results when and where it likes. This isn't rocket science by any means, but rather a matter of paying attention to every detail of how an OS and an I/O device operate. The PCI aspects are perhaps the simplest to deal with, since they are based on well-defined configuration registers, address ranges, behavioral models, and so forth. Since the PCI hierarchy is first mapped by the hypervisor, it knows where everything is and can control what I/O is seen by each guest OS. It can also trap any address range that might be accessed by a guest or application and take steps to prevent, coordinate, or allow direct access if it chooses. The PCI-SIG I/O virtualization work is focused on specifying these accesses further so that the hypervisor does not need to be involved in each main data-path operation, while keeping the configuration operations well-defined and generic.
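As a rough illustration of that config-space trapping, here is a toy shadow of PCI configuration space. The offsets, device IDs, and API are all invented for the sketch; only the idea - the hypervisor owns the real hierarchy and serves each guest a filtered view - comes from the comment above:

```python
# Sketch of config-space trapping: the hypervisor owns the real PCI
# hierarchy and answers each guest's configuration reads itself, so a
# guest only "sees" the devices assigned to it. Offsets and device IDs
# are illustrative only.
VENDOR_ID_OFFSET = 0x00
NO_DEVICE = 0xFFFF          # PCI convention: reads of an absent device

class PciConfigShadow:
    def __init__(self, real_devices):
        # real_devices: {(bus, dev, fn): {offset: value}}
        self._real = real_devices
        self._owner = {}    # (bus, dev, fn) -> guest_id

    def assign(self, bdf, guest_id):
        self._owner[bdf] = guest_id

    def config_read(self, guest_id, bdf, offset):
        """Trap handler for a guest's config-space read."""
        if self._owner.get(bdf) != guest_id:
            return NO_DEVICE        # hide devices the guest doesn't own
        return self._real[bdf][offset]

devices = {(0, 3, 0): {VENDOR_ID_OFFSET: 0x8086}}
shadow = PciConfigShadow(devices)
shadow.assign((0, 3, 0), guest_id=1)
```

A guest probing a device it doesn't own simply reads back the absent-device pattern, which is how the hypervisor controls what I/O each guest sees without modifying the guest's enumeration code.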

One can argue that the PCI-SIG's work could have been significantly streamlined, since a large part of it is essentially defining a multi-queue model with independent interrupt vectors per queue set, with a queue set provided to each guest. Just associating a unique function identifier with each queue set would have allowed everything to work with the new IOMMU / VT-d chipsets. Then all that would be needed is to take that well-known software emulation driver, modify it a bit here and there, and be done with it. Proof that this concept works has been shipping in the market for a couple of years in the form of a 10 Gbps Ethernet device from at least one vendor, Neterion, which has been shipping multi-queue devices with MSI-X-style interrupts since some of its founding engineers demonstrated the technology in the late '90s.
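The commenter's point about function identifiers can be sketched the same way: if each queue set carries its own requester ID, an IOMMU can map that ID to the owning guest's DMA translation table and block everything else. The requester IDs and addresses below are made up for the sketch:

```python
# Sketch of per-queue-set function IDs feeding an IOMMU: each requester ID
# selects a guest-private DMA translation table, so the device can DMA
# directly into the right guest's memory while stray accesses fault.
# IDs and addresses are invented for illustration.
class Iommu:
    def __init__(self):
        self._domains = {}   # requester ID -> {guest_addr: host_addr}

    def attach(self, requester_id, page_table):
        self._domains[requester_id] = page_table

    def translate(self, requester_id, guest_addr):
        table = self._domains.get(requester_id)
        if table is None or guest_addr not in table:
            return None      # DMA fault: untranslated access is blocked
        return table[guest_addr]

iommu = Iommu()
# Queue set for guest A presents requester 0x10, guest B presents 0x11.
iommu.attach(0x10, {0x1000: 0xA000})
iommu.attach(0x11, {0x1000: 0xB000})
```

Note that both guests can use the same guest-physical address (0x1000 here) yet land in different host pages, because the requester ID disambiguates them - which is the whole trick.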

So, is anything really new being done here with multi-queue technology? Isn't the legacy argument largely a shell game, since software is modified in many areas of the stack - so the rose is still a rose, albeit more of a patented hybrid than the pure rose it is purported to be? Yes, great strides have been made to get this all to work, but in reality all that is being done is taking the same techniques pioneered by others and in use for years and re-spinning them. Not that doing so isn't a major accomplishment. They are basically delivering, on commodity systems, the same value propositions and services available for decades on mainframes; a feat that would not be practical or possible without the major strides in processor performance that have left so many servers operating at 10-15% utilization (a number that may drop further as core counts go up). Virtualization solutions that deliver the mainframe or fault-tolerant experience are mainly possible due to the advent of "free" processor cycles to spend on their overhead, combined with advances in high-speed I/O device design and operation. VMware and Intel didn't invent anything fundamentally new here; they are simply taking prior work and respinning it here and there. Nothing wrong with that, but it is disingenuous to portray it as dramatic, other than that it can now be done on commodity systems thanks to advances provided by others.

 