Thursday, January 31, 2008

Power pinches everybody

I’ve heard a hat trick of power problems this week. In 10G Ethernet, signaling and modeling—everywhere you turn—power is increasingly a key constraint.

A chief technologist at Solarflare wrote a somewhat rambling treatise that concludes there is no power budget for running TCP offload on 10Gbit Ethernet chips for the foreseeable future. That’s a somewhat self-serving conclusion, but at least he was specific about the numbers.

TOE-enabled chips will consume about 16W in 130nm or 10W in 65nm, too hot for 2008-class dual-ported products, he said. By contrast, chips that push the TCP stack to a host CPU will dissipate 5W and 4W in 130nm and 65nm technology, respectively, he said. If anybody wants to take issue with those numbers, let’s have at it. iWarp people, sound off!
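For anyone who wants to poke at those numbers, here is a minimal Python sketch of the arithmetic. The four wattages are the ones quoted above; the function names and the `host_stack_watts` input are hypothetical, added only for illustration. The quoted figures cover the adapter alone, so whatever the host CPU spends running the TCP stack has to be supplied by the reader. No value for it is given here.

```python
# Adapter power figures quoted in the post, in watts, keyed by process node.
TOE_WATTS = {"130nm": 16.0, "65nm": 10.0}     # TCP offload engine on the chip
ONLOAD_WATTS = {"130nm": 5.0, "65nm": 4.0}    # TCP stack pushed to the host CPU


def offload_premium(node: str) -> float:
    """Extra watts the adapter dissipates when it runs TOE at this node."""
    return TOE_WATTS[node] - ONLOAD_WATTS[node]


def onload_wins(node: str, host_stack_watts: float) -> bool:
    """True if on-load draws less in total than off-load.

    host_stack_watts is a hypothetical input: the power the host CPU
    spends running the TCP stack. It is not a figure from the post.
    """
    return ONLOAD_WATTS[node] + host_stack_watts < TOE_WATTS[node]


if __name__ == "__main__":
    for node in ("130nm", "65nm"):
        print(f"{node}: TOE costs {offload_premium(node):.0f}W more on the adapter")
```

Read this way, the quoted figures put the TOE premium at 11W in 130nm and 6W in 65nm. On-load comes out ahead on total power only if the host spends less than that premium on the stack.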

Separately, HP fellow Terry Morris said the DesignCon organizers spent much of their time trying to figure out how to cover the hot topic of co-design to handle the increasing merger of power, signaling and timing issues for chip, board and systems engineers. Power effects at multi-gigabit speeds have to be part of the design analysis and there are no standard tools to do this, he said. EDA vendors, whatcha got to say about those beans?

Last but not least, I heard Grant Martin, chief scientist at Tensilica, say today that one of the big issues in the new virtual prototyping style of design is a lack of energy models.

"We are much farther behind in this area," Martin said. "We have done some things with energy modeling for our devices at Tensilica, but you really need to know about the energy models for other devices in a system—so we need lots of friends," he added.

OK, so go sign up as friends at Grant’s MySpace page, and let’s get an industry consortium going on this issue, too.


Anonymous said...

I totally agree with the SolarFlare guy about TOE. It's a point-in-time solution whose time has passed. TSO + LRO + RSS (i.e., stateless offloads) saturate 10 GbE links already. It would be even easier if Intel would stop playing bad monopoly again and open their secret IOAT interface (cache injection, for example). Rick, could you ask them why they reserve it for their own 10GbE chip?

However, I disagree about the baseT interface; it's the same argument as for TOE: why should I pay for it (power, space, time) when I don't need it? Copper for KX4 and CX4, QSFP for optical. Best price, length, form factor.

Looks like the best choice is no TOE and no baseT, sorry SolarFlare.


Anonymous said...

Protocol on-load vs. off-load borders on a religious debate and is often biased by business objectives rather than acknowledging that the problem space is much larger and more complex than either side will honestly debate. Even simple questions of power are distorted here. Is a 10W device more power efficient than one or more processor cores being consumed to run a network stack? Is 10W really too much power for an I/O device, period? Are the support issues around any protocol off-load (and keep in mind that this isn't just about TOE and iWARP but about an entire range of protocol off-load devices which are never really discussed but are all predicated on the same paradigm) too complex or cumbersome to solve? Is the software ecosystem simply unable to develop a TOE stack that works well? Certainly TOE is being supported via iWARP on the open source OFA effort, so there must be something that works.

I see merit in protocol on-load as well as off-load. Both have their place and over time their needs will evolve. To state one is DOA or one is always going to win is absurd and speaks more of the bias that plagues solid engineering.

As for copper vs. optics, well, that has a number of issues as well and can't be simply answered. The optics crowd talks about how great the power savings are but they have yet to produce cost effective optics at 10 Gbps speeds - at least nothing a volume platform provider can use. The copper guys focus on hitting longer distances instead of recognizing that many deployments don't require 100m support and therefore don't need parts that are as complex, power hungry and, some might argue, expensive. Both should avoid repeating the mistakes of 2000-2002 but it is unclear if they are listening. Drive prices down, drive power down and you might have a chance to create real volumes.

In any case, please continue the religious debate and maintain the sound-bite tit-for-tat that is really a plague on technology discourse. The bias will end up doing more harm than good for many companies.