Friday, May 18, 2007

NUMA for the masses

Microsoft is upping the ante in its support for non-uniform memory access (NUMA) systems. Windows already supports NUMA in the way it schedules threads and manages memory. Now Microsoft is adding a new storage I/O capability for NUMA to its Windows Server 2008 operating system.

Two sessions at this week's Windows Hardware Engineering Conference discussed the deeper focus on NUMA in Windows Server. I asked Bruce Worthington, a development leader in the Windows Server group, why Microsoft is putting more emphasis on this area.

He said the rise of multicore processors with shared caches is one of the big motivators. Another is Intel's plan to roll out its Common System Interface (CSI). Like AMD's HyperTransport, CSI will open the door to glueless multiprocessing systems in which disk reads and writes must sometimes traverse multiple CPUs, something the NUMA software could simplify.

NUMA used to be the domain of big systems like those made by Sequent Computer. But in the shrinking world of electronics, NUMA is now ramping up in fairly mainstream PC servers, and before long it will be baked into multicore CPUs.

1 comment:

Anonymous said...

Once the industry moved to integrated memory controllers, three latency domains became interesting: core-to-integrated controller, core-to-remote controller, and core-to-core traffic external to the backplane, which may be either coherent or non-coherent. All of the OS and communication / network stacks need to be designed to achieve data locality and application execution locality in order to maintain reasonable performance without wasting significant processor cycles. NUMA isn't hard at the most basic levels. However, achieving the necessary data / execution locality can be quite complex. To date, volume OSes have not focused on these types of problems. Niche OSes such as Unix have done some work, but nothing that significant, because once you extend this out to its logical conclusion and start to take into account other components such as storage, security, etc., the locality problem can be significant.