An autonomous, AI-enabled server with zero planned downtime

Innovations across the entire stack, from silicon to applications, promise added value and improved business continuity as IBM moves its range of Unix servers to the Power11 generation.

At first glance, little has changed compared with the Power10 servers they replace: the form factor, number of processors and amount of RAM are the same. The options and features, however, are more numerous.

For example, the number of usable cores per processor rises to 30 (20 or 24 on previous models). These new machines can also be equipped with Spyre accelerator cards for AI inference. These are the cards that IBM already offers on its latest generation of mainframes, the z17.

The mainframe heritage is obvious. IBM capitalizes on its historical know-how, boasting an exceptional availability rate of 99.9999%, or less than 32 seconds of downtime per year.
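
For readers who want to check the arithmetic, here is a minimal sketch in Python (the 99.9999% figure is IBM's claim; everything else is simple unit conversion) showing where the "less than 32 seconds per year" comes from.

```python
# Minimal sketch: convert an availability percentage into the yearly downtime it implies.
# The 99.9999% figure is IBM's stated claim; the constant assumes a 365.25-day year.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 31,557,600 seconds

def yearly_downtime_seconds(availability_pct: float) -> float:
    """Downtime per year implied by a given availability percentage."""
    return SECONDS_PER_YEAR * (1 - availability_pct / 100)

print(f"{yearly_downtime_seconds(99.9999):.1f} seconds/year")  # ~31.6 s, i.e. under 32 seconds
```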

Spare cores

With Power11 processor-based servers, customers can expect zero planned downtime for system maintenance, thanks to new technologies such as autonomous patching and automated workload movement with Live Partition Mobility. In addition, IBM is integrating reserve core technology (already seen on mainframes) into its Power11s, enabling servers to continue operating even in the event of a CPU core failure, thanks to automatically activated “standby” cores.

A NIST-certified Power Cyber Vault device promises to detect and block malware in less than a minute. What’s more, the self-diagnosis system is now supported by AI, making maintenance more proactive and reducing downtime accordingly.

Spyre, for the end of the year

Unsurprisingly, IBM’s marketing emphasizes the ability of its new machines to handle AI workloads, with a focus on inference rather than training. The Power11 processors feature the same MMA matrix acceleration units as the Power10. It’s not a full-fledged NPU, but it comes close: each Power11 core actually embeds four MMA units. IBM gives no TOPS figures for these MMAs, but it stresses that TOPS are only part of the story, and that AI processing benefits above all from the Power11’s impressive memory bandwidth: 1,200 GB/s.
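
To see why bandwidth matters so much, here is a rough back-of-the-envelope sketch (the model size and weight format are hypothetical, not IBM figures): generating each token of an LLM response requires streaming roughly all of the model's weights through the processor once, so memory bandwidth sets a hard ceiling on tokens per second regardless of how many TOPS are available.

```python
# Rough sketch (illustrative figures, not IBM benchmarks): for autoregressive LLM
# inference, each generated token reads roughly all model weights once, so the
# bandwidth-imposed ceiling on tokens/second is bandwidth divided by model size in bytes.

def max_tokens_per_sec(bandwidth_gb_s: float, params_billions: float, bytes_per_param: float) -> float:
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Hypothetical 8-billion-parameter model quantized to 1 byte per weight,
# fed by the 1,200 GB/s memory bandwidth IBM quotes for Power11:
print(f"{max_tokens_per_sec(1200, 8, 1):.0f} tokens/s upper bound")  # ~150 tokens/s
```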

As for the Spyre accelerator, it will only be available as an option at the end of this year. This chip includes 32 cores designed to execute 4-, 8- or 16-bit matrix and vector operations very rapidly. These are the operations an LLM needs to produce its response from the numerical values of the tokens derived from the user’s prompt. Performance is rated at 300 TOPS, around three times that of the combined NPU-and-GPU chips found in AI-dedicated workstations.
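
As an illustration of the kind of low-precision work such an accelerator performs, here is a minimal NumPy sketch of an 8-bit matrix-vector multiply; the shapes, the int8 format and the NumPy emulation are assumptions chosen for clarity, not a description of Spyre's actual data formats or instruction set.

```python
# Minimal sketch: the kind of low-precision matrix operation an inference accelerator
# runs in hardware, emulated here in NumPy. Sizes and the int8 format are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Weights and activations quantized to 8-bit integers.
weights = rng.integers(-128, 127, size=(256, 256), dtype=np.int8)
activations = rng.integers(-128, 127, size=(256,), dtype=np.int8)

# Accumulate in 32-bit integers to avoid overflow, as low-precision matrix engines typically do.
out = weights.astype(np.int32) @ activations.astype(np.int32)
print(out.shape, out.dtype)  # (256,) int32
```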