Next Platform TV for Monday, June 29

Today we look at an evaluation of AMD versus Nvidia GPUs for HPC applications, and we also consider infrastructure for AI in production drug discovery. In a completely different direction, we talk to Danny Shapiro, head of automotive at Nvidia, about the datacenter requirements for future autonomous vehicles. Also on the program, we feature insight about the state of NVMe adoption with Lightbits Labs, and we end the show with some level-setting from Ampere Computing.

With the costs of drug discovery mounting and time to result more critical than ever, life sciences companies face tough decisions across the board, certainly when it comes to compute and storage infrastructure. All of this is more important than ever, of course, given the current pandemic. To talk about this we’re joined by Kris Howard, Principal Systems Engineer at digital drug discovery company Recursion. We talk about using HPC parallel file systems like Lustre for AI, and about the management challenges of a GPU-dense production system from a storage/file system perspective.

In the HPC vein, we switch gears to compare hardware and software performance for sparse linear algebra between the two main GPU makers (Nvidia and, only recently a serious contender, AMD). We look at the evaluation criteria as well as the relative performance and usability of both software stacks.

On this episode we also talk with Danny Shapiro, head of automotive at Nvidia, about the backend compute and datacenter requirements for next-generation autonomous cars. While all the processing that happens on board is interesting, there’s a fair bit of training (and retraining) happening behind the scenes, not to mention the overall task of making self-driving vehicles more intelligent and capable. We review the datacenter requirements with Shapiro and talk about what the future holds.

We also talk to Josh Goldenhar at Lightbits Labs, one of the NVMe-over-Fabrics storage upstarts and, in this case, the one that implemented NVMe over a standard TCP stack with no RDMA, with a focus on block storage. Lightbits just put out the second release of its LightOS disaggregated storage stack, which among other things adds scale-out and multipathing for high availability. We chat about the different ways people are consuming LightOS – on their own iron, on appliances, or with FPGA acceleration – and how they are using it to support old and new database technologies, among other things.

To close the episode we take another look at Ampere Computing, which we have already profiled in long form, to get a handle on what’s happening now that the company has formally entered the market and what it might get right that others before it have missed.

All of this and more on today’s program. Thanks for joining.

Cheat Sheet

2:08 Drug Discovery and the Compute/Storage Requirements with Kris Howard, Systems Lead, Recursion Pharmaceuticals

11:16 NVMe Momentum in Enterprise from the Lightbits Labs Perspective with Josh Goldenhar

22:40 The Datacenter and Compute Backend for Autonomous Cars with Nvidia’s Director of Automotive, Danny Shapiro

32:02 AMD Versus Nvidia GPUs for Sparse Linear Algebra and HPC with Hartwig Anzt

37:26 Checking in with Ampere Computing, with Jeff Wittich, including the next-gen 128-core Siryn chip
