AWS moves to accelerate custom AI silicon with NVLink Fusion

A major cloud provider adopts a rack-scale platform to streamline deployment of advanced AI systems, aiming to cut complexity while pushing new compute architectures to market faster.

DQC Bureau
The race to build larger AI models is forcing hyperscalers to rethink how they design and deploy compute infrastructure. The latest step in that shift comes as Amazon Web Services collaborates with Nvidia to adopt Nvidia NVLink Fusion, a rack-scale platform meant to simplify and speed up custom AI silicon rollouts.

According to the announcement, the collaboration focuses on AWS bringing its next-generation Trainium4 chips, Graviton CPUs, Elastic Fabric Adapters and the Nitro System into the NVLink Fusion ecosystem. AWS is also aligning Trainium4 to work with NVLink 6 and the Nvidia MGX rack architecture, marking the first phase of a multigenerational partnership.

This move signals a larger ambition: reducing development cycles for highly specialised AI racks and managing the growing complexities of supplier ecosystems.

The rising challenge of custom AI silicon

AI workloads are expanding faster than most infrastructure teams can keep up with. Models now run into the hundreds of billions of parameters. Newer workloads—planning, reasoning and agentic systems—push clusters harder, relying on many accelerators running in parallel and connected through a unified fabric.

To meet these demands:

  • Hyperscalers must design scale-up networks for entire racks.

  • They need scale-out and storage networking.

  • Racks must include advanced cooling, power delivery and system management.

  • They must also integrate AI acceleration software tuned for new chips.

The cost is steep. Development cycles stretch into years and run into billions of dollars. Managing suppliers is equally intense. A single delay in components—busbars, trays, cold plates, power shelves or coolant distribution units—can halt entire deployments.

NVLink Fusion aims to address these fault lines.

A unified rack-scale platform

NVLink Fusion provides a foundation for hyperscalers building custom ASICs. At its core is the NVLink Fusion chiplet, which can be integrated directly into custom silicon. This allows new AI chips to connect to the NVLink scale-up interconnect and NVLink Switch.

The broader portfolio includes:

  • The Vera Rubin NVLink Switch tray

  • The sixth-generation NVLink Switch with 400G custom SerDes

  • Support for connecting up to 72 custom ASICs in an all-to-all configuration

Each ASIC can access 3.6 TB/s of scale-up bandwidth, for a total of roughly 260 TB/s across the domain.
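
The per-domain figure follows directly from the per-ASIC bandwidth. A minimal back-of-the-envelope sketch (the variable names are illustrative, not drawn from any NVLink API):

```python
# Sanity check of the aggregate scale-up bandwidth quoted above.
# Inputs (72 ASICs, 3.6 TB/s each) are the figures cited in the article.

per_asic_tb_s = 3.6       # scale-up bandwidth available to each custom ASIC, in TB/s
asics_per_domain = 72     # all-to-all NVLink domain size quoted for NVLink Fusion

aggregate_tb_s = per_asic_tb_s * asics_per_domain
print(f"Aggregate scale-up bandwidth: {aggregate_tb_s:.1f} TB/s")
# 72 x 3.6 = 259.2 TB/s, i.e. roughly the 260 TB/s per domain cited above.
```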

The infrastructure also enables peer-to-peer memory access with direct loads, stores and atomic operations, alongside Nvidia’s SHARP protocol for in-network reductions and multicast acceleration.

Unlike emerging approaches, NVLink is already established and widely deployed. Within this setup, Nvidia claims that connecting 72 accelerators in a single scale-up domain can deliver up to 3x performance and revenue for AI inference compared with other approaches.

Cutting development costs and risks

NVLink Fusion adopters gain access to a modular technology stack built around the Nvidia MGX architecture. This includes GPUs, Vera CPUs, co-packaged optics switches, ConnectX SuperNICs, BlueField DPUs and Mission Control software. It also links ASIC designers, CPU and IP providers, and manufacturers.

For AWS, this ecosystem is particularly useful. The company can draw from original equipment manufacturers and original design manufacturers that already supply complete rack-scale components: racks, chassis, power delivery and cooling solutions.

This reduces supply chain risks, long development cycles and integration delays, issues that have traditionally slowed down custom AI silicon programmes.

Building heterogeneous AI silicon in a single rack

One standout element of NVLink Fusion is support for heterogeneous silicon. This lets AWS deploy different types of chips, such as Trainium4, Graviton CPUs and potentially other accelerators, within the same AI factory footprint. Power distribution, cooling systems and rack layouts remain consistent.

The platform is flexible. Adopters can use only the parts they need or embrace the full design, scaling up incrementally for training and inference workloads.

What the collaboration signals

Bringing custom AI chips to market remains a complex undertaking. The collaboration between AWS and Nvidia shows how hyperscalers are beginning to lean on shared architectures rather than designing every component from scratch.

By anchoring Trainium4 deployments on NVLink Fusion, AWS intends to shorten innovation cycles and accelerate time to market. For the broader industry, the move reflects a shift toward modular, proven rack-scale systems designed for the next wave of AI growth.

The infrastructure race is no longer just about the chip. It is now about how fast those chips can be deployed safely, consistently and at scale.
