Arista - Broadcom AI Networking Solution Brief

Solution Highlights

Overview

The rapid advancement of AI calls for data centers that maximize network bandwidth and minimize latency for AI workloads. Broadcom and Arista have collaborated to address these requirements by providing high-performance network hardware and tuning key parameters for end-to-end 400G and 800G AI network solutions. The collaboration focuses on optimizing network performance and total cost of ownership (TCO) through efficient, highly reliable designs that implement robust congestion control, load balancing, and telemetry.

Figure: A tiered leaf-spine (plane-based) design with Arista switches. The top tier consists of four Arista switches and the bottom tier consists of eight, with numerous connections between the tiers, illustrating a high-density, interconnected network architecture.

Arista's portfolio of fixed and modular systems enables clusters of tens of thousands of accelerators, utilizing efficient 2-tier topologies to minimize complexity and costs while maximizing performance and reliability. Arista's commitment to open standards provides customers with maximum choice in accelerators, NICs, and storage, supporting common cluster deployment configurations like Clos topologies.
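As a rough illustration of how 2-tier topologies scale, the following Python sketch sizes a non-blocking leaf-spine fabric built from identical fixed switches. The function name and the radix values are assumptions chosen for illustration and are not taken from Arista product specifications; modular spines, breakout cabling, or oversubscription would change the numbers.

```python
def two_tier_capacity(radix: int) -> dict:
    """Size a non-blocking 2-tier leaf-spine fabric of identical switches.

    Assumption: every leaf splits its ports evenly between endpoints and
    spine uplinks, and each leaf has exactly one uplink to every spine.
    """
    if radix < 2 or radix % 2:
        raise ValueError("radix must be an even number >= 2")
    down_per_leaf = radix // 2   # leaf ports facing accelerators/NICs
    spines = radix // 2          # one uplink from each leaf to each spine
    leaves = radix               # each spine port connects to one leaf
    return {
        "leaves": leaves,
        "spines": spines,
        "endpoints": leaves * down_per_leaf,
    }


# Hypothetical radix values (ports per switch at the chosen link speed).
for radix in (8, 64, 128):
    print(radix, two_tier_capacity(radix))
```

Under these assumptions a radix of 8 gives four spines and eight leaves, and the endpoint count grows with the square of the radix, which is why higher-radix or modular systems are needed to reach very large clusters in two tiers.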

Features

To meet the high-performance requirements of AI/ML cluster back-end networks, advanced features are essential:

Arista EOS (Extensible Operating System) and Datacenter Switches

Arista EOS delivers a high-bandwidth, low-latency, lossless network that scales to support thousands of XPUs at speeds from 100G to 800G. It achieves lossless operation through traffic management, adjustable buffer allocation, and support for Priority Flow Control (PFC) and DCQCN congestion control in RoCE deployments. Arista's Dynamic Load Balancing (DLB) and Cluster Load Balancing (CLB) features maximize network forwarding efficiency by minimizing congestion. The Latency Analyzer (LANZ) feature monitors interface congestion and queuing latency, which simplifies setting PFC and Explicit Congestion Notification (ECN) thresholds for optimal buffer utilization.
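To make the role of ECN thresholds concrete, the sketch below models the RED-style marking curve commonly used with DCQCN: no marking below a minimum queue depth, a linear ramp up to a maximum probability, and marking of every packet above the maximum threshold. This is an illustrative Python model, not EOS configuration or Arista's implementation; the function name, parameter names, and threshold values are assumptions.

```python
def ecn_mark_probability(queue_kb: float, kmin_kb: float,
                         kmax_kb: float, pmax: float) -> float:
    """Probability that a packet is ECN-marked at a given queue depth."""
    if not (0 <= kmin_kb < kmax_kb) or not (0.0 <= pmax <= 1.0):
        raise ValueError("need 0 <= kmin < kmax and 0 <= pmax <= 1")
    if queue_kb <= kmin_kb:
        return 0.0   # queue is shallow: never mark
    if queue_kb > kmax_kb:
        return 1.0   # queue is deep: mark every packet
    # Linear ramp from 0 at kmin up to pmax at kmax.
    return pmax * (queue_kb - kmin_kb) / (kmax_kb - kmin_kb)


# Hypothetical thresholds: Kmin = 100 KB, Kmax = 400 KB, Pmax = 20%.
for depth_kb in (50, 250, 400, 500):
    print(depth_kb, ecn_mark_probability(depth_kb, 100, 400, 0.2))
```

In practice, ECN thresholds are generally placed below the PFC threshold so that senders slow down before pause frames are generated; measured queue depths are one input for choosing where those thresholds sit.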

Arista Portfolio

Arista Portfolio     Product Description
7060X5 and 7060X6    High Density 400G and 800G Fixed Switch Portfolio for AI and DC
7280R and 7800R      High Performance 400G and 800G Dynamic Deep Buffer Platforms
7700R4               800G Distributed Etherlink Switch for Accelerated Computing

Broadcom Ethernet NIC Adapters

Broadcom offers a portfolio of Ethernet NIC adapters with speeds from 1 Gbps to 400 Gbps, delivering high throughput, CPU efficiency, and low workload latency for Ethernet/IP and RoCE traffic. The latest AI-focused adapters, based on the BCM576xx (Thor2) ASIC, support 400GE, 200GE, 100GE, and 25GE in OCP and PCIe form factors. These NICs are optimized for AI applications and support key features such as RoCEv2, DCQCN, PFC, and PCC. They are also the lowest-power 400G interfaces available, reducing power and cooling demands. Broadcom's AI NICs support a range of cabling options, and the choice of cabling affects network power, cost, and reliability.

Broadcom NIC Adapters

OCP3.0 NIC Adapters

Part Number          ASIC       Ports      I/O
BCM957608-N1400GD    BCM57608   1x 400G    QSFP112-DD
BCM957608-N2200G     BCM57608   2x 200G    QSFP112

PCIe NIC Adapters

Part Number          ASIC       Ports      I/O
BCM957608-P1400GD    BCM57608   1x 400G    QSFP112-DD
BCM957608-P2200G     BCM57608   2x 200G    QSFP112

Cabling and Interconnects

Selecting the appropriate cabling is crucial for data center reliability, power usage, cooling, and cost. Broadcom and Arista offer pre-qualified cabling options for seamless integration:

Cable                              Distance   Power    Reliability   Cost     MPN
Copper Cable (DAC)                 5 m        Low      High          Low      Amphenol: DJERGN-0003
Active Electrical Cable (AEC)      7 m        Medium   Medium        Medium   Credo: CAC82X321A2N-CO-HW
VSR Optical Transceiver            50 m       High     Low           High     Switch: Arista OSFP-800G-2VSR4; NIC: Eoptolink EOLQ-854HG-01-M
DR Optical Transceiver             500 m      High     Low           High     Switch: Arista OSFP-800G-2XDR4; NIC: Hisense LMQ3621S-PC1
DR Linear Pluggable Optic (LPO)    500 m      Medium   Medium        Medium   Switch: Arista LPO-800G-2DR4; NIC: Eoptolink EOLQ-134HG-5H-MSL
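
The trade-offs in the cabling table can be expressed as a small lookup. The Python sketch below filters the listed options by required link length; the data values are transcribed from the table, while the type name, function name, and the reading of the Distance column as a maximum reach are assumptions made for illustration.

```python
from typing import NamedTuple


class CableOption(NamedTuple):
    name: str
    reach_m: int        # assumption: Distance column read as maximum reach
    power: str
    reliability: str
    cost: str


# Rows transcribed from the cabling table above.
OPTIONS = [
    CableOption("Copper Cable (DAC)", 5, "Low", "High", "Low"),
    CableOption("Active Electrical Cable (AEC)", 7, "Medium", "Medium", "Medium"),
    CableOption("VSR Optical Transceiver", 50, "High", "Low", "High"),
    CableOption("DR Optical Transceiver", 500, "High", "Low", "High"),
    CableOption("DR Linear Pluggable Optic (LPO)", 500, "Medium", "Medium", "Medium"),
]


def options_for_distance(distance_m: float) -> list[CableOption]:
    """Return the options whose reach covers distance_m, shortest reach first."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    fits = [opt for opt in OPTIONS if opt.reach_m >= distance_m]
    return sorted(fits, key=lambda opt: opt.reach_m)


# Example: candidate interconnects for a hypothetical 6 m NIC-to-leaf run.
for opt in options_for_distance(6):
    print(f"{opt.name}: power={opt.power}, cost={opt.cost}")
```

Listing the shortest-reach options first surfaces the lower-power, lower-cost choices whenever the link length allows them.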

Arista CloudVision

Arista CloudVision is a multi-domain management platform simplifying network operations using cloud networking principles. Built on Arista's Network Data Lake (NetDL) architecture, it aggregates enterprise data and uses AI/ML for analysis, insights, updates, and alerts, including predictive insights from Arista Autonomous Virtual Assist (Arista AVA).

Arista AI Agent

The AI Agent integrates NICs and Arista's EOS network operating system for managing and monitoring switches and NIC connections, and debugging server-level issues. The AI Agent and CloudVision work together to provide a unified view of network and server statistics, improving troubleshooting efficiency by correlating network events with server-side issues. CloudVision provides real-time insights, updates, and alerts, and uses AI/ML to identify anomalies.

Summary

Arista and Broadcom are committed to meeting the evolving requirements of AI applications and future workloads. Their partnership delivers a robust, pre-configured solution for highly scalable, end-to-end optimized 400G or 800G networks. The solution prioritizes power-efficient and reliable NICs, switches, and interconnects to maximize network availability and accelerator efficiency. Because it is rigorously tested and validated, the solution can be deployed rapidly for AI workloads.

Contact Information

Headquarters

5453 Great America Parkway
Santa Clara, California 95054
408-547-5500

Support

support@arista.com
408-547-5502
866-476-0000

Sales

sales@arista.com
408-547-5501
866-497-0000
