Compare XPUs

Select up to 5 XPUs to compare side-by-side

Select XPUs to Compare

Showing 128 XPUs • 5 selected

Vendor	XPU	TFLOPs
Alibaba	Hanguang 800	—
AMD	MI100	23.1
AMD	MI210	181
AMD	MI250X	383
AMD	MI300X	1,307
AMD	MI325X	1,400
AMD	MI350X	2,100
AMD	MI355X	5,300
AMD	Radeon Pro V520	23.04
AWS	Inferentia2	190
AWS	Trainium	190
AWS	Trainium2	680
Baidu	Kunlun II	—
Biren Technology	BR100	—
Cambricon	MLU370	256
Cerebras	WSE-3	—
Enflame Technology	CloudBlazer T20	—
FuriosaAI	RNGD (Renegade)	256
FuriosaAI	Warboy	—
Google	TPU v4	275
Google	TPU v5e	197
Google	TPU v5p	459
Graphcore	Bow IPU	—
Graphcore	IPU-M2000	—
Groq	LPU Inference Engine	—
Huawei	Ascend 910B	—
Iluvatar CoreX	BI-V150	300
Intel	Data Center GPU Max 1100	177
Intel	Data Center GPU Max 1550	419
Intel Habana	Gaudi 2	432
Intel Habana	Gaudi 3	1,835

Multi-Metric Comparison

Relative performance across 5 key metrics (normalized to 100 = best in comparison)

Chart axes: Compute Performance (BF16), Memory Capacity, Power Consumption, Power Efficiency
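The normalization described above can be sketched as follows. This is a minimal illustration, assuming each metric is scaled so that the largest value among the selected XPUs maps to 100 and that missing values are skipped. The page does not say how it treats metrics where lower is better (such as power consumption), so that case is left out. The function name `normalize_to_best` is hypothetical; the VRAM figures come from the Specifications table.

```python
def normalize_to_best(values):
    """Scale values so the largest one maps to 100; missing (None) stays None."""
    present = [v for v in values.values() if v is not None]
    if not present:
        return dict(values)
    best = max(present)
    return {
        name: None if v is None else round(100 * v / best, 1)
        for name, v in values.items()
    }


# VRAM in GB, as listed in the Specifications table.
vram_gb = {
    "Groq LPU Inference Engine": 230,
    "NVIDIA RTX 6000 Ada Generation": 48,
    "NVIDIA GeForce GT 710": 2,
    "NVIDIA A100 SXM": 80,
    "NVIDIA RTX A6000": 48,
}

memory_capacity_scores = normalize_to_best(vram_gb)
```

Under this assumption the XPU with the most memory scores 100 and every other entry is expressed as a percentage of it.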

Specifications

Specification	Groq LPU Inference Engine	NVIDIA RTX 6000 Ada Generation	NVIDIA GeForce GT 710	NVIDIA A100 SXM	NVIDIA RTX A6000
Architecture	TSP (Tensor Streaming Processor)	Ada Lovelace	Kepler	Ampere	Ampere
Form Factor	—	PCIe	PCIe	SXM	PCIe
VRAM	230 GB	48 GB	2 GB	80 GB	48 GB
Memory Bandwidth	—	960 GB/s	14.4 GB/s	2,039 GB/s	768 GB/s
TFLOPs (FP32)	—	91.1	0.366	19.5	38.7
TFLOPs (FP16)	—	—	—	312	—
TFLOPs	—	182.5	0.366	312	77.9
TFLOPs (FP8)	—	—	—	—	—
TDP	300 W	300 W	19 W	400 W	300 W
Launch Date	Feb 2024	Sep 2022	Mar 2014	May 2020	Oct 2020

Efficiency Metrics

Metric	LPU Inference Engine	RTX 6000 Ada Generation	GeForce GT 710	A100 SXM	RTX A6000
TFLOPs per Watt (FP32-eq)	—	0.30	0.02	0.39	0.13
Memory Bandwidth per GB	—	20.0 GB/s	7.2 GB/s	25.5 GB/s	16.0 GB/s
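Both efficiency rows can be reproduced from the Specifications table. The sketch below assumes that "FP32-eq" means the headline TFLOPs figure halved when that figure is a 16-bit number, and used unchanged when it is already FP32. That rule is inferred from the published values, not documented by the page, and the function names are hypothetical.

```python
def fp32_equivalent(headline_tflops, headline_is_16bit):
    # Assumption: a 16-bit headline figure counts as half as many
    # FP32-equivalent TFLOPs; an FP32 figure is used as-is.
    return headline_tflops / 2 if headline_is_16bit else headline_tflops


def tflops_per_watt(fp32_eq_tflops, tdp_watts):
    # Compute efficiency, rounded to two decimals as in the table.
    return round(fp32_eq_tflops / tdp_watts, 2)


def bandwidth_per_gb(bandwidth_gb_s, vram_gb):
    # Memory bandwidth (GB/s) available per GB of VRAM, one decimal.
    return round(bandwidth_gb_s / vram_gb, 1)


# NVIDIA A100 SXM: 312 TFLOPs headline, 400 W TDP, 2,039 GB/s, 80 GB.
a100_fp32_eq = fp32_equivalent(312, headline_is_16bit=True)
a100_efficiency = tflops_per_watt(a100_fp32_eq, 400)
a100_bw_per_gb = bandwidth_per_gb(2039, 80)
```

With these inputs the A100 SXM values come out at 0.39 TFLOPs per watt and 25.5 GB/s per GB, matching the table.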

Performance Equivalence

How many units of each XPU are needed to match one unit of each of the others, metric by metric?

To match 1x Groq LPU Inference Engine

NVIDIA RTX 6000 Ada Generation

VRAM

4.79x

Need 4.79x RTX 6000 Ada Generation

NVIDIA GeForce GT 710

VRAM

115.00x

Need 115.00x GeForce GT 710

NVIDIA A100 SXM

VRAM

2.88x

Need 2.88x A100 SXM

NVIDIA RTX A6000

VRAM

4.79x

Need 4.79x RTX A6000

To match 1x NVIDIA RTX 6000 Ada Generation

Groq LPU Inference Engine

VRAM

0.21x

LPU Inference Engine has 4.79x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

249.32x

Need 249.32x GeForce GT 710

FP32 Compute

248.91x

Need 248.91x GeForce GT 710

VRAM

24.00x

Need 24.00x GeForce GT 710

Memory Bandwidth

66.67x

Need 66.67x GeForce GT 710

NVIDIA A100 SXM

Compute (FP32-eq)

0.58x

A100 SXM is 1.71x faster

FP32 Compute

4.67x

Need 4.67x A100 SXM

VRAM

0.60x

A100 SXM has 1.67x more

Memory Bandwidth

0.47x

A100 SXM has 2.12x more

NVIDIA RTX A6000

Compute (FP32-eq)

2.34x

Need 2.34x RTX A6000

FP32 Compute

2.35x

Need 2.35x RTX A6000

VRAM

1.00x

RTX A6000 has the same VRAM

Memory Bandwidth

1.25x

Need 1.25x RTX A6000

To match 1x NVIDIA GeForce GT 710

Groq LPU Inference Engine

VRAM

0.01x

LPU Inference Engine has 115.00x more

NVIDIA RTX 6000 Ada Generation

Compute (FP32-eq)

<0.01x

RTX 6000 Ada Generation is 249.32x faster

FP32 Compute

<0.01x

RTX 6000 Ada Generation is 248.91x faster

VRAM

0.04x

RTX 6000 Ada Generation has 24.00x more

Memory Bandwidth

0.02x

RTX 6000 Ada Generation has 66.67x more

NVIDIA A100 SXM

Compute (FP32-eq)

<0.01x

A100 SXM is 426.23x faster

FP32 Compute

0.02x

A100 SXM is 53.28x faster

VRAM

0.03x

A100 SXM has 40.00x more

Memory Bandwidth

0.01x

A100 SXM has 141.60x more

NVIDIA RTX A6000

Compute (FP32-eq)

0.01x

RTX A6000 is 106.42x faster

FP32 Compute

0.01x

RTX A6000 is 105.74x faster

VRAM

0.04x

RTX A6000 has 24.00x more

Memory Bandwidth

0.02x

RTX A6000 has 53.33x more

To match 1x NVIDIA A100 SXM

Groq LPU Inference Engine

VRAM

0.35x

LPU Inference Engine has 2.88x more

NVIDIA RTX 6000 Ada Generation

Compute (FP32-eq)

1.71x

Need 1.71x RTX 6000 Ada Generation

FP32 Compute

0.21x

RTX 6000 Ada Generation is 4.67x faster

VRAM

1.67x

Need 1.67x RTX 6000 Ada Generation

Memory Bandwidth

2.12x

Need 2.12x RTX 6000 Ada Generation

NVIDIA GeForce GT 710

Compute (FP32-eq)

426.23x

Need 426.23x GeForce GT 710

FP32 Compute

53.28x

Need 53.28x GeForce GT 710

VRAM

40.00x

Need 40.00x GeForce GT 710

Memory Bandwidth

141.60x

Need 141.60x GeForce GT 710

NVIDIA RTX A6000

Compute (FP32-eq)

4.01x

Need 4.01x RTX A6000

FP32 Compute

0.50x

RTX A6000 is 1.98x faster

VRAM

1.67x

Need 1.67x RTX A6000

Memory Bandwidth

2.65x

Need 2.65x RTX A6000

To match 1x NVIDIA RTX A6000

Groq LPU Inference Engine

VRAM

0.21x

LPU Inference Engine has 4.79x more

NVIDIA RTX 6000 Ada Generation

Compute (FP32-eq)

0.43x

RTX 6000 Ada Generation is 2.34x faster

FP32 Compute

0.42x

RTX 6000 Ada Generation is 2.35x faster

VRAM

1.00x

RTX 6000 Ada Generation has the same VRAM

Memory Bandwidth

0.80x

RTX 6000 Ada Generation has 1.25x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

106.42x

Need 106.42x GeForce GT 710

FP32 Compute

105.74x

Need 105.74x GeForce GT 710

VRAM

24.00x

Need 24.00x GeForce GT 710

Memory Bandwidth

53.33x

Need 53.33x GeForce GT 710

NVIDIA A100 SXM

Compute (FP32-eq)

0.25x

A100 SXM is 4.01x faster

FP32 Compute

1.98x

Need 1.98x A100 SXM

VRAM

0.60x

A100 SXM has 1.67x more

Memory Bandwidth

0.38x

A100 SXM has 2.65x more
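The equivalence figures in this section appear to be plain ratios of the target's value to the candidate's value for a given metric. The sketch below shows that arithmetic under that assumption. The function name `equivalence` and its `faster` flag are hypothetical; the flag only switches the wording between "has ... more" and "is ... faster".

```python
def equivalence(target_value, candidate_value, candidate_name, faster=False):
    """Return (units needed, note) for matching one target unit on one metric."""
    ratio = target_value / candidate_value
    if ratio > 1:
        # The candidate is weaker: more than one unit is needed.
        note = f"Need {ratio:.2f}x {candidate_name}"
    elif ratio < 1:
        # The candidate is stronger: report by how much.
        inverse = 1 / ratio
        if faster:
            note = f"{candidate_name} is {inverse:.2f}x faster"
        else:
            note = f"{candidate_name} has {inverse:.2f}x more"
    else:
        note = f"{candidate_name} matches the target"
    return f"{ratio:.2f}x", note


# VRAM: matching one A100 SXM (80 GB) with GeForce GT 710 cards (2 GB each).
vram_result = equivalence(80, 2, "GeForce GT 710")

# VRAM: matching one RTX A6000 (48 GB) with an A100 SXM (80 GB).
vram_reverse = equivalence(48, 80, "A100 SXM")
```

A ratio above 1 reads as "this many units are needed"; a ratio below 1 means a single candidate unit already exceeds the target on that metric.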

Pricing

Price Type	LPU Inference Engine	RTX 6000 Ada Generation	GeForce GT 710	A100 SXM	RTX A6000
CAPEX (Street Price)	—	—	—	$15,000	—
OPEX (per hour)	—	$0.33/hr	$0.07/hr	$4.05/hr	$0.33/hr
Price per TFLOPs (FP32-eq)	—	—	—	$96	—
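The price-per-TFLOPs row can be checked with one division. This sketch assumes the A100 SXM's FP32-equivalent throughput is half of its 312 TFLOPs headline figure (an inference from the other tables, not something the page states); `price_per_tflop` is a hypothetical helper name.

```python
def price_per_tflop(street_price_usd, fp32_eq_tflops):
    # Street price divided by FP32-equivalent TFLOPs, rounded to whole dollars.
    return round(street_price_usd / fp32_eq_tflops)


# NVIDIA A100 SXM: $15,000 street price, 312 / 2 = 156 FP32-equivalent TFLOPs.
a100_price_per_tflop = price_per_tflop(15_000, 312 / 2)
```

The result is 96, in line with the $96 shown for the A100 SXM.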