Compare XPUs

Select up to 5 XPUs to compare side-by-side

Select XPUs to Compare

Showing 128 XPUs • 6 selected

Alibaba

Hanguang 800

AMD

MI100

23.1 TFLOPs

AMD

MI210

181 TFLOPs

AMD

MI250X

383 TFLOPs

AMD

MI300X

1,307 TFLOPs

AMD

MI325X

1,400 TFLOPs

AMD

MI350X

2,100 TFLOPs

AMD

MI355X

5,300 TFLOPs

AMD

Radeon Pro V520

23.04 TFLOPs

AWS

Inferentia2

190 TFLOPs

AWS

Trainium

190 TFLOPs

AWS

Trainium2

680 TFLOPs

Baidu

Kunlun II

Biren Technology

BR100

Cambricon

MLU370

256 TFLOPs

Cerebras

WSE-3

Enflame Technology

CloudBlazer T20

FuriosaAI

RNGD (Renegade)

256 TFLOPs

FuriosaAI

Warboy

Google

TPU v4

275 TFLOPs

Google

TPU v5e

197 TFLOPs

Google

TPU v5p

459 TFLOPs

Graphcore

Bow IPU

Graphcore

IPU-M2000

Groq

LPU Inference Engine

Huawei

Ascend 910B

Iluvatar CoreX

BI-V150

300 TFLOPs

Intel

Data Center GPU Max 1100

177 TFLOPs

Intel

Data Center GPU Max 1550

419 TFLOPs

Intel Habana

Gaudi 2

432 TFLOPs

Intel Habana

Gaudi 3

1,835 TFLOPs

Multi-Metric Comparison

Relative performance across 5 key metrics (normalized to 100 = best in comparison)

Metrics shown: Compute Performance (BF16), Memory Capacity, Power Consumption, Power Efficiency
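The "normalized to 100 = best in comparison" scaling can be sketched as follows. This is a minimal illustration, not the page's actual implementation: the function name `normalize_to_best` is hypothetical, the input is the VRAM row of the Specifications table, and it assumes that a larger value is better. How the page scores a metric such as Power Consumption, where lower may be preferable, is not stated.

```python
def normalize_to_best(values):
    """Scale values so the largest becomes 100; missing entries (None) stay None."""
    present = [v for v in values.values() if v is not None]
    best = max(present)
    return {
        name: (None if v is None else round(100 * v / best, 1))
        for name, v in values.items()
    }

# VRAM in GB, copied from the Specifications table.
vram_gb = {
    "LPU Inference Engine": 230,
    "MI210": 64,
    "Data Center GPU Max 1100": 48,
    "GeForce GT 710": 2,
    "GeForce GTX 1660": 6,
    "A100 SXM": 80,
}

memory_capacity_score = normalize_to_best(vram_gb)
for name, score in memory_capacity_score.items():
    print(f"{name}: {score}")
```

Under these assumptions the device with the most memory scores 100 and every other device is expressed as a percentage of it.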

Specifications

Specification	Groq LPU Inference Engine	AMD MI210	Intel Data Center GPU Max 1100	NVIDIA GeForce GT 710	NVIDIA GeForce GTX 1660	NVIDIA A100 SXM
Architecture	TSP (Tensor Streaming Processor)	CDNA 2	Ponte Vecchio	Kepler	Turing	Ampere
Form Factor	—	PCIe	OAM	PCIe	PCIe	SXM
VRAM	230 GB	64 GB	48 GB	2 GB	6 GB	80 GB
Memory Bandwidth	—	1,638 GB/s	1,229 GB/s	14.4 GB/s	192 GB/s	2,039 GB/s
TFLOPs (FP32)	—	45.3	22	0.366	5.027	19.5
TFLOPs (FP16)	—	181	177	—	—	312
TFLOPs	—	181	177	0.366	5.027	312
TFLOPs (FP8)	—	—	—	—	—	—
TDP	300 W	300 W	300 W	19 W	120 W	400 W
Launch Date	Feb 2024	Jan 2022	Jan 2023	Mar 2014	Mar 2019	May 2020

Efficiency Metrics

Metric	LPU Inference Engine	MI210	Data Center GPU Max 1100	GeForce GT 710	GeForce GTX 1660	A100 SXM
TFLOPs per Watt (FP32-eq)	—	0.30	0.29	0.02	0.04	0.39
Memory Bandwidth per GB	—	25.6 GB/s	25.6 GB/s	7.2 GB/s	32.0 GB/s	25.5 GB/s
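The two efficiency rows above appear to follow from the Specifications table. The sketch below reproduces them under one assumption inferred from the numbers rather than stated by the page: the "FP32-eq" figure is half the FP16 TFLOPs when an FP16 figure is listed, and the plain FP32 TFLOPs otherwise. The names `specs`, `fp32_equivalent`, and `efficiency` are hypothetical.

```python
# name: (fp32_tflops, fp16_tflops or None, tdp_watts, vram_gb, bandwidth_gb_per_s)
# Values copied from the Specifications table; the Groq entry is omitted
# because its compute and bandwidth figures are not listed.
specs = {
    "MI210": (45.3, 181, 300, 64, 1638),
    "Data Center GPU Max 1100": (22, 177, 300, 48, 1229),
    "GeForce GT 710": (0.366, None, 19, 2, 14.4),
    "GeForce GTX 1660": (5.027, None, 120, 6, 192),
    "A100 SXM": (19.5, 312, 400, 80, 2039),
}

def fp32_equivalent(fp32, fp16):
    # Assumption inferred from the published ratios, not documented by the page:
    # halve the FP16 figure when present, otherwise fall back to FP32.
    return fp16 / 2 if fp16 is not None else fp32

def efficiency(name):
    """Return (TFLOPs per watt, memory bandwidth in GB/s per GB of VRAM)."""
    fp32, fp16, tdp, vram, bandwidth = specs[name]
    return fp32_equivalent(fp32, fp16) / tdp, bandwidth / vram

for name in specs:
    per_watt, bw_per_gb = efficiency(name)
    print(f"{name}: {per_watt:.2f} TFLOPs/W, {bw_per_gb:.1f} GB/s per GB")
```

For example, the MI210 works out to 90.5 / 300, or about 0.30 TFLOPs per watt, and 1,638 / 64, or about 25.6 GB/s per GB, in line with the table.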

Performance Equivalence

How many units of each XPU are needed to match one unit of each of the others?

To match 1x Groq LPU Inference Engine

AMD MI210

VRAM

3.59x

Need 3.59x MI210

Intel Data Center GPU Max 1100

VRAM

4.79x

Need 4.79x Data Center GPU Max 1100

NVIDIA GeForce GT 710

VRAM

115.00x

Need 115.00x GeForce GT 710

NVIDIA GeForce GTX 1660

VRAM

38.33x

Need 38.33x GeForce GTX 1660

NVIDIA A100 SXM

VRAM

2.88x

Need 2.88x A100 SXM

To match 1x AMD MI210

Groq LPU Inference Engine

VRAM

0.28x

LPU Inference Engine has 3.59x more

Intel Data Center GPU Max 1100

Compute (FP32-eq)

1.02x

Need 1.02x Data Center GPU Max 1100

FP32 Compute

2.06x

Need 2.06x Data Center GPU Max 1100

VRAM

1.33x

Need 1.33x Data Center GPU Max 1100

Memory Bandwidth

1.33x

Need 1.33x Data Center GPU Max 1100

NVIDIA GeForce GT 710

Compute (FP32-eq)

247.27x

Need 247.27x GeForce GT 710

FP32 Compute

123.77x

Need 123.77x GeForce GT 710

VRAM

32.00x

Need 32.00x GeForce GT 710

Memory Bandwidth

113.75x

Need 113.75x GeForce GT 710

NVIDIA GeForce GTX 1660

Compute (FP32-eq)

18.00x

Need 18.00x GeForce GTX 1660

FP32 Compute

9.01x

Need 9.01x GeForce GTX 1660

VRAM

10.67x

Need 10.67x GeForce GTX 1660

Memory Bandwidth

8.53x

Need 8.53x GeForce GTX 1660

NVIDIA A100 SXM

Compute (FP32-eq)

0.58x

A100 SXM is 1.72x faster

FP32 Compute

2.32x

Need 2.32x A100 SXM

VRAM

0.80x

A100 SXM has 1.25x more

Memory Bandwidth

0.80x

A100 SXM has 1.24x more

To match 1x Intel Data Center GPU Max 1100

Groq LPU Inference Engine

VRAM

0.21x

LPU Inference Engine has 4.79x more

AMD MI210

Compute (FP32-eq)

0.98x

MI210 is 1.02x faster

FP32 Compute

0.49x

MI210 is 2.06x faster

VRAM

0.75x

MI210 has 1.33x more

Memory Bandwidth

0.75x

MI210 has 1.33x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

241.80x

Need 241.80x GeForce GT 710

FP32 Compute

60.11x

Need 60.11x GeForce GT 710

VRAM

24.00x

Need 24.00x GeForce GT 710

Memory Bandwidth

85.35x

Need 85.35x GeForce GT 710

NVIDIA GeForce GTX 1660

Compute (FP32-eq)

17.60x

Need 17.60x GeForce GTX 1660

FP32 Compute

4.38x

Need 4.38x GeForce GTX 1660

VRAM

8.00x

Need 8.00x GeForce GTX 1660

Memory Bandwidth

6.40x

Need 6.40x GeForce GTX 1660

NVIDIA A100 SXM

Compute (FP32-eq)

0.57x

A100 SXM is 1.76x faster

FP32 Compute

1.13x

Need 1.13x A100 SXM

VRAM

0.60x

A100 SXM has 1.67x more

Memory Bandwidth

0.60x

A100 SXM has 1.66x more

To match 1x NVIDIA GeForce GT 710

Groq LPU Inference Engine

VRAM

0.01x

LPU Inference Engine has 115.00x more

AMD MI210

Compute (FP32-eq)

0.00x

MI210 is 247.27x faster

FP32 Compute

0.01x

MI210 is 123.77x faster

VRAM

0.03x

MI210 has 32.00x more

Memory Bandwidth

0.01x

MI210 has 113.75x more

Intel Data Center GPU Max 1100

Compute (FP32-eq)

0.00x

Data Center GPU Max 1100 is 241.80x faster

FP32 Compute

0.02x

Data Center GPU Max 1100 is 60.11x faster

VRAM

0.04x

Data Center GPU Max 1100 has 24.00x more

Memory Bandwidth

0.01x

Data Center GPU Max 1100 has 85.35x more

NVIDIA GeForce GTX 1660

Compute (FP32-eq)

0.07x

GeForce GTX 1660 is 13.73x faster

FP32 Compute

0.07x

GeForce GTX 1660 is 13.73x faster

VRAM

0.33x

GeForce GTX 1660 has 3.00x more

Memory Bandwidth

0.07x

GeForce GTX 1660 has 13.33x more

NVIDIA A100 SXM

Compute (FP32-eq)

0.00x

A100 SXM is 426.23x faster

FP32 Compute

0.02x

A100 SXM is 53.28x faster

VRAM

0.03x

A100 SXM has 40.00x more

Memory Bandwidth

0.01x

A100 SXM has 141.60x more

To match 1x NVIDIA GeForce GTX 1660

Groq LPU Inference Engine

VRAM

0.03x

LPU Inference Engine has 38.33x more

AMD MI210

Compute (FP32-eq)

0.06x

MI210 is 18.00x faster

FP32 Compute

0.11x

MI210 is 9.01x faster

VRAM

0.09x

MI210 has 10.67x more

Memory Bandwidth

0.12x

MI210 has 8.53x more

Intel Data Center GPU Max 1100

Compute (FP32-eq)

0.06x

Data Center GPU Max 1100 is 17.60x faster

FP32 Compute

0.23x

Data Center GPU Max 1100 is 4.38x faster

VRAM

0.13x

Data Center GPU Max 1100 has 8.00x more

Memory Bandwidth

0.16x

Data Center GPU Max 1100 has 6.40x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

13.73x

Need 13.73x GeForce GT 710

FP32 Compute

13.73x

Need 13.73x GeForce GT 710

VRAM

3.00x

Need 3.00x GeForce GT 710

Memory Bandwidth

13.33x

Need 13.33x GeForce GT 710

NVIDIA A100 SXM

Compute (FP32-eq)

0.03x

A100 SXM is 31.03x faster

FP32 Compute

0.26x

A100 SXM is 3.88x faster

VRAM

0.07x

A100 SXM has 13.33x more

Memory Bandwidth

0.09x

A100 SXM has 10.62x more

To match 1x NVIDIA A100 SXM

Groq LPU Inference Engine

VRAM

0.35x

LPU Inference Engine has 2.88x more

AMD MI210

Compute (FP32-eq)

1.72x

Need 1.72x MI210

FP32 Compute

0.43x

MI210 is 2.32x faster

VRAM

1.25x

Need 1.25x MI210

Memory Bandwidth

1.24x

Need 1.24x MI210

Intel Data Center GPU Max 1100

Compute (FP32-eq)

1.76x

Need 1.76x Data Center GPU Max 1100

FP32 Compute

0.89x

Data Center GPU Max 1100 is 1.13x faster

VRAM

1.67x

Need 1.67x Data Center GPU Max 1100

Memory Bandwidth

1.66x

Need 1.66x Data Center GPU Max 1100

NVIDIA GeForce GT 710

Compute (FP32-eq)

426.23x

Need 426.23x GeForce GT 710

FP32 Compute

53.28x

Need 53.28x GeForce GT 710

VRAM

40.00x

Need 40.00x GeForce GT 710

Memory Bandwidth

141.60x

Need 141.60x GeForce GT 710

NVIDIA GeForce GTX 1660

Compute (FP32-eq)

31.03x

Need 31.03x GeForce GTX 1660

FP32 Compute

3.88x

Need 3.88x GeForce GTX 1660

VRAM

13.33x

Need 13.33x GeForce GTX 1660

Memory Bandwidth

10.62x

Need 10.62x GeForce GTX 1660
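The equivalence figures above are consistent with a simple ratio: the target's value for a metric divided by the candidate's value. The sketch below reproduces the VRAM entries on that basis. The function names `units_needed` and `describe` are hypothetical, and the wording rule (switching to "has N.NNx more" when the ratio falls below 1) is inferred from the listing rather than documented.

```python
# VRAM in GB, copied from the Specifications table.
vram_gb = {
    "LPU Inference Engine": 230,
    "MI210": 64,
    "Data Center GPU Max 1100": 48,
    "GeForce GT 710": 2,
    "GeForce GTX 1660": 6,
    "A100 SXM": 80,
}

def units_needed(target, candidate, metric):
    """How many candidate units equal one target unit on this metric."""
    return metric[target] / metric[candidate]

def describe(target, candidate, metric):
    ratio = units_needed(target, candidate, metric)
    if ratio >= 1:
        return f"Need {ratio:.2f}x {candidate}"
    # The candidate exceeds the target, so report the inverse ratio instead.
    return f"{candidate} has {1 / ratio:.2f}x more"

print(describe("LPU Inference Engine", "MI210", vram_gb))
print(describe("MI210", "A100 SXM", vram_gb))
print(describe("A100 SXM", "GeForce GT 710", vram_gb))
```

The same division applied to the FP32, FP32-eq, and memory bandwidth figures should give the other entries, with "is N.NNx faster" used in place of "has N.NNx more" for the compute metrics.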

Pricing

Price Type	LPU Inference Engine	MI210	Data Center GPU Max 1100	GeForce GT 710	GeForce GTX 1660	A100 SXM
CAPEX (Street Price)	—	$6,000	$5,000	—	—	$15,000
OPEX (per hour)	—	—	—	$0.07/hr	$0.04/hr	$4.05/hr
Price per TFLOPs (FP32-eq)	—	$66	$56	—	—	$96
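The "Price per TFLOPs (FP32-eq)" row can be reproduced from the street price and the FP16 figures in the Specifications table. As a sketch, it assumes the FP32-eq value is half the FP16 TFLOPs, which is inferred from the published numbers rather than stated by the page. The names used here are hypothetical.

```python
# Street prices (USD) and FP16 TFLOPs copied from the tables above.
# Only devices with both a listed street price and an FP16 figure are included.
street_price_usd = {
    "MI210": 6000,
    "Data Center GPU Max 1100": 5000,
    "A100 SXM": 15000,
}
fp16_tflops = {
    "MI210": 181,
    "Data Center GPU Max 1100": 177,
    "A100 SXM": 312,
}

def price_per_fp32_eq_tflop(name):
    # Assumption: FP32-equivalent throughput is half the FP16 figure.
    return street_price_usd[name] / (fp16_tflops[name] / 2)

for name in street_price_usd:
    print(f"{name}: ${price_per_fp32_eq_tflop(name):.0f} per TFLOP")
```

For the MI210 this gives 6,000 / 90.5, or roughly $66, in line with the table.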