Compare XPUs

Select up to 5 XPUs to compare side-by-side

Select XPUs to Compare

Vendor	XPU	TFLOPs
Alibaba	Hanguang 800	—
AMD	MI100	23.1
AMD	MI210	181
AMD	MI250X	383
AMD	MI300X	1,307
AMD	MI325X	1,400
AMD	MI350X	2,100
AMD	MI355X	5,300
AMD	Radeon Pro V520	23.04
AWS	Inferentia2	190
AWS	Trainium	190
AWS	Trainium2	680
Baidu	Kunlun II	—
Biren Technology	BR100	—
Cambricon	MLU370	256
Cerebras	WSE-3	—
Enflame Technology	CloudBlazer T20	—
FuriosaAI	RNGD (Renegade)	256
FuriosaAI	Warboy	—
Google	TPU v4	275
Google	TPU v5e	197
Google	TPU v5p	459
Graphcore	Bow IPU	—
Graphcore	IPU-M2000	—
Groq	LPU Inference Engine	—
Huawei	Ascend 910B	—
Iluvatar CoreX	BI-V150	300
Intel	Data Center GPU Max 1100	177
Intel	Data Center GPU Max 1550	419
Intel Habana	Gaudi 2	432
Intel Habana	Gaudi 3	1,835
Multi-Metric Comparison

Relative performance across 5 key metrics, normalized so that 100 = best in the comparison. Chart axes recoverable from the page: Compute Performance (BF16), Memory Capacity, Power Consumption, Power Efficiency.
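The "100 = best in comparison" scaling can be sketched as follows. This is a minimal illustration, not the page's actual code: the helper name `normalize_to_best` is hypothetical, and it assumes a metric where higher is better (Memory Capacity is used here, with the VRAM values from the Specifications table). How the page scores Power Consumption, where lower is presumably better, is not stated and is not modeled.

```python
def normalize_to_best(values):
    """Scale a higher-is-better metric so the largest value maps to 100.

    Missing entries (None) are passed through unchanged.
    """
    present = [v for v in values.values() if v is not None]
    best = max(present)
    return {
        name: None if v is None else round(100 * v / best, 1)
        for name, v in values.items()
    }


# VRAM in GB, taken from the Specifications table.
vram_gb = {"L4": 24, "MI250X": 128, "P40": 24, "GeForce GT 710": 2, "P100": 16}

scores = normalize_to_best(vram_gb)
for name, score in scores.items():
    print(f"{name}: {score}")
```

With this scaling the device with the most VRAM (MI250X) scores 100 and every other device is expressed as a percentage of it.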

Specifications

Specification	NVIDIA L4	AMD MI250X	NVIDIA P40	NVIDIA GeForce GT 710	NVIDIA P100
Architecture	Ada Lovelace	CDNA 2	Pascal	Kepler	Pascal
Form Factor	PCIe	OAM	PCIe	PCIe	SXM
VRAM	24 GB	128 GB	24 GB	2 GB	16 GB
Memory Bandwidth	300 GB/s	3,277 GB/s	346 GB/s	14.4 GB/s	732 GB/s
TFLOPs (FP32)	30.3	47.9	12	0.366	9.3
TFLOPs (FP16)	242	383	—	—	—
TFLOPs	121	383	12	0.366	9.3
TFLOPs (FP8)	485	—	—	—	—
TDP	72 W	560 W	250 W	19 W	300 W
Launch Date	Mar 2023	Nov 2021	Sep 2016	Mar 2014	Apr 2016

Efficiency Metrics

Metric	L4	MI250X	P40	GeForce GT 710	P100
TFLOPs per Watt (FP32-eq)	0.84	0.34	0.05	0.02	0.03
Memory Bandwidth per GB of VRAM (GB/s per GB)	12.5	25.6	14.4	7.2	45.8
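The efficiency figures can be reproduced from the Specifications table with a short sketch. The page does not define "FP32-eq"; the numbers are consistent with halving the unlabeled "TFLOPs" row when it is a 16-bit figure and using it unchanged otherwise, so that rule is treated here as an assumption. The `Xpu` class, the `headline_is_16bit` flag, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Xpu:
    name: str
    headline_tflops: float   # value from the unlabeled "TFLOPs" row
    headline_is_16bit: bool  # assumption: True where the table also lists an FP16 figure
    vram_gb: float
    bandwidth_gbs: float
    tdp_w: float


def fp32_equivalent(x: Xpu) -> float:
    """Assumed rule: halve a 16-bit headline figure, keep an FP32 one as-is."""
    return x.headline_tflops / 2 if x.headline_is_16bit else x.headline_tflops


def tflops_per_watt(x: Xpu) -> float:
    return fp32_equivalent(x) / x.tdp_w


def bandwidth_per_gb(x: Xpu) -> float:
    """Memory bandwidth (GB/s) divided by VRAM capacity (GB)."""
    return x.bandwidth_gbs / x.vram_gb


XPUS = [
    Xpu("L4", 121, True, 24, 300, 72),
    Xpu("MI250X", 383, True, 128, 3277, 560),
    Xpu("P40", 12, False, 24, 346, 250),
    Xpu("GeForce GT 710", 0.366, False, 2, 14.4, 19),
    Xpu("P100", 9.3, False, 16, 732, 300),
]

for x in XPUS:
    print(f"{x.name}: {tflops_per_watt(x):.2f} TFLOPs/W, "
          f"{bandwidth_per_gb(x):.1f} GB/s per GB")
```

For the L4, for example, this gives 121 / 2 / 72, which rounds to the 0.84 shown in the table.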

Performance Equivalence

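How many units of each GPU are needed to match the performance of the others? A value below 1.00x means a single unit of the compared GPU already exceeds the target on that metric.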

To match 1x NVIDIA L4

AMD MI250X

Compute (FP32-eq)

0.32x

MI250X is 3.17x faster

FP32 Compute

0.63x

MI250X is 1.58x faster

VRAM

0.19x

MI250X has 5.33x more

Memory Bandwidth

0.09x

MI250X has 10.92x more

NVIDIA P40

Compute (FP32-eq)

5.04x

Need 5.04x P40

FP32 Compute

2.53x

Need 2.53x P40

VRAM

1.00x

P40 has the same VRAM

Memory Bandwidth

0.87x

P40 has 1.15x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

165.30x

Need 165.30x GeForce GT 710

FP32 Compute

82.79x

Need 82.79x GeForce GT 710

VRAM

12.00x

Need 12.00x GeForce GT 710

Memory Bandwidth

20.83x

Need 20.83x GeForce GT 710

NVIDIA P100

Compute (FP32-eq)

6.51x

Need 6.51x P100

FP32 Compute

3.26x

Need 3.26x P100

VRAM

1.50x

Need 1.50x P100

Memory Bandwidth

0.41x

P100 has 2.44x more

To match 1x AMD MI250X

NVIDIA L4

Compute (FP32-eq)

3.17x

Need 3.17x L4

FP32 Compute

1.58x

Need 1.58x L4

VRAM

5.33x

Need 5.33x L4

Memory Bandwidth

10.92x

Need 10.92x L4

NVIDIA P40

Compute (FP32-eq)

15.96x

Need 15.96x P40

FP32 Compute

3.99x

Need 3.99x P40

VRAM

5.33x

Need 5.33x P40

Memory Bandwidth

9.47x

Need 9.47x P40

NVIDIA GeForce GT 710

Compute (FP32-eq)

523.22x

Need 523.22x GeForce GT 710

FP32 Compute

130.87x

Need 130.87x GeForce GT 710

VRAM

64.00x

Need 64.00x GeForce GT 710

Memory Bandwidth

227.57x

Need 227.57x GeForce GT 710

NVIDIA P100

Compute (FP32-eq)

20.59x

Need 20.59x P100

FP32 Compute

5.15x

Need 5.15x P100

VRAM

8.00x

Need 8.00x P100

Memory Bandwidth

4.48x

Need 4.48x P100

To match 1x NVIDIA P40

NVIDIA L4

Compute (FP32-eq)

0.20x

L4 is 5.04x faster

FP32 Compute

0.40x

L4 is 2.53x faster

VRAM

1.00x

L4 has the same VRAM

Memory Bandwidth

1.15x

Need 1.15x L4

AMD MI250X

Compute (FP32-eq)

0.06x

MI250X is 15.96x faster

FP32 Compute

0.25x

MI250X is 3.99x faster

VRAM

0.19x

MI250X has 5.33x more

Memory Bandwidth

0.11x

MI250X has 9.47x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

32.79x

Need 32.79x GeForce GT 710

FP32 Compute

32.79x

Need 32.79x GeForce GT 710

VRAM

12.00x

Need 12.00x GeForce GT 710

Memory Bandwidth

24.03x

Need 24.03x GeForce GT 710

NVIDIA P100

Compute (FP32-eq)

1.29x

Need 1.29x P100

FP32 Compute

1.29x

Need 1.29x P100

VRAM

1.50x

Need 1.50x P100

Memory Bandwidth

0.47x

P100 has 2.12x more

To match 1x NVIDIA GeForce GT 710

NVIDIA L4

Compute (FP32-eq)

0.01x

L4 is 165.30x faster

FP32 Compute

0.01x

L4 is 82.79x faster

VRAM

0.08x

L4 has 12.00x more

Memory Bandwidth

0.05x

L4 has 20.83x more

AMD MI250X

Compute (FP32-eq)

<0.01x

MI250X is 523.22x faster

FP32 Compute

0.01x

MI250X is 130.87x faster

VRAM

0.02x

MI250X has 64.00x more

Memory Bandwidth

<0.01x

MI250X has 227.57x more

NVIDIA P40

Compute (FP32-eq)

0.03x

P40 is 32.79x faster

FP32 Compute

0.03x

P40 is 32.79x faster

VRAM

0.08x

P40 has 12.00x more

Memory Bandwidth

0.04x

P40 has 24.03x more

NVIDIA P100

Compute (FP32-eq)

0.04x

P100 is 25.41x faster

FP32 Compute

0.04x

P100 is 25.41x faster

VRAM

0.13x

P100 has 8.00x more

Memory Bandwidth

0.02x

P100 has 50.83x more

To match 1x NVIDIA P100

NVIDIA L4

Compute (FP32-eq)

0.15x

L4 is 6.51x faster

FP32 Compute

0.31x

L4 is 3.26x faster

VRAM

0.67x

L4 has 1.50x more

Memory Bandwidth

2.44x

Need 2.44x L4

AMD MI250X

Compute (FP32-eq)

0.05x

MI250X is 20.59x faster

FP32 Compute

0.19x

MI250X is 5.15x faster

VRAM

0.13x

MI250X has 8.00x more

Memory Bandwidth

0.22x

MI250X has 4.48x more

NVIDIA P40

Compute (FP32-eq)

0.78x

P40 is 1.29x faster

FP32 Compute

0.78x

P40 is 1.29x faster

VRAM

0.67x

P40 has 1.50x more

Memory Bandwidth

2.12x

Need 2.12x P40

NVIDIA GeForce GT 710

Compute (FP32-eq)

25.41x

Need 25.41x GeForce GT 710

FP32 Compute

25.41x

Need 25.41x GeForce GT 710

VRAM

8.00x

Need 8.00x GeForce GT 710

Memory Bandwidth

50.83x

Need 50.83x GeForce GT 710
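The equivalence figures above are consistent with a simple ratio: the target's value divided by the candidate's value for the same metric. The sketch below illustrates that reading; the function names and the wording of the equal case are hypothetical, and the FP32-eq inputs (60.5 for the L4, 191.5 for the MI250X) assume the 16-bit headline figure is halved.

```python
def units_needed(target_value: float, candidate_value: float) -> float:
    """Units of the candidate needed to match one unit of the target."""
    return target_value / candidate_value


def describe(candidate: str, target_value: float, candidate_value: float,
             comparative: str = "is {:.2f}x faster"):
    """Return (ratio label, note) in the style used on the page."""
    ratio = units_needed(target_value, candidate_value)
    if ratio > 1:
        note = f"Need {ratio:.2f}x {candidate}"
    elif ratio < 1:
        # The candidate exceeds the target; report the inverse ratio.
        note = f"{candidate} " + comparative.format(1 / ratio)
    else:
        note = f"{candidate} is equal"
    return f"{ratio:.2f}x", note


# To match 1x NVIDIA L4 on Compute (FP32-eq):
print(describe("MI250X", 60.5, 191.5))
print(describe("GeForce GT 710", 60.5, 0.366))

# To match 1x NVIDIA L4 on VRAM (GB):
print(describe("P100", 24, 16))
print(describe("MI250X", 24, 128, comparative="has {:.2f}x more"))
```

Because each pair is a reciprocal, the "To match 1x A" entry for B and the "To match 1x B" entry for A always multiply to 1 before rounding.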

Pricing

Price Type	L4	MI250X	P40	GeForce GT 710	P100
CAPEX (Street Price)	$4,000	$12,000	—	—	—
OPEX (per hour)	$0.80/hr	$2.00/hr	$2.07/hr	$0.07/hr	$0.28/hr
Price per TFLOPs (FP32-eq)	$66	$63	—	—	—
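The "Price per TFLOPs (FP32-eq)" row appears to be the street price divided by the FP32-equivalent throughput. A minimal sketch of that arithmetic, assuming the FP32-eq values of 60.5 (L4) and 191.5 (MI250X) obtained by halving the 16-bit headline figures; the function name is hypothetical, and devices without a listed street price yield no value.

```python
from typing import Optional


def price_per_tflop(capex_usd: Optional[float],
                    fp32_eq_tflops: float) -> Optional[float]:
    """Street price divided by FP32-equivalent TFLOPs; None if no price is listed."""
    if capex_usd is None:
        return None
    return capex_usd / fp32_eq_tflops


# (street price in USD, assumed FP32-eq TFLOPs); None marks a missing price.
devices = {
    "L4": (4000, 60.5),
    "MI250X": (12000, 191.5),
    "P40": (None, 12),
}

for name, (capex, tflops) in devices.items():
    value = price_per_tflop(capex, tflops)
    label = "n/a" if value is None else f"${value:.0f}"
    print(f"{name}: {label} per TFLOP")
```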