Compare XPUs

Select up to 5 XPUs to compare side-by-side

Select XPUs to Compare

Showing 128 XPUs • 7 selected

Vendor	XPU	TFLOPs
Alibaba	Hanguang 800	—
AMD	MI100	23.1
AMD	MI210	181
AMD	MI250X	383
AMD	MI300X	1,307
AMD	MI325X	1,400
AMD	MI350X	2,100
AMD	MI355X	5,300
AMD	Radeon Pro V520	23.04
AWS	Inferentia2	190
AWS	Trainium	190
AWS	Trainium2	680
Baidu	Kunlun II	—
Biren Technology	BR100	—
Cambricon	MLU370	256
Cerebras	WSE-3	—
Enflame Technology	CloudBlazer T20	—
FuriosaAI	RNGD (Renegade)	256
FuriosaAI	Warboy	—
Google	TPU v4	275
Google	TPU v5e	197
Google	TPU v5p	459
Graphcore	Bow IPU	—
Graphcore	IPU-M2000	—
Groq	LPU Inference Engine	—
Huawei	Ascend 910B	—
Iluvatar CoreX	BI-V150	300
Intel	Data Center GPU Max 1100	177
Intel	Data Center GPU Max 1550	419
Intel Habana	Gaudi 2	432
Intel Habana	Gaudi 3	1,835

Multi-Metric Comparison

Relative performance across 5 key metrics (normalized to 100 = best in comparison)

[Chart panels: Compute Performance (BF16), Memory Capacity, Power Consumption, Power Efficiency]
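
The "normalized to 100 = best in comparison" scaling can be sketched as follows. This is a minimal illustration, not the page's actual code: the helper name `normalize_to_best` is hypothetical, and treating a lower value as better for metrics such as power consumption is an assumption. The VRAM figures come from the Specifications table.

```python
# VRAM in GB, copied from the Specifications table.
vram_gb = {
    "Gaudi 2": 96, "MI300X": 192, "GeForce GT 710": 2, "RTX 5000": 16,
    "RTX A2000": 12, "L40": 48, "Quadro M4000": 8,
}

def normalize_to_best(values, higher_is_better=True):
    """Scale values so the best entry scores 100 (hypothetical helper)."""
    if higher_is_better:
        best = max(values.values())
        return {name: 100 * v / best for name, v in values.items()}
    # Assumption: for "lower is better" metrics the smallest value scores 100.
    best = min(values.values())
    return {name: 100 * best / v for name, v in values.items()}

scores = normalize_to_best(vram_gb)
print(scores["MI300X"], scores["Gaudi 2"], scores["L40"])
```

Under this scheme the MI300X sets the VRAM baseline, and every other device is expressed as a percentage of it.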

Specifications

Specification	Intel Habana Gaudi 2	AMD MI300X	NVIDIA GeForce GT 710	NVIDIA RTX 5000	NVIDIA RTX A2000	NVIDIA L40	NVIDIA Quadro M4000
Architecture	Gaudi Gen2	CDNA 3	Kepler	Turing	Ampere	Ada Lovelace	Maxwell
Form Factor	OAM	OAM	PCIe	PCIe	PCIe	PCIe	PCIe
VRAM	96 GB	192 GB	2 GB	16 GB	12 GB	48 GB	8 GB
Memory Bandwidth	2,450 GB/s	5,300 GB/s	14.4 GB/s	448 GB/s	288 GB/s	864 GB/s	192 GB/s
TFLOPs (FP32)	—	163.4	0.366	11.2	8	45	2.57
TFLOPs (FP16)	—	1,307	—	—	—	—	—
TFLOPs	432	1,307	0.366	89.2	16	181.05	2.57
TFLOPs (FP8)	—	2,614	—	—	—	—	—
TDP	600 W	750 W	19 W	265 W	70 W	300 W	120 W
Launch Date	May 2022	Dec 2023	Mar 2014	Aug 2018	Oct 2021	Oct 2022	Jun 2015

Efficiency Metrics

Metric	Gaudi 2	MI300X	GeForce GT 710	RTX 5000	RTX A2000	L40	Quadro M4000
TFLOPs per Watt (FP32-eq)	0.36	0.87	0.02	0.17	0.11	0.30	0.02
Memory Bandwidth per GB	25.5 GB/s	27.6 GB/s	7.2 GB/s	28.0 GB/s	24.0 GB/s	18.0 GB/s	24.0 GB/s
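
The two efficiency rows can be reproduced from the Specifications table with the sketch below. The "FP32-eq" rule used here is an assumption inferred from the published numbers, not something the page documents: a headline TFLOPs figure that differs from the FP32 figure appears to be halved, and a headline that equals the FP32 figure is used unchanged. The names `specs`, `fp32_eq`, and `efficiency` are illustrative.

```python
# (headline TFLOPs, FP32 TFLOPs or None, TDP in W, VRAM in GB, bandwidth in GB/s)
specs = {
    "Gaudi 2": (432, None, 600, 96, 2450),
    "MI300X": (1307, 163.4, 750, 192, 5300),
    "GeForce GT 710": (0.366, 0.366, 19, 2, 14.4),
    "RTX 5000": (89.2, 11.2, 265, 16, 448),
    "RTX A2000": (16, 8, 70, 12, 288),
    "L40": (181.05, 45, 300, 48, 864),
    "Quadro M4000": (2.57, 2.57, 120, 8, 192),
}

def fp32_eq(headline, fp32):
    # Assumption inferred from the table: a headline figure that is not the
    # FP32 figure is treated as a 16-bit number and halved.
    return headline if headline == fp32 else headline / 2

def efficiency(name):
    """Return (TFLOPs per watt, GB/s of bandwidth per GB of VRAM)."""
    headline, fp32, tdp, vram, bandwidth = specs[name]
    return round(fp32_eq(headline, fp32) / tdp, 2), round(bandwidth / vram, 1)

for name in specs:
    print(name, efficiency(name))
```

With this rule the sketch yields the same rounded values as the table for all seven devices, which is the only evidence for the halving assumption.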

Performance Equivalence

How many units of each GPU are needed to match the performance of the others?
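
Each figure in this section appears to be a simple ratio of the target device's value to the candidate's value. The sketch below reproduces a few entries from the VRAM and memory bandwidth rows of the Specifications table; the function names and output wording are illustrative and not taken from the page.

```python
def units_needed(target, candidate):
    """Fractional count of candidate units needed to equal one target unit."""
    return target / candidate

def describe(target, candidate, candidate_name):
    n = units_needed(target, candidate)
    if n >= 1:
        return f"{n:.2f}x: need {n:.2f}x {candidate_name}"
    # Below 1.0 the candidate is the larger device, so report the inverse.
    return f"{n:.2f}x: {candidate_name} has {1 / n:.2f}x more"

# To match 1x Gaudi 2 (96 GB VRAM, 2,450 GB/s memory bandwidth):
print(describe(96, 192, "MI300X"))             # VRAM
print(describe(2450, 5300, "MI300X"))          # memory bandwidth
print(describe(2450, 14.4, "GeForce GT 710"))  # memory bandwidth
```

The counts are fractional. For an actual deployment they would be rounded up to whole devices, and the ratios ignore interconnect and scaling overhead.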

To match 1x Intel Habana Gaudi 2

AMD MI300X

Compute (FP32-eq)

0.33x

MI300X is 3.03x faster

VRAM

0.50x

MI300X has 2.00x more

Memory Bandwidth

0.46x

MI300X has 2.16x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

590.16x

Need 590.16x GeForce GT 710

VRAM

48.00x

Need 48.00x GeForce GT 710

Memory Bandwidth

170.14x

Need 170.14x GeForce GT 710

NVIDIA RTX 5000

Compute (FP32-eq)

4.84x

Need 4.84x RTX 5000

VRAM

6.00x

Need 6.00x RTX 5000

Memory Bandwidth

5.47x

Need 5.47x RTX 5000

NVIDIA RTX A2000

Compute (FP32-eq)

27.00x

Need 27.00x RTX A2000

VRAM

8.00x

Need 8.00x RTX A2000

Memory Bandwidth

8.51x

Need 8.51x RTX A2000

NVIDIA L40

Compute (FP32-eq)

2.39x

Need 2.39x L40

VRAM

2.00x

Need 2.00x L40

Memory Bandwidth

2.84x

Need 2.84x L40

NVIDIA Quadro M4000

Compute (FP32-eq)

84.05x

Need 84.05x Quadro M4000

VRAM

12.00x

Need 12.00x Quadro M4000

Memory Bandwidth

12.76x

Need 12.76x Quadro M4000

To match 1x AMD MI300X

Intel Habana Gaudi 2

Compute (FP32-eq)

3.03x

Need 3.03x Gaudi 2

VRAM

2.00x

Need 2.00x Gaudi 2

Memory Bandwidth

2.16x

Need 2.16x Gaudi 2

NVIDIA GeForce GT 710

Compute (FP32-eq)

1785.52x

Need 1785.52x GeForce GT 710

FP32 Compute

446.45x

Need 446.45x GeForce GT 710

VRAM

96.00x

Need 96.00x GeForce GT 710

Memory Bandwidth

368.06x

Need 368.06x GeForce GT 710

NVIDIA RTX 5000

Compute (FP32-eq)

14.65x

Need 14.65x RTX 5000

FP32 Compute

14.59x

Need 14.59x RTX 5000

VRAM

12.00x

Need 12.00x RTX 5000

Memory Bandwidth

11.83x

Need 11.83x RTX 5000

NVIDIA RTX A2000

Compute (FP32-eq)

81.69x

Need 81.69x RTX A2000

FP32 Compute

20.43x

Need 20.43x RTX A2000

VRAM

16.00x

Need 16.00x RTX A2000

Memory Bandwidth

18.40x

Need 18.40x RTX A2000

NVIDIA L40

Compute (FP32-eq)

7.22x

Need 7.22x L40

FP32 Compute

3.63x

Need 3.63x L40

VRAM

4.00x

Need 4.00x L40

Memory Bandwidth

6.13x

Need 6.13x L40

NVIDIA Quadro M4000

Compute (FP32-eq)

254.28x

Need 254.28x Quadro M4000

FP32 Compute

63.58x

Need 63.58x Quadro M4000

VRAM

24.00x

Need 24.00x Quadro M4000

Memory Bandwidth

27.60x

Need 27.60x Quadro M4000

To match 1x NVIDIA GeForce GT 710

Intel Habana Gaudi 2

Compute (FP32-eq)

<0.01x

Gaudi 2 is 590.16x faster

VRAM

0.02x

Gaudi 2 has 48.00x more

Memory Bandwidth

0.01x

Gaudi 2 has 170.14x more

AMD MI300X

Compute (FP32-eq)

<0.01x

MI300X is 1785.52x faster

FP32 Compute

<0.01x

MI300X is 446.45x faster

VRAM

0.01x

MI300X has 96.00x more

Memory Bandwidth

<0.01x

MI300X has 368.06x more

NVIDIA RTX 5000

Compute (FP32-eq)

0.01x

RTX 5000 is 121.86x faster

FP32 Compute

0.03x

RTX 5000 is 30.60x faster

VRAM

0.13x

RTX 5000 has 8.00x more

Memory Bandwidth

0.03x

RTX 5000 has 31.11x more

NVIDIA RTX A2000

Compute (FP32-eq)

0.05x

RTX A2000 is 21.86x faster

FP32 Compute

0.05x

RTX A2000 is 21.86x faster

VRAM

0.17x

RTX A2000 has 6.00x more

Memory Bandwidth

0.05x

RTX A2000 has 20.00x more

NVIDIA L40

Compute (FP32-eq)

<0.01x

L40 is 247.34x faster

FP32 Compute

0.01x

L40 is 122.95x faster

VRAM

0.04x

L40 has 24.00x more

Memory Bandwidth

0.02x

L40 has 60.00x more

NVIDIA Quadro M4000

Compute (FP32-eq)

0.14x

Quadro M4000 is 7.02x faster

FP32 Compute

0.14x

Quadro M4000 is 7.02x faster

VRAM

0.25x

Quadro M4000 has 4.00x more

Memory Bandwidth

0.07x

Quadro M4000 has 13.33x more

To match 1x NVIDIA RTX 5000

Intel Habana Gaudi 2

Compute (FP32-eq)

0.21x

Gaudi 2 is 4.84x faster

VRAM

0.17x

Gaudi 2 has 6.00x more

Memory Bandwidth

0.18x

Gaudi 2 has 5.47x more

AMD MI300X

Compute (FP32-eq)

0.07x

MI300X is 14.65x faster

FP32 Compute

0.07x

MI300X is 14.59x faster

VRAM

0.08x

MI300X has 12.00x more

Memory Bandwidth

0.08x

MI300X has 11.83x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

121.86x

Need 121.86x GeForce GT 710

FP32 Compute

30.60x

Need 30.60x GeForce GT 710

VRAM

8.00x

Need 8.00x GeForce GT 710

Memory Bandwidth

31.11x

Need 31.11x GeForce GT 710

NVIDIA RTX A2000

Compute (FP32-eq)

5.58x

Need 5.58x RTX A2000

FP32 Compute

1.40x

Need 1.40x RTX A2000

VRAM

1.33x

Need 1.33x RTX A2000

Memory Bandwidth

1.56x

Need 1.56x RTX A2000

NVIDIA L40

Compute (FP32-eq)

0.49x

L40 is 2.03x faster

FP32 Compute

0.25x

L40 is 4.02x faster

VRAM

0.33x

L40 has 3.00x more

Memory Bandwidth

0.52x

L40 has 1.93x more

NVIDIA Quadro M4000

Compute (FP32-eq)

17.35x

Need 17.35x Quadro M4000

FP32 Compute

4.36x

Need 4.36x Quadro M4000

VRAM

2.00x

Need 2.00x Quadro M4000

Memory Bandwidth

2.33x

Need 2.33x Quadro M4000

To match 1x NVIDIA RTX A2000

Intel Habana Gaudi 2

Compute (FP32-eq)

0.04x

Gaudi 2 is 27.00x faster

VRAM

0.13x

Gaudi 2 has 8.00x more

Memory Bandwidth

0.12x

Gaudi 2 has 8.51x more

AMD MI300X

Compute (FP32-eq)

0.01x

MI300X is 81.69x faster

FP32 Compute

0.05x

MI300X is 20.43x faster

VRAM

0.06x

MI300X has 16.00x more

Memory Bandwidth

0.05x

MI300X has 18.40x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

21.86x

Need 21.86x GeForce GT 710

FP32 Compute

21.86x

Need 21.86x GeForce GT 710

VRAM

6.00x

Need 6.00x GeForce GT 710

Memory Bandwidth

20.00x

Need 20.00x GeForce GT 710

NVIDIA RTX 5000

Compute (FP32-eq)

0.18x

RTX 5000 is 5.58x faster

FP32 Compute

0.71x

RTX 5000 is 1.40x faster

VRAM

0.75x

RTX 5000 has 1.33x more

Memory Bandwidth

0.64x

RTX 5000 has 1.56x more

NVIDIA L40

Compute (FP32-eq)

0.09x

L40 is 11.32x faster

FP32 Compute

0.18x

L40 is 5.63x faster

VRAM

0.25x

L40 has 4.00x more

Memory Bandwidth

0.33x

L40 has 3.00x more

NVIDIA Quadro M4000

Compute (FP32-eq)

3.11x

Need 3.11x Quadro M4000

FP32 Compute

3.11x

Need 3.11x Quadro M4000

VRAM

1.50x

Need 1.50x Quadro M4000

Memory Bandwidth

1.50x

Need 1.50x Quadro M4000

To match 1x NVIDIA L40

Intel Habana Gaudi 2

Compute (FP32-eq)

0.42x

Gaudi 2 is 2.39x faster

VRAM

0.50x

Gaudi 2 has 2.00x more

Memory Bandwidth

0.35x

Gaudi 2 has 2.84x more

AMD MI300X

Compute (FP32-eq)

0.14x

MI300X is 7.22x faster

FP32 Compute

0.28x

MI300X is 3.63x faster

VRAM

0.25x

MI300X has 4.00x more

Memory Bandwidth

0.16x

MI300X has 6.13x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

247.34x

Need 247.34x GeForce GT 710

FP32 Compute

122.95x

Need 122.95x GeForce GT 710

VRAM

24.00x

Need 24.00x GeForce GT 710

Memory Bandwidth

60.00x

Need 60.00x GeForce GT 710

NVIDIA RTX 5000

Compute (FP32-eq)

2.03x

Need 2.03x RTX 5000

FP32 Compute

4.02x

Need 4.02x RTX 5000

VRAM

3.00x

Need 3.00x RTX 5000

Memory Bandwidth

1.93x

Need 1.93x RTX 5000

NVIDIA RTX A2000

Compute (FP32-eq)

11.32x

Need 11.32x RTX A2000

FP32 Compute

5.63x

Need 5.63x RTX A2000

VRAM

4.00x

Need 4.00x RTX A2000

Memory Bandwidth

3.00x

Need 3.00x RTX A2000

NVIDIA Quadro M4000

Compute (FP32-eq)

35.22x

Need 35.22x Quadro M4000

FP32 Compute

17.51x

Need 17.51x Quadro M4000

VRAM

6.00x

Need 6.00x Quadro M4000

Memory Bandwidth

4.50x

Need 4.50x Quadro M4000

To match 1x NVIDIA Quadro M4000

Intel Habana Gaudi 2

Compute (FP32-eq)

0.01x

Gaudi 2 is 84.05x faster

VRAM

0.08x

Gaudi 2 has 12.00x more

Memory Bandwidth

0.08x

Gaudi 2 has 12.76x more

AMD MI300X

Compute (FP32-eq)

<0.01x

MI300X is 254.28x faster

FP32 Compute

0.02x

MI300X is 63.58x faster

VRAM

0.04x

MI300X has 24.00x more

Memory Bandwidth

0.04x

MI300X has 27.60x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

7.02x

Need 7.02x GeForce GT 710

FP32 Compute

7.02x

Need 7.02x GeForce GT 710

VRAM

4.00x

Need 4.00x GeForce GT 710

Memory Bandwidth

13.33x

Need 13.33x GeForce GT 710

NVIDIA RTX 5000

Compute (FP32-eq)

0.06x

RTX 5000 is 17.35x faster

FP32 Compute

0.23x

RTX 5000 is 4.36x faster

VRAM

0.50x

RTX 5000 has 2.00x more

Memory Bandwidth

0.43x

RTX 5000 has 2.33x more

NVIDIA RTX A2000

Compute (FP32-eq)

0.32x

RTX A2000 is 3.11x faster

FP32 Compute

0.32x

RTX A2000 is 3.11x faster

VRAM

0.67x

RTX A2000 has 1.50x more

Memory Bandwidth

0.67x

RTX A2000 has 1.50x more

NVIDIA L40

Compute (FP32-eq)

0.03x

L40 is 35.22x faster

FP32 Compute

0.06x

L40 is 17.51x faster

VRAM

0.17x

L40 has 6.00x more

Memory Bandwidth

0.22x

L40 has 4.50x more

Pricing

Price Type	Gaudi 2	MI300X	GeForce GT 710	RTX 5000	RTX A2000	L40	Quadro M4000
CAPEX (Street Price)	—	$35,000	—	—	—	—	—
OPEX (per hour)	—	$10.40/hr	$0.07/hr	$0.82/hr	$0.04/hr	$0.69/hr	$0.45/hr
Price per TFLOPs (FP32-eq)	—	$54	—	—	—	—	—
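
The only populated price-per-TFLOPs cell can be checked with a short calculation. This sketch assumes the MI300X "FP32-eq" figure is its 1,307 TFLOPs headline value halved, which is inferred from the page's numbers rather than stated by it; `price_per_tflop` is an illustrative name.

```python
def price_per_tflop(street_price_usd, fp32_eq_tflops):
    """Street price divided by FP32-equivalent throughput."""
    return street_price_usd / fp32_eq_tflops

# Assumption: FP32-eq for the MI300X is its headline figure divided by two.
mi300x_fp32_eq = 1307 / 2
cost = price_per_tflop(35_000, mi300x_fp32_eq)
print(f"${cost:.0f} per TFLOP (FP32-eq)")
```

The remaining devices have no street price listed, so the same calculation cannot be applied to them.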