Compare XPUs

Select up to 5 XPUs to compare side-by-side

Vendor	XPU	TFLOPs
Alibaba	Hanguang 800	—
AMD	MI100	23.1
AMD	MI210	181
AMD	MI250X	383
AMD	MI300X	1,307
AMD	MI325X	1,400
AMD	MI350X	2,100
AMD	MI355X	5,300
AMD	Radeon Pro V520	23.04
AWS	Inferentia2	190
AWS	Trainium	190
AWS	Trainium2	680
Baidu	Kunlun II	—
Biren Technology	BR100	—
Cambricon	MLU370	256
Cerebras	WSE-3	—
Enflame Technology	CloudBlazer T20	—
FuriosaAI	RNGD (Renegade)	256
FuriosaAI	Warboy	—
Google	TPU v4	275
Google	TPU v5e	197
Google	TPU v5p	459
Graphcore	Bow IPU	—
Graphcore	IPU-M2000	—
Groq	LPU Inference Engine	—
Huawei	Ascend 910B	—
Iluvatar CoreX	BI-V150	300
Intel	Data Center GPU Max 1100	177
Intel	Data Center GPU Max 1550	419
Intel Habana	Gaudi 2	432
Intel Habana	Gaudi 3	1,835

Multi-Metric Comparison

Relative performance across 5 key metrics (normalized to 100 = best in comparison)

Chart axes listed: Compute Performance (BF16), Memory Capacity, Power Consumption, Power Efficiency
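
The "normalized to 100 = best in comparison" scaling can be sketched as follows. This is a minimal illustration, assuming each score is the value divided by the best value in the comparison, times 100, and that higher is better. The page does not state its exact formula, and a lower-is-better metric such as power consumption would presumably need inverting first. The function name `normalize_to_best` is hypothetical; the VRAM figures are taken from the Specifications table.

```python
def normalize_to_best(values):
    """Scale higher-is-better values so the best one scores 100.

    Assumption: score = 100 * value / best. The page does not
    publish its formula, so this is only one plausible reading.
    """
    best = max(values.values())
    return {name: round(100 * value / best, 2) for name, value in values.items()}


# VRAM in GB, from the Specifications table.
vram_gb = {
    "Gaudi 3": 128,
    "RTX 4500 Ada Generation": 24,
    "GeForce RTX 5060 Ti": 16,
    "GeForce GT 710": 2,
    "RTX 4000": 8,
}

scores = normalize_to_best(vram_gb)
for name, score in scores.items():
    print(f"{name}: {score}")
```

Under this assumption Gaudi 3 scores 100 on Memory Capacity and every other card is expressed as a percentage of it.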

Specifications

Specification	Intel Habana Gaudi 3	NVIDIA RTX 4500 Ada Generation	NVIDIA GeForce RTX 5060 Ti	NVIDIA GeForce GT 710	NVIDIA RTX 4000
Architecture	Gaudi Gen3	Ada Lovelace	Blackwell	Kepler	Turing
Form Factor	OAM	PCIe	PCIe	PCIe	PCIe
VRAM	128 GB	24 GB	16 GB	2 GB	8 GB
Memory Bandwidth	3,700 GB/s	576 GB/s	544 GB/s	14.4 GB/s	416 GB/s
TFLOPs (FP32)	—	48.5	25	0.366	7.1
TFLOPs (FP16)	—	—	—	—	—
TFLOPs	1,835	97	88	0.366	57.6
TFLOPs (FP8)	3,670	—	—	—	—
TDP	900 W	210 W	220 W	19 W	160 W
Launch Date	Apr 2024	Mar 2023	Mar 2025	Mar 2014	Nov 2018

Efficiency Metrics

Metric	Gaudi 3	RTX 4500 Ada Generation	GeForce RTX 5060 Ti	GeForce GT 710	RTX 4000
TFLOPs per Watt (FP32-eq)	1.02	0.23	0.20	0.02	0.18
Memory Bandwidth per GB	28.9 GB/s	24.0 GB/s	34.0 GB/s	7.2 GB/s	52.0 GB/s
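
The two efficiency rows can be reproduced from the Specifications table. The sketch below rests on an inference, not on anything the page states: the published figures are consistent with "FP32-eq" being half of the unlabelled TFLOPs row, except where that row simply repeats the FP32 value (as for the GeForce GT 710), in which case the FP32 value is used directly. The helper names `fp32_equivalent` and `efficiency` are hypothetical.

```python
def fp32_equivalent(tflops, fp32=None):
    """Return an FP32-equivalent throughput in TFLOPs.

    Assumption (inferred from the published ratios, not documented):
    halve the headline TFLOPs figure unless it is just the FP32 figure.
    """
    if fp32 is not None and tflops == fp32:
        return fp32
    return tflops / 2


# Values copied from the Specifications table.
specs = {
    "Gaudi 3": {"tflops": 1835, "fp32": None, "tdp_w": 900, "vram_gb": 128, "bw_gbs": 3700},
    "RTX 4500 Ada Generation": {"tflops": 97, "fp32": 48.5, "tdp_w": 210, "vram_gb": 24, "bw_gbs": 576},
    "GeForce RTX 5060 Ti": {"tflops": 88, "fp32": 25, "tdp_w": 220, "vram_gb": 16, "bw_gbs": 544},
    "GeForce GT 710": {"tflops": 0.366, "fp32": 0.366, "tdp_w": 19, "vram_gb": 2, "bw_gbs": 14.4},
    "RTX 4000": {"tflops": 57.6, "fp32": 7.1, "tdp_w": 160, "vram_gb": 8, "bw_gbs": 416},
}


def efficiency(spec):
    """Return (TFLOPs per watt, GB/s of bandwidth per GB of VRAM)."""
    eq = fp32_equivalent(spec["tflops"], spec["fp32"])
    return round(eq / spec["tdp_w"], 2), round(spec["bw_gbs"] / spec["vram_gb"], 1)


for name, spec in specs.items():
    per_watt, bw_per_gb = efficiency(spec)
    print(f"{name}: {per_watt} TFLOPs/W, {bw_per_gb} GB/s per GB")
```

With this rule the computed values agree with both rows of the table above, which is the only evidence for the halving assumption.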

Performance Equivalence

How many units of each XPU are needed to match the performance of the others?
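
Each equivalence figure appears to be a plain ratio of the target's value to the candidate's value for the same metric. A minimal sketch, assuming that reading: the function names `units_needed` and `whole_units` are hypothetical, and the FP32-eq inputs (917.5 and 48.5) are inferred as half of the unlabelled TFLOPs row in the Specifications table, not values the page publishes directly.

```python
import math


def units_needed(target_value, candidate_value):
    """Fractional number of candidate units that matches one target unit."""
    return target_value / candidate_value


def whole_units(target_value, candidate_value):
    """Same ratio rounded up, since partial cards cannot be deployed."""
    return math.ceil(units_needed(target_value, candidate_value))


# One Intel Habana Gaudi 3 expressed in NVIDIA RTX 4500 Ada Generation units.
compute = units_needed(917.5, 48.5)   # FP32-eq TFLOPs (inferred values)
vram = units_needed(128, 24)          # GB
bandwidth = units_needed(3700, 576)   # GB/s

print(f"Compute (FP32-eq): {compute:.2f}x")
print(f"VRAM: {vram:.2f}x")
print(f"Memory Bandwidth: {bandwidth:.2f}x")
print(f"Whole cards for compute parity: {whole_units(917.5, 48.5)}")
```

The ratio treats units as perfectly additive. Real multi-device setups lose some throughput to interconnect and scheduling overhead, so these figures are best read as lower bounds.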

To match 1x Intel Habana Gaudi 3

XPU (units needed)	Compute (FP32-eq)	VRAM	Memory Bandwidth
NVIDIA RTX 4500 Ada Generation	18.92x	5.33x	6.42x
NVIDIA GeForce RTX 5060 Ti	20.85x	8.00x	6.80x
NVIDIA GeForce GT 710	2506.83x	64.00x	256.94x
NVIDIA RTX 4000	31.86x	16.00x	8.89x

To match 1x NVIDIA RTX 4500 Ada Generation

XPU (units needed)	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	0.05x (18.92x faster)	—	0.19x (5.33x more)	0.16x (6.42x more)
NVIDIA GeForce RTX 5060 Ti	1.10x	1.94x	1.50x	1.06x
NVIDIA GeForce GT 710	132.51x	132.51x	12.00x	40.00x
NVIDIA RTX 4000	1.68x	6.83x	3.00x	1.38x

To match 1x NVIDIA GeForce RTX 5060 Ti

XPU (units needed)	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	0.05x (20.85x faster)	—	0.13x (8.00x more)	0.15x (6.80x more)
NVIDIA RTX 4500 Ada Generation	0.91x (1.10x faster)	0.52x (1.94x faster)	0.67x (1.50x more)	0.94x (1.06x more)
NVIDIA GeForce GT 710	120.22x	68.31x	8.00x	37.78x
NVIDIA RTX 4000	1.53x	3.52x	2.00x	1.31x

To match 1x NVIDIA GeForce GT 710

XPU (units needed)	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	<0.01x (2506.83x faster)	—	0.02x (64.00x more)	<0.01x (256.94x more)
NVIDIA RTX 4500 Ada Generation	0.01x (132.51x faster)	0.01x (132.51x faster)	0.08x (12.00x more)	0.03x (40.00x more)
NVIDIA GeForce RTX 5060 Ti	0.01x (120.22x faster)	0.01x (68.31x faster)	0.13x (8.00x more)	0.03x (37.78x more)
NVIDIA RTX 4000	0.01x (78.69x faster)	0.05x (19.40x faster)	0.25x (4.00x more)	0.03x (28.89x more)

To match 1x NVIDIA RTX 4000

XPU (units needed)	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	0.03x (31.86x faster)	—	0.06x (16.00x more)	0.11x (8.89x more)
NVIDIA RTX 4500 Ada Generation	0.59x (1.68x faster)	0.15x (6.83x faster)	0.33x (3.00x more)	0.72x (1.38x more)
NVIDIA GeForce RTX 5060 Ti	0.65x (1.53x faster)	0.28x (3.52x faster)	0.50x (2.00x more)	0.76x (1.31x more)
NVIDIA GeForce GT 710	78.69x	19.40x	4.00x	28.89x

Pricing

Price Type	Gaudi 3	RTX 4500 Ada Generation	GeForce RTX 5060 Ti	GeForce GT 710	RTX 4000
CAPEX (Street Price)	$15,000	—	—	—	—
OPEX (per hour)	$1.20/hr	—	$0.09/hr	$0.07/hr	$0.34/hr
Price per TFLOPs (FP32-eq)	$16	—	—	—	—
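
The Price per TFLOPs row is consistent with dividing the street price by the FP32-equivalent throughput. A short sketch under that assumption: `price_per_tflop` is a hypothetical helper, and 917.5 TFLOPs is the Gaudi 3 FP32-eq figure inferred as half of its 1,835 TFLOPs entry, not a value the page lists.

```python
def price_per_tflop(capex_usd, fp32_eq_tflops):
    """Street price divided by FP32-equivalent TFLOPs.

    Returns None when no street price is listed (shown as a dash in the table).
    """
    if capex_usd is None:
        return None
    return capex_usd / fp32_eq_tflops


gaudi3 = price_per_tflop(15_000, 917.5)   # inferred FP32-eq throughput
rtx4000 = price_per_tflop(None, 28.8)     # no CAPEX listed for this card

print(f"Gaudi 3: ${gaudi3:.0f} per TFLOP")
print(f"RTX 4000: {rtx4000}")
```

Only Gaudi 3 has a listed street price, so it is the only card for which this figure can be computed from the table.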