Compare XPUs

Select up to 5 XPUs to compare side-by-side

Select XPUs to Compare


Showing 128 XPUs • 7 selected

Vendor	Model	TFLOPs
Alibaba	Hanguang 800	n/a
AMD	MI100	23.1
AMD	MI210	181
AMD	MI250X	383
AMD	MI300X	1,307
AMD	MI325X	1,400
AMD	MI350X	2,100
AMD	MI355X	5,300
AMD	Radeon Pro V520	23.04
AWS	Inferentia2	190
AWS	Trainium	190
AWS	Trainium2	680
Baidu	Kunlun II	n/a
Biren Technology	BR100	n/a
Cambricon	MLU370	256
Cerebras	WSE-3	n/a
Enflame Technology	CloudBlazer T20	n/a
FuriosaAI	RNGD (Renegade)	256
FuriosaAI	Warboy	n/a
Google	TPU v4	275
Google	TPU v5e	197
Google	TPU v5p	459
Graphcore	Bow IPU	n/a
Graphcore	IPU-M2000	n/a
Groq	LPU Inference Engine	n/a
Huawei	Ascend 910B	n/a
Iluvatar CoreX	BI-V150	300
Intel	Data Center GPU Max 1100	177
Intel	Data Center GPU Max 1550	419
Intel Habana	Gaudi 2	432
Intel Habana	Gaudi 3	1,835

Multi-Metric Comparison

Relative performance across 5 key metrics (normalized to 100 = best in comparison)

[Chart axes: Compute Performance (BF16), Memory Capacity, Power Consumption, Power Efficiency]
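
The scaling rule stated for the chart, 100 = best in comparison, can be sketched as follows. This is a minimal illustration using the VRAM and TDP rows of the Specifications table; the name `normalize_to_best` is hypothetical, and scoring lower power consumption as better is an assumption the page does not state.

```python
def normalize_to_best(values, higher_is_better=True):
    """Scale a metric so the best device in the comparison scores 100."""
    if higher_is_better:
        best = max(values.values())
        return {name: 100 * v / best for name, v in values.items()}
    # Assumption: for "lower is better" metrics the smallest value scores 100.
    best = min(values.values())
    return {name: 100 * best / v for name, v in values.items()}


# VRAM (GB) and TDP (W) copied from the Specifications table.
vram_gb = {"Gaudi 3": 128, "GeForce GT 710": 2, "Quadro RTX 6000": 24,
           "T1000": 8, "GeForce RTX 2070": 8, "GeForce RTX 3090": 24, "P100": 16}
tdp_w = {"Gaudi 3": 900, "GeForce GT 710": 19, "Quadro RTX 6000": 295,
         "T1000": 50, "GeForce RTX 2070": 175, "GeForce RTX 3090": 350, "P100": 300}

memory_score = normalize_to_best(vram_gb)
power_score = normalize_to_best(tdp_w, higher_is_better=False)

print(memory_score["Quadro RTX 6000"])  # 24 GB relative to the 128 GB best
print(power_score["GeForce GT 710"])    # lowest TDP in this comparison
```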

Specifications

Specification	Intel Habana Gaudi 3	NVIDIA GeForce GT 710	NVIDIA Quadro RTX 6000	NVIDIA T1000	NVIDIA GeForce RTX 2070	NVIDIA GeForce RTX 3090	NVIDIA P100
Architecture	Gaudi Gen3	Kepler	Turing	Turing	Turing	Ampere	Pascal
Form Factor	OAM	PCIe	PCIe	PCIe	PCIe	PCIe	SXM
VRAM	128 GB	2 GB	24 GB	8 GB	8 GB	24 GB	16 GB
Memory Bandwidth	3,700 GB/s	14.4 GB/s	672 GB/s	160 GB/s	448 GB/s	936 GB/s	732 GB/s
TFLOPs (FP32)	—	0.366	16.3	2.5	7.5	35.6	9.3
TFLOPs (FP16)	—	—	—	—	—	—	—
TFLOPs (peak, precision varies by device)	1,835	0.366	130.5	2.5	7.5	71	9.3
TFLOPs (FP8)	3,670	—	—	—	—	—	—
TDP	900 W	19 W	295 W	50 W	175 W	350 W	300 W
Launch Date	Apr 2024	Mar 2014	Aug 2018	May 2019	Oct 2018	Sep 2020	Apr 2016
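
The later sections use a "Compute (FP32-eq)" figure that the page does not define. The published ratios are consistent with the rule sketched below: take the unlabeled TFLOPs row as-is when it equals the FP32 row, and halve it otherwise. This rule is inferred from the numbers, not documented by the source, and `fp32_equivalent` is a hypothetical helper name.

```python
def fp32_equivalent(peak_tflops, fp32_tflops=None):
    """Inferred rule (an assumption, not the site's documented method)."""
    if fp32_tflops is not None and peak_tflops == fp32_tflops:
        # The headline figure is already an FP32 number.
        return peak_tflops
    # Otherwise treat the headline figure as reduced precision and halve it.
    return peak_tflops / 2


# (unlabeled TFLOPs row, TFLOPs (FP32) row) from the Specifications table.
specs = {
    "Gaudi 3": (1835, None),
    "GeForce GT 710": (0.366, 0.366),
    "Quadro RTX 6000": (130.5, 16.3),
    "T1000": (2.5, 2.5),
    "GeForce RTX 2070": (7.5, 7.5),
    "GeForce RTX 3090": (71, 35.6),
    "P100": (9.3, 9.3),
}

fp32_eq = {name: fp32_equivalent(peak, fp32) for name, (peak, fp32) in specs.items()}
for name, value in fp32_eq.items():
    print(f"{name}: {value} TFLOPs (FP32-eq)")
```

Under this rule Gaudi 3 comes out at 917.5 and the Quadro RTX 6000 at 65.25, and 917.5 / 65.25 ≈ 14.06 matches the compute ratio reported for that pair under Performance Equivalence.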

Efficiency Metrics

Metric	Gaudi 3	GeForce GT 710	Quadro RTX 6000	T1000	GeForce RTX 2070	GeForce RTX 3090	P100
TFLOPs per Watt (FP32-eq)	1.02	0.02	0.22	0.05	0.04	0.10	0.03
Memory Bandwidth per GB	28.9 GB/s	7.2 GB/s	28.0 GB/s	20.0 GB/s	56.0 GB/s	39.0 GB/s	45.8 GB/s
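
Both rows above are plain ratios of Specifications values. A minimal sketch follows, assuming the FP32-eq compute figure for Gaudi 3 is half of its listed 1,835 TFLOPs (917.5), an inference that reproduces the 1.02 in the table; the function names are illustrative only.

```python
def tflops_per_watt(fp32_eq_tflops, tdp_watts):
    """Compute throughput delivered per watt of TDP."""
    return fp32_eq_tflops / tdp_watts


def bandwidth_per_gb(bandwidth_gb_s, vram_gb):
    """Memory bandwidth available per GB of on-board memory."""
    return bandwidth_gb_s / vram_gb


# Gaudi 3: 917.5 TFLOPs (assumed FP32-eq), 900 W, 3,700 GB/s, 128 GB.
gaudi3_eff = tflops_per_watt(917.5, 900)
gaudi3_bw = bandwidth_per_gb(3700, 128)

# T1000: 2.5 TFLOPs (FP32), 50 W, 160 GB/s, 8 GB.
t1000_eff = tflops_per_watt(2.5, 50)
t1000_bw = bandwidth_per_gb(160, 8)

print(f"Gaudi 3: {gaudi3_eff:.2f} TFLOPs/W, {gaudi3_bw:.1f} GB/s per GB")
print(f"T1000:   {t1000_eff:.2f} TFLOPs/W, {t1000_bw:.1f} GB/s per GB")
```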

Performance Equivalence

How many units of each GPU are needed to match the performance of the others?
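
Each figure in this section is the target's value divided by the candidate's value for the same metric. The sketch below shows the arithmetic for VRAM, using numbers from the Specifications table; `units_needed` is a hypothetical name, and rounding up to whole cards is an added illustration that ignores any multi-device scaling overhead.

```python
import math


def units_needed(target_value, candidate_value):
    """How many candidate units it takes to match one target unit."""
    return target_value / candidate_value


# VRAM in GB, from the Specifications table.
gaudi3_vram, rtx6000_vram, gt710_vram = 128, 24, 2

# Matching one Gaudi 3 with Quadro RTX 6000 cards.
ratio = units_needed(gaudi3_vram, rtx6000_vram)
print(f"Need {ratio:.2f}x Quadro RTX 6000")
print(f"Whole cards required: {math.ceil(ratio)}")

# A ratio below 1 means one candidate unit already exceeds the target;
# the page then reports the inverse as "has N x more".
inverse = units_needed(gt710_vram, rtx6000_vram)
print(f"{inverse:.2f}x, Quadro RTX 6000 has {1 / inverse:.2f}x more")
```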

To match 1x Intel Habana Gaudi 3

Units needed of	Compute (FP32-eq)	VRAM	Memory Bandwidth
NVIDIA GeForce GT 710	2506.83x	64.00x	256.94x
NVIDIA Quadro RTX 6000	14.06x	5.33x	5.51x
NVIDIA T1000	367.00x	16.00x	23.13x
NVIDIA GeForce RTX 2070	122.33x	16.00x	8.26x
NVIDIA GeForce RTX 3090	25.85x	5.33x	3.95x
NVIDIA P100	98.66x	8.00x	5.05x

To match 1x NVIDIA GeForce GT 710

Units needed of	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	<0.01x (2506.83x faster)	n/a	0.02x (64.00x more)	<0.01x (256.94x more)
NVIDIA Quadro RTX 6000	0.01x (178.28x faster)	0.02x (44.54x faster)	0.08x (12.00x more)	0.02x (46.67x more)
NVIDIA T1000	0.15x (6.83x faster)	0.15x (6.83x faster)	0.25x (4.00x more)	0.09x (11.11x more)
NVIDIA GeForce RTX 2070	0.05x (20.49x faster)	0.05x (20.49x faster)	0.25x (4.00x more)	0.03x (31.11x more)
NVIDIA GeForce RTX 3090	0.01x (96.99x faster)	0.01x (97.27x faster)	0.08x (12.00x more)	0.02x (65.00x more)
NVIDIA P100	0.04x (25.41x faster)	0.04x (25.41x faster)	0.13x (8.00x more)	0.02x (50.83x more)

To match 1x NVIDIA Quadro RTX 6000

Units needed of	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	0.07x (14.06x faster)	n/a	0.19x (5.33x more)	0.18x (5.51x more)
NVIDIA GeForce GT 710	178.28x	44.54x	12.00x	46.67x
NVIDIA T1000	26.10x	6.52x	3.00x	4.20x
NVIDIA GeForce RTX 2070	8.70x	2.17x	3.00x	1.50x
NVIDIA GeForce RTX 3090	1.84x	0.46x (2.18x faster)	1.00x (equal)	0.72x (1.39x more)
NVIDIA P100	7.02x	1.75x	1.50x	0.92x (1.09x more)

To match 1x NVIDIA T1000

Units needed of	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	<0.01x (367.00x faster)	n/a	0.06x (16.00x more)	0.04x (23.13x more)
NVIDIA GeForce GT 710	6.83x	6.83x	4.00x	11.11x
NVIDIA Quadro RTX 6000	0.04x (26.10x faster)	0.15x (6.52x faster)	0.33x (3.00x more)	0.24x (4.20x more)
NVIDIA GeForce RTX 2070	0.33x (3.00x faster)	0.33x (3.00x faster)	1.00x (equal)	0.36x (2.80x more)
NVIDIA GeForce RTX 3090	0.07x (14.20x faster)	0.07x (14.24x faster)	0.33x (3.00x more)	0.17x (5.85x more)
NVIDIA P100	0.27x (3.72x faster)	0.27x (3.72x faster)	0.50x (2.00x more)	0.22x (4.58x more)

To match 1x NVIDIA GeForce RTX 2070

Units needed of	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	0.01x (122.33x faster)	n/a	0.06x (16.00x more)	0.12x (8.26x more)
NVIDIA GeForce GT 710	20.49x	20.49x	4.00x	31.11x
NVIDIA Quadro RTX 6000	0.11x (8.70x faster)	0.46x (2.17x faster)	0.33x (3.00x more)	0.67x (1.50x more)
NVIDIA T1000	3.00x	3.00x	1.00x (equal)	2.80x
NVIDIA GeForce RTX 3090	0.21x (4.73x faster)	0.21x (4.75x faster)	0.33x (3.00x more)	0.48x (2.09x more)
NVIDIA P100	0.81x (1.24x faster)	0.81x (1.24x faster)	0.50x (2.00x more)	0.61x (1.63x more)

To match 1x NVIDIA GeForce RTX 3090

Units needed of	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	0.04x (25.85x faster)	n/a	0.19x (5.33x more)	0.25x (3.95x more)
NVIDIA GeForce GT 710	96.99x	97.27x	12.00x	65.00x
NVIDIA Quadro RTX 6000	0.54x (1.84x faster)	2.18x	1.00x (equal)	1.39x
NVIDIA T1000	14.20x	14.24x	3.00x	5.85x
NVIDIA GeForce RTX 2070	4.73x	4.75x	3.00x	2.09x
NVIDIA P100	3.82x	3.83x	1.50x	1.28x

To match 1x NVIDIA P100

Units needed of	Compute (FP32-eq)	FP32 Compute	VRAM	Memory Bandwidth
Intel Habana Gaudi 3	0.01x (98.66x faster)	n/a	0.13x (8.00x more)	0.20x (5.05x more)
NVIDIA GeForce GT 710	25.41x	25.41x	8.00x	50.83x
NVIDIA Quadro RTX 6000	0.14x (7.02x faster)	0.57x (1.75x faster)	0.67x (1.50x more)	1.09x
NVIDIA T1000	3.72x	3.72x	2.00x	4.58x
NVIDIA GeForce RTX 2070	1.24x	1.24x	2.00x	1.63x
NVIDIA GeForce RTX 3090	0.26x (3.82x faster)	0.26x (3.83x faster)	0.67x (1.50x more)	0.78x (1.28x more)

Pricing

Price Type	Gaudi 3	GeForce GT 710	Quadro RTX 6000	T1000	GeForce RTX 2070	GeForce RTX 3090	P100
CAPEX (Street Price)	$15,000	—	—	—	—	—	—
OPEX (per hour)	$1.20/hr	$0.07/hr	$0.50/hr	$0.17/hr	$0.04/hr	$0.11/hr	$0.28/hr
Price per TFLOPs (FP32-eq)	$16	—	—	—	—	—	—
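
The Price per TFLOPs entry is the street price divided by the FP32-eq compute figure. The sketch below assumes that figure is 917.5 TFLOPs for Gaudi 3 (half of the listed 1,835, an inference that reproduces the $16 shown). The break-even calculation is an added illustration, not something the page reports, and it ignores power, hosting, and utilization.

```python
def price_per_tflop(street_price_usd, fp32_eq_tflops):
    """Purchase cost per TFLOP of FP32-equivalent compute."""
    return street_price_usd / fp32_eq_tflops


def break_even_hours(street_price_usd, hourly_rate_usd):
    """Rental hours whose total cost equals the purchase price."""
    return street_price_usd / hourly_rate_usd


# Gaudi 3: $15,000 street price, $1.20/hr rental, 917.5 TFLOPs (assumed FP32-eq).
gaudi3_price = price_per_tflop(15_000, 917.5)
gaudi3_hours = break_even_hours(15_000, 1.20)

print(f"${gaudi3_price:.0f} per TFLOP (FP32-eq)")
print(f"{gaudi3_hours:,.0f} rental hours cost as much as buying the card")
```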