Compare XPUs

Select up to 5 XPUs to compare side-by-side

Select XPUs to Compare

Showing 128 XPUs • 8 selected

Vendor	XPU	TFLOPs
Alibaba	Hanguang 800	—
AMD	MI100	23.1
AMD	MI210	181
AMD	MI250X	383
AMD	MI300X	1,307
AMD	MI325X	1,400
AMD	MI350X	2,100
AMD	MI355X	5,300
AMD	Radeon Pro V520	23.04
AWS	Inferentia2	190
AWS	Trainium	190
AWS	Trainium2	680
Baidu	Kunlun II	—
Biren Technology	BR100	—
Cambricon	MLU370	256
Cerebras	WSE-3	—
Enflame Technology	CloudBlazer T20	—
FuriosaAI	RNGD (Renegade)	256
FuriosaAI	Warboy	—
Google	TPU v4	275
Google	TPU v5e	197
Google	TPU v5p	459
Graphcore	Bow IPU	—
Graphcore	IPU-M2000	—
Groq	LPU Inference Engine	—
Huawei	Ascend 910B	—
Iluvatar CoreX	BI-V150	300
Intel	Data Center GPU Max 1100	177
Intel	Data Center GPU Max 1550	419
Intel Habana	Gaudi 2	432
Intel Habana	Gaudi 3	1,835

Multi-Metric Comparison

Relative performance across 5 key metrics (normalized to 100 = best in comparison)

Chart axes: Compute Performance (BF16), Memory Capacity, Power Consumption, Power Efficiency
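
The "normalized to 100 = best in comparison" scaling can be sketched as below. This is a minimal illustration rather than the page's actual implementation: the function name `normalize_to_best` and the `lower_is_better` flag are hypothetical, and the page does not say how it treats a metric such as Power Consumption, where a smaller value may count as better. The VRAM figures are taken from the Specifications table.

```python
def normalize_to_best(values, lower_is_better=False):
    """Scale each value so the best one in the comparison scores 100.

    `values` maps a device name to a number, or to None when the
    specification is not published. None entries stay None.
    """
    known = [v for v in values.values() if v is not None]
    if not known:
        return {name: None for name in values}

    if lower_is_better:
        # Assumption: the smallest value is "best" and scores 100.
        best = min(known)
        return {
            name: None if v is None else 100.0 * best / v
            for name, v in values.items()
        }

    best = max(known)
    return {
        name: None if v is None else 100.0 * v / best
        for name, v in values.items()
    }


# VRAM in GB for three of the compared devices.
vram_gb = {"Wormhole": 24, "TPU v5p": 95, "A100 SXM": 80}
scores = normalize_to_best(vram_gb)
for name, score in scores.items():
    print(f"{name}: {score:.1f}")
```

With this scaling, the device with the largest value always lands on 100 and every other device is expressed as a percentage of it.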

Specifications

Specification	Tenstorrent Wormhole	Enflame Technology CloudBlazer T20	Google TPU v5p	NVIDIA RTX 5000 Ada Generation	NVIDIA T4G	NVIDIA Quadro P6000	NVIDIA GeForce RTX 3090	NVIDIA A100 SXM
Architecture	Tensix Core	GCU (AI Graphics Unit)	TPU v5	Ada Lovelace	Turing	Pascal	Ampere	Ampere
Form Factor	—	—	Mezzanine	PCIe	PCIe	PCIe	PCIe	SXM
VRAM	24 GB	32 GB	95 GB	32 GB	16 GB	24 GB	24 GB	80 GB
Memory Bandwidth	—	—	—	576 GB/s	320 GB/s	432 GB/s	936 GB/s	2,039 GB/s
TFLOPs (FP32)	—	—	—	65.3	8.1	12.63	35.6	19.5
TFLOPs (FP16)	—	—	—	—	—	—	—	312
TFLOPs	364	—	459	130	65	12.63	71	312
TFLOPs (FP8)	—	—	—	—	—	—	—	—
TDP	160 W	250 W	400 W	250 W	70 W	250 W	350 W	400 W
Launch Date	Oct 2023	Nov 2021	Aug 2023	Mar 2023	May 2020	Oct 2016	Sep 2020	May 2020

Efficiency Metrics

Metric	Wormhole	CloudBlazer T20	TPU v5p	RTX 5000 Ada Generation	T4G	Quadro P6000	GeForce RTX 3090	A100 SXM
TFLOPs per Watt (FP32-eq)	1.14	—	0.57	0.26	0.46	0.05	0.10	0.39
Memory Bandwidth per GB of VRAM	—	—	—	18.0 GB/s	20.0 GB/s	18.0 GB/s	39.0 GB/s	25.5 GB/s
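
The page does not state how the "FP32-eq" figure is derived. The published numbers are consistent with halving the headline TFLOPs value when it is a 16-bit figure and using it unchanged when it is already FP32 (as with the Quadro P6000). That halving rule is an inference from the tables, not a documented formula, and the function names below are hypothetical. A minimal sketch under that assumption:

```python
def fp32_equivalent(tflops, is_16bit=True):
    """Assumed conversion: a 16-bit TFLOPs figure counts as half in FP32 terms."""
    return tflops / 2 if is_16bit else tflops


def tflops_per_watt(tflops, tdp_watts, is_16bit=True):
    """FP32-equivalent TFLOPs divided by TDP in watts."""
    return fp32_equivalent(tflops, is_16bit) / tdp_watts


def bandwidth_per_gb(bandwidth_gb_s, vram_gb):
    """Memory bandwidth (GB/s) available per GB of VRAM."""
    return bandwidth_gb_s / vram_gb


# A100 SXM: 312 TFLOPs headline, 400 W TDP, 2,039 GB/s, 80 GB.
a100_eff = round(tflops_per_watt(312, 400), 2)
a100_bw = round(bandwidth_per_gb(2039, 80), 1)

# Quadro P6000: the headline 12.63 TFLOPs is already FP32, 250 W TDP.
p6000_eff = round(tflops_per_watt(12.63, 250, is_16bit=False), 2)

print(a100_eff, a100_bw, p6000_eff)
```

Under these assumptions the A100 SXM and Quadro P6000 results agree with the values in the Efficiency Metrics table.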

Performance Equivalence

How many units of each XPU are needed to match the performance of the others?
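
Each figure in this section appears to be a simple ratio: the target device's value divided by the candidate's value. A ratio above 1 means more than one candidate unit is needed; a ratio below 1 means a single candidate already exceeds the target, and its advantage is the reciprocal. The sketch below illustrates that reading. The function names are hypothetical, and the FP32-eq inputs assume the headline 16-bit TFLOPs figure is halved (for example 364 / 2 = 182 for the Wormhole), which is an inference from the tables rather than a documented rule.

```python
def units_to_match(target_value, candidate_value):
    """Units of the candidate needed to equal one unit of the target."""
    return target_value / candidate_value


def describe(candidate, ratio, metric="compute"):
    """Turn a ratio into a short label similar to the ones on this page."""
    if ratio > 1:
        return f"Need {ratio:.2f}x {candidate}"
    if ratio == 1:
        return f"{candidate} matches 1:1"
    advantage = 1 / ratio  # how far the candidate exceeds the target
    if metric == "compute":
        return f"{candidate} is {advantage:.2f}x faster"
    return f"{candidate} has {advantage:.2f}x more"


# To match 1x Tenstorrent Wormhole (182 FP32-eq TFLOPs, 24 GB VRAM):
print(describe("RTX 5000 Ada Generation", units_to_match(182.0, 65.0)))
print(describe("TPU v5p", units_to_match(182.0, 229.5)))
print(describe("A100 SXM", units_to_match(24, 80), metric="vram"))
```

Because the ratio is per metric, a device can need several units on one axis (compute) while exceeding the target on another (VRAM).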

To match 1x Tenstorrent Wormhole

Enflame Technology CloudBlazer T20

VRAM

0.75x

CloudBlazer T20 has 1.33x more

Google TPU v5p

Compute (FP32-eq)

0.79x

TPU v5p is 1.26x faster

VRAM

0.25x

TPU v5p has 3.96x more

NVIDIA RTX 5000 Ada Generation

Compute (FP32-eq)

2.80x

Need 2.80x RTX 5000 Ada Generation

VRAM

0.75x

RTX 5000 Ada Generation has 1.33x more

NVIDIA T4G

Compute (FP32-eq)

5.60x

Need 5.60x T4G

VRAM

1.50x

Need 1.50x T4G

NVIDIA Quadro P6000

Compute (FP32-eq)

14.41x

Need 14.41x Quadro P6000

VRAM

1.00x

Quadro P6000 has the same amount

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

5.13x

Need 5.13x GeForce RTX 3090

VRAM

1.00x

GeForce RTX 3090 has the same amount

NVIDIA A100 SXM

Compute (FP32-eq)

1.17x

Need 1.17x A100 SXM

VRAM

0.30x

A100 SXM has 3.33x more

To match 1x Enflame Technology CloudBlazer T20

Tenstorrent Wormhole

VRAM

1.33x

Need 1.33x Wormhole

Google TPU v5p

VRAM

0.34x

TPU v5p has 2.97x more

NVIDIA RTX 5000 Ada Generation

VRAM

1.00x

RTX 5000 Ada Generation has the same amount

NVIDIA T4G

VRAM

2.00x

Need 2.00x T4G

NVIDIA Quadro P6000

VRAM

1.33x

Need 1.33x Quadro P6000

NVIDIA GeForce RTX 3090

VRAM

1.33x

Need 1.33x GeForce RTX 3090

NVIDIA A100 SXM

VRAM

0.40x

A100 SXM has 2.50x more

To match 1x Google TPU v5p

Tenstorrent Wormhole

Compute (FP32-eq)

1.26x

Need 1.26x Wormhole

VRAM

3.96x

Need 3.96x Wormhole

Enflame Technology CloudBlazer T20

VRAM

2.97x

Need 2.97x CloudBlazer T20

NVIDIA RTX 5000 Ada Generation

Compute (FP32-eq)

3.53x

Need 3.53x RTX 5000 Ada Generation

VRAM

2.97x

Need 2.97x RTX 5000 Ada Generation

NVIDIA T4G

Compute (FP32-eq)

7.06x

Need 7.06x T4G

VRAM

5.94x

Need 5.94x T4G

NVIDIA Quadro P6000

Compute (FP32-eq)

18.17x

Need 18.17x Quadro P6000

VRAM

3.96x

Need 3.96x Quadro P6000

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

6.46x

Need 6.46x GeForce RTX 3090

VRAM

3.96x

Need 3.96x GeForce RTX 3090

NVIDIA A100 SXM

Compute (FP32-eq)

1.47x

Need 1.47x A100 SXM

VRAM

1.19x

Need 1.19x A100 SXM

To match 1x NVIDIA RTX 5000 Ada Generation

Tenstorrent Wormhole

Compute (FP32-eq)

0.36x

Wormhole is 2.80x faster

VRAM

1.33x

Need 1.33x Wormhole

Enflame Technology CloudBlazer T20

VRAM

1.00x

CloudBlazer T20 has the same amount

Google TPU v5p

Compute (FP32-eq)

0.28x

TPU v5p is 3.53x faster

VRAM

0.34x

TPU v5p has 2.97x more

NVIDIA T4G

Compute (FP32-eq)

2.00x

Need 2.00x T4G

FP32 Compute

8.06x

Need 8.06x T4G

VRAM

2.00x

Need 2.00x T4G

Memory Bandwidth

1.80x

Need 1.80x T4G

NVIDIA Quadro P6000

Compute (FP32-eq)

5.15x

Need 5.15x Quadro P6000

FP32 Compute

5.17x

Need 5.17x Quadro P6000

VRAM

1.33x

Need 1.33x Quadro P6000

Memory Bandwidth

1.33x

Need 1.33x Quadro P6000

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

1.83x

Need 1.83x GeForce RTX 3090

FP32 Compute

1.83x

Need 1.83x GeForce RTX 3090

VRAM

1.33x

Need 1.33x GeForce RTX 3090

Memory Bandwidth

0.62x

GeForce RTX 3090 has 1.63x more

NVIDIA A100 SXM

Compute (FP32-eq)

0.42x

A100 SXM is 2.40x faster

FP32 Compute

3.35x

Need 3.35x A100 SXM

VRAM

0.40x

A100 SXM has 2.50x more

Memory Bandwidth

0.28x

A100 SXM has 3.54x more

To match 1x NVIDIA T4G

Tenstorrent Wormhole

Compute (FP32-eq)

0.18x

Wormhole is 5.60x faster

VRAM

0.67x

Wormhole has 1.50x more

Enflame Technology CloudBlazer T20

VRAM

0.50x

CloudBlazer T20 has 2.00x more

Google TPU v5p

Compute (FP32-eq)

0.14x

TPU v5p is 7.06x faster

VRAM

0.17x

TPU v5p has 5.94x more

NVIDIA RTX 5000 Ada Generation

Compute (FP32-eq)

0.50x

RTX 5000 Ada Generation is 2.00x faster

FP32 Compute

0.12x

RTX 5000 Ada Generation is 8.06x faster

VRAM

0.50x

RTX 5000 Ada Generation has 2.00x more

Memory Bandwidth

0.56x

RTX 5000 Ada Generation has 1.80x more

NVIDIA Quadro P6000

Compute (FP32-eq)

2.57x

Need 2.57x Quadro P6000

FP32 Compute

0.64x

Quadro P6000 is 1.56x faster

VRAM

0.67x

Quadro P6000 has 1.50x more

Memory Bandwidth

0.74x

Quadro P6000 has 1.35x more

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

0.92x

GeForce RTX 3090 is 1.09x faster

FP32 Compute

0.23x

GeForce RTX 3090 is 4.40x faster

VRAM

0.67x

GeForce RTX 3090 has 1.50x more

Memory Bandwidth

0.34x

GeForce RTX 3090 has 2.92x more

NVIDIA A100 SXM

Compute (FP32-eq)

0.21x

A100 SXM is 4.80x faster

FP32 Compute

0.42x

A100 SXM is 2.41x faster

VRAM

0.20x

A100 SXM has 5.00x more

Memory Bandwidth

0.16x

A100 SXM has 6.37x more

To match 1x NVIDIA Quadro P6000

Tenstorrent Wormhole

Compute (FP32-eq)

0.07x

Wormhole is 14.41x faster

VRAM

1.00x

Wormhole has the same amount

Enflame Technology CloudBlazer T20

VRAM

0.75x

CloudBlazer T20 has 1.33x more

Google TPU v5p

Compute (FP32-eq)

0.06x

TPU v5p is 18.17x faster

VRAM

0.25x

TPU v5p has 3.96x more

NVIDIA RTX 5000 Ada Generation

Compute (FP32-eq)

0.19x

RTX 5000 Ada Generation is 5.15x faster

FP32 Compute

0.19x

RTX 5000 Ada Generation is 5.17x faster

VRAM

0.75x

RTX 5000 Ada Generation has 1.33x more

Memory Bandwidth

0.75x

RTX 5000 Ada Generation has 1.33x more

NVIDIA T4G

Compute (FP32-eq)

0.39x

T4G is 2.57x faster

FP32 Compute

1.56x

Need 1.56x T4G

VRAM

1.50x

Need 1.50x T4G

Memory Bandwidth

1.35x

Need 1.35x T4G

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

0.36x

GeForce RTX 3090 is 2.81x faster

FP32 Compute

0.35x

GeForce RTX 3090 is 2.82x faster

VRAM

1.00x

GeForce RTX 3090 has the same amount

Memory Bandwidth

0.46x

GeForce RTX 3090 has 2.17x more

NVIDIA A100 SXM

Compute (FP32-eq)

0.08x

A100 SXM is 12.35x faster

FP32 Compute

0.65x

A100 SXM is 1.54x faster

VRAM

0.30x

A100 SXM has 3.33x more

Memory Bandwidth

0.21x

A100 SXM has 4.72x more

To match 1x NVIDIA GeForce RTX 3090

Tenstorrent Wormhole

Compute (FP32-eq)

0.20x

Wormhole is 5.13x faster

VRAM

1.00x

Wormhole has the same amount

Enflame Technology CloudBlazer T20

VRAM

0.75x

CloudBlazer T20 has 1.33x more

Google TPU v5p

Compute (FP32-eq)

0.15x

TPU v5p is 6.46x faster

VRAM

0.25x

TPU v5p has 3.96x more

NVIDIA RTX 5000 Ada Generation

Compute (FP32-eq)

0.55x

RTX 5000 Ada Generation is 1.83x faster

FP32 Compute

0.55x

RTX 5000 Ada Generation is 1.83x faster

VRAM

0.75x

RTX 5000 Ada Generation has 1.33x more

Memory Bandwidth

1.63x

Need 1.63x RTX 5000 Ada Generation

NVIDIA T4G

Compute (FP32-eq)

1.09x

Need 1.09x T4G

FP32 Compute

4.40x

Need 4.40x T4G

VRAM

1.50x

Need 1.50x T4G

Memory Bandwidth

2.92x

Need 2.92x T4G

NVIDIA Quadro P6000

Compute (FP32-eq)

2.81x

Need 2.81x Quadro P6000

FP32 Compute

2.82x

Need 2.82x Quadro P6000

VRAM

1.00x

Quadro P6000 has the same amount

Memory Bandwidth

2.17x

Need 2.17x Quadro P6000

NVIDIA A100 SXM

Compute (FP32-eq)

0.23x

A100 SXM is 4.39x faster

FP32 Compute

1.83x

Need 1.83x A100 SXM

VRAM

0.30x

A100 SXM has 3.33x more

Memory Bandwidth

0.46x

A100 SXM has 2.18x more

To match 1x NVIDIA A100 SXM

Tenstorrent Wormhole

Compute (FP32-eq)

0.86x

Wormhole is 1.17x faster

VRAM

3.33x

Need 3.33x Wormhole

Enflame Technology CloudBlazer T20

VRAM

2.50x

Need 2.50x CloudBlazer T20

Google TPU v5p

Compute (FP32-eq)

0.68x

TPU v5p is 1.47x faster

VRAM

0.84x

TPU v5p has 1.19x more

NVIDIA RTX 5000 Ada Generation

Compute (FP32-eq)

2.40x

Need 2.40x RTX 5000 Ada Generation

FP32 Compute

0.30x

RTX 5000 Ada Generation is 3.35x faster

VRAM

2.50x

Need 2.50x RTX 5000 Ada Generation

Memory Bandwidth

3.54x

Need 3.54x RTX 5000 Ada Generation

NVIDIA T4G

Compute (FP32-eq)

4.80x

Need 4.80x T4G

FP32 Compute

2.41x

Need 2.41x T4G

VRAM

5.00x

Need 5.00x T4G

Memory Bandwidth

6.37x

Need 6.37x T4G

NVIDIA Quadro P6000

Compute (FP32-eq)

12.35x

Need 12.35x Quadro P6000

FP32 Compute

1.54x

Need 1.54x Quadro P6000

VRAM

3.33x

Need 3.33x Quadro P6000

Memory Bandwidth

4.72x

Need 4.72x Quadro P6000

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

4.39x

Need 4.39x GeForce RTX 3090

FP32 Compute

0.55x

GeForce RTX 3090 is 1.83x faster

VRAM

3.33x

Need 3.33x GeForce RTX 3090

Memory Bandwidth

2.18x

Need 2.18x GeForce RTX 3090

Pricing

Price Type	Wormhole	CloudBlazer T20	TPU v5p	RTX 5000 Ada Generation	T4G	Quadro P6000	GeForce RTX 3090	A100 SXM
CAPEX (Street Price)	—	—	—	—	—	—	—	$15,000
OPEX (per hour)	—	—	$5.00/hr	—	$0.42/hr	$1.10/hr	$0.11/hr	$4.05/hr
Price per TFLOPs (FP32-eq)	—	—	—	—	—	—	—	$96
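
The one populated "Price per TFLOPs (FP32-eq)" cell is consistent with dividing the street price by the FP32-equivalent TFLOPs figure. The sketch below assumes that reading, and again assumes FP32-eq is the headline 16-bit TFLOPs value halved; neither rule is stated on the page, and `price_per_tflops` is a hypothetical name.

```python
def price_per_tflops(street_price_usd, headline_tflops, is_16bit=True):
    """Street price divided by FP32-equivalent TFLOPs (assumed: 16-bit / 2)."""
    fp32_eq = headline_tflops / 2 if is_16bit else headline_tflops
    return street_price_usd / fp32_eq


# A100 SXM: $15,000 street price, 312 headline TFLOPs.
a100_price = price_per_tflops(15_000, 312)
print(f"${a100_price:.0f} per TFLOPs")
```

The other devices have no street price listed, so the same calculation cannot be applied to them from this table alone.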