Compare XPUs

Select up to 5 XPUs to compare side-by-side

Showing 128 XPUs • 6 selected

Alibaba

Hanguang 800

AMD

MI100

23.1 TFLOPs

AMD

MI210

181 TFLOPs

AMD

MI250X

383 TFLOPs

AMD

MI300X

1,307 TFLOPs

AMD

MI325X

1,400 TFLOPs

AMD

MI350X

2,100 TFLOPs

AMD

MI355X

5,300 TFLOPs

AMD

Radeon Pro V520

23.04 TFLOPs

AWS

Inferentia2

190 TFLOPs

AWS

Trainium

190 TFLOPs

AWS

Trainium2

680 TFLOPs

Baidu

Kunlun II

Biren Technology

BR100

Cambricon

MLU370

256 TFLOPs

Cerebras

WSE-3

Enflame Technology

CloudBlazer T20

FuriosaAI

RNGD (Renegade)

256 TFLOPs

FuriosaAI

Warboy

Google

TPU v4

275 TFLOPs

Google

TPU v5e

197 TFLOPs

Google

TPU v5p

459 TFLOPs

Graphcore

Bow IPU

Graphcore

IPU-M2000

Groq

LPU Inference Engine

Huawei

Ascend 910B

Iluvatar CoreX

BI-V150

300 TFLOPs

Intel

Data Center GPU Max 1100

177 TFLOPs

Intel

Data Center GPU Max 1550

419 TFLOPs

Intel Habana

Gaudi 2

432 TFLOPs

Intel Habana

Gaudi 3

1,835 TFLOPs

Multi-Metric Comparison

Relative performance across 5 key metrics (normalized to 100 = best in comparison)

[Chart omitted. Metrics listed: Compute Performance (BF16), Memory Capacity, Power Consumption, Power Efficiency]
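The "normalized to 100 = best in comparison" scaling can be sketched as follows. This is a minimal illustration, not the page's actual implementation: the function name `normalize_to_best` is hypothetical, and it assumes that higher is better for the metric (the page does not say how it treats Power Consumption, where lower is presumably better). The VRAM figures are taken from the Specifications table.

```python
def normalize_to_best(values):
    """Scale each value so the largest one in the comparison becomes 100.

    Missing specs are passed as None and stay None in the result.
    Assumes higher is better; this is an assumption, not stated by the page.
    """
    present = [v for v in values.values() if v is not None]
    if not present or max(present) <= 0:
        return {name: None for name in values}
    best = max(present)
    return {
        name: (None if v is None else round(100 * v / best, 1))
        for name, v in values.items()
    }


# VRAM in GB, from the Specifications table.
vram_gb = {
    "Kunlun II": 32,
    "H100 SXM": 80,
    "RTX 6000 Ada Generation": 48,
    "GeForce GT 710": 2,
    "GeForce RTX 3090": 24,
    "GH200": 96,
}

scores = normalize_to_best(vram_gb)
print(scores["GH200"])                    # 100.0
print(scores["RTX 6000 Ada Generation"])  # 50.0
```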

Specifications

Specification	Baidu Kunlun II	NVIDIA H100 SXM	NVIDIA RTX 6000 Ada Generation	NVIDIA GeForce GT 710	NVIDIA GeForce RTX 3090	NVIDIA GH200
Architecture	Kunlun Core	Hopper	Ada Lovelace	Kepler	Ampere	Hopper
Form Factor	—	SXM	PCIe	PCIe	PCIe	SXM
VRAM	32 GB	80 GB	48 GB	2 GB	24 GB	96 GB
Memory Bandwidth	—	3,350 GB/s	960 GB/s	14.4 GB/s	936 GB/s	4,000 GB/s
TFLOPs (FP32)	—	67	91.1	0.366	35.6	67
TFLOPs (FP16)	256	1,979	—	—	—	—
TFLOPs	—	1,979	182.5	0.366	71	989
TFLOPs (FP8)	—	3,958	—	—	—	—
TDP	200 W	700 W	300 W	19 W	350 W	1,000 W
Launch Date	Aug 2021	Sep 2022	Sep 2022	Mar 2014	Sep 2020	May 2023

Efficiency Metrics

Metric	Kunlun II	H100 SXM	RTX 6000 Ada Generation	GeForce GT 710	GeForce RTX 3090	GH200
TFLOPs per Watt (FP32-eq)	—	1.41	0.30	0.02	0.10	0.49
Memory Bandwidth per GB of VRAM	—	41.9 GB/s	20.0 GB/s	7.2 GB/s	39.0 GB/s	41.7 GB/s
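The bandwidth-per-GB row can be reproduced by dividing memory bandwidth by VRAM capacity from the Specifications table. The sketch below assumes that simple ratio; the name `bandwidth_per_gb` is hypothetical and chosen for illustration. Kunlun II is left out because the table lists no bandwidth for it.

```python
def bandwidth_per_gb(bandwidth_gb_s, vram_gb):
    """Memory bandwidth (GB/s) available per GB of VRAM."""
    if bandwidth_gb_s is None or not vram_gb:
        return None  # missing spec, shown as a dash on the page
    return bandwidth_gb_s / vram_gb


# (memory bandwidth in GB/s, VRAM in GB), from the Specifications table.
specs = {
    "H100 SXM": (3350, 80),
    "RTX 6000 Ada Generation": (960, 48),
    "GeForce GT 710": (14.4, 2),
    "GeForce RTX 3090": (936, 24),
    "GH200": (4000, 96),
}

ratios = {name: round(bandwidth_per_gb(bw, vram), 1) for name, (bw, vram) in specs.items()}
for name, value in ratios.items():
    print(f"{name}: {value} GB/s per GB")
```

Rounded to one decimal place, these values agree with the row above.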

Performance Equivalence

How many units of each XPU are needed to match one unit of the others? Values below 1.00x mean a single unit already exceeds the target on that metric.

To match 1x Baidu Kunlun II

NVIDIA H100 SXM

VRAM

0.40x

H100 SXM has 2.50x more

NVIDIA RTX 6000 Ada Generation

VRAM

0.67x

RTX 6000 Ada Generation has 1.50x more

NVIDIA GeForce GT 710

VRAM

16.00x

Need 16.00x GeForce GT 710

NVIDIA GeForce RTX 3090

VRAM

1.33x

Need 1.33x GeForce RTX 3090

NVIDIA GH200

VRAM

0.33x

GH200 has 3.00x more

To match 1x NVIDIA H100 SXM

Baidu Kunlun II

VRAM

2.50x

Need 2.50x Kunlun II

NVIDIA RTX 6000 Ada Generation

Compute (FP32-eq)

10.84x

Need 10.84x RTX 6000 Ada Generation

FP32 Compute

0.74x

RTX 6000 Ada Generation is 1.36x faster

VRAM

1.67x

Need 1.67x RTX 6000 Ada Generation

Memory Bandwidth

3.49x

Need 3.49x RTX 6000 Ada Generation

NVIDIA GeForce GT 710

Compute (FP32-eq)

2703.55x

Need 2703.55x GeForce GT 710

FP32 Compute

183.06x

Need 183.06x GeForce GT 710

VRAM

40.00x

Need 40.00x GeForce GT 710

Memory Bandwidth

232.64x

Need 232.64x GeForce GT 710

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

27.87x

Need 27.87x GeForce RTX 3090

FP32 Compute

1.88x

Need 1.88x GeForce RTX 3090

VRAM

3.33x

Need 3.33x GeForce RTX 3090

Memory Bandwidth

3.58x

Need 3.58x GeForce RTX 3090

NVIDIA GH200

Compute (FP32-eq)

2.00x

Need 2.00x GH200

FP32 Compute

1.00x

GH200 is equal (1.00x)

VRAM

0.83x

GH200 has 1.20x more

Memory Bandwidth

0.84x

GH200 has 1.19x more

To match 1x NVIDIA RTX 6000 Ada Generation

Baidu Kunlun II

VRAM

1.50x

Need 1.50x Kunlun II

NVIDIA H100 SXM

Compute (FP32-eq)

0.09x

H100 SXM is 10.84x faster

FP32 Compute

1.36x

Need 1.36x H100 SXM

VRAM

0.60x

H100 SXM has 1.67x more

Memory Bandwidth

0.29x

H100 SXM has 3.49x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

249.32x

Need 249.32x GeForce GT 710

FP32 Compute

248.91x

Need 248.91x GeForce GT 710

VRAM

24.00x

Need 24.00x GeForce GT 710

Memory Bandwidth

66.67x

Need 66.67x GeForce GT 710

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

2.57x

Need 2.57x GeForce RTX 3090

FP32 Compute

2.56x

Need 2.56x GeForce RTX 3090

VRAM

2.00x

Need 2.00x GeForce RTX 3090

Memory Bandwidth

1.03x

Need 1.03x GeForce RTX 3090

NVIDIA GH200

Compute (FP32-eq)

0.18x

GH200 is 5.42x faster

FP32 Compute

1.36x

Need 1.36x GH200

VRAM

0.50x

GH200 has 2.00x more

Memory Bandwidth

0.24x

GH200 has 4.17x more

To match 1x NVIDIA GeForce GT 710

Baidu Kunlun II

VRAM

0.06x

Kunlun II has 16.00x more

NVIDIA H100 SXM

Compute (FP32-eq)

<0.01x

H100 SXM is 2703.55x faster

FP32 Compute

0.01x

H100 SXM is 183.06x faster

VRAM

0.03x

H100 SXM has 40.00x more

Memory Bandwidth

<0.01x

H100 SXM has 232.64x more

NVIDIA RTX 6000 Ada Generation

Compute (FP32-eq)

<0.01x

RTX 6000 Ada Generation is 249.32x faster

FP32 Compute

<0.01x

RTX 6000 Ada Generation is 248.91x faster

VRAM

0.04x

RTX 6000 Ada Generation has 24.00x more

Memory Bandwidth

0.02x

RTX 6000 Ada Generation has 66.67x more

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

0.01x

GeForce RTX 3090 is 96.99x faster

FP32 Compute

0.01x

GeForce RTX 3090 is 97.27x faster

VRAM

0.08x

GeForce RTX 3090 has 12.00x more

Memory Bandwidth

0.02x

GeForce RTX 3090 has 65.00x more

NVIDIA GH200

Compute (FP32-eq)

<0.01x

GH200 is 1351.09x faster

FP32 Compute

0.01x

GH200 is 183.06x faster

VRAM

0.02x

GH200 has 48.00x more

Memory Bandwidth

<0.01x

GH200 has 277.78x more

To match 1x NVIDIA GeForce RTX 3090

Baidu Kunlun II

VRAM

0.75x

Kunlun II has 1.33x more

NVIDIA H100 SXM

Compute (FP32-eq)

0.04x

H100 SXM is 27.87x faster

FP32 Compute

0.53x

H100 SXM is 1.88x faster

VRAM

0.30x

H100 SXM has 3.33x more

Memory Bandwidth

0.28x

H100 SXM has 3.58x more

NVIDIA RTX 6000 Ada Generation

Compute (FP32-eq)

0.39x

RTX 6000 Ada Generation is 2.57x faster

FP32 Compute

0.39x

RTX 6000 Ada Generation is 2.56x faster

VRAM

0.50x

RTX 6000 Ada Generation has 2.00x more

Memory Bandwidth

0.97x

RTX 6000 Ada Generation has 1.03x more

NVIDIA GeForce GT 710

Compute (FP32-eq)

96.99x

Need 96.99x GeForce GT 710

FP32 Compute

97.27x

Need 97.27x GeForce GT 710

VRAM

12.00x

Need 12.00x GeForce GT 710

Memory Bandwidth

65.00x

Need 65.00x GeForce GT 710

NVIDIA GH200

Compute (FP32-eq)

0.07x

GH200 is 13.93x faster

FP32 Compute

0.53x

GH200 is 1.88x faster

VRAM

0.25x

GH200 has 4.00x more

Memory Bandwidth

0.23x

GH200 has 4.27x more

To match 1x NVIDIA GH200

Baidu Kunlun II

VRAM

3.00x

Need 3.00x Kunlun II

NVIDIA H100 SXM

Compute (FP32-eq)

0.50x

H100 SXM is 2.00x faster

FP32 Compute

1.00x

H100 SXM is equal (1.00x)

VRAM

1.20x

Need 1.20x H100 SXM

Memory Bandwidth

1.19x

Need 1.19x H100 SXM

NVIDIA RTX 6000 Ada Generation

Compute (FP32-eq)

5.42x

Need 5.42x RTX 6000 Ada Generation

FP32 Compute

0.74x

RTX 6000 Ada Generation is 1.36x faster

VRAM

2.00x

Need 2.00x RTX 6000 Ada Generation

Memory Bandwidth

4.17x

Need 4.17x RTX 6000 Ada Generation

NVIDIA GeForce GT 710

Compute (FP32-eq)

1351.09x

Need 1351.09x GeForce GT 710

FP32 Compute

183.06x

Need 183.06x GeForce GT 710

VRAM

48.00x

Need 48.00x GeForce GT 710

Memory Bandwidth

277.78x

Need 277.78x GeForce GT 710

NVIDIA GeForce RTX 3090

Compute (FP32-eq)

13.93x

Need 13.93x GeForce RTX 3090

FP32 Compute

1.88x

Need 1.88x GeForce RTX 3090

VRAM

4.00x

Need 4.00x GeForce RTX 3090

Memory Bandwidth

4.27x

Need 4.27x GeForce RTX 3090
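The equivalence figures in this section appear to be plain ratios of the target's spec to the candidate's spec, with the reciprocal shown when the candidate is ahead. A minimal sketch under that assumption follows; `units_needed` and `describe` are hypothetical helper names, and the wording for the equal case is a choice made here rather than the page's.

```python
def units_needed(target, candidate):
    """Units of the candidate needed to equal one unit of the target on one metric."""
    if target is None or candidate is None or candidate <= 0:
        return None  # missing spec, shown as a dash on the page
    return target / candidate


def describe(candidate_name, ratio):
    """Phrase a ratio in the style used by the comparison cards."""
    if ratio is None:
        return "no data"
    if ratio > 1:
        return f"Need {ratio:.2f}x {candidate_name}"
    if ratio < 1:
        # The candidate is ahead, so report the reciprocal as its advantage.
        return f"{candidate_name} has {1 / ratio:.2f}x more"
    return f"{candidate_name} is equal"


# VRAM in GB, from the Specifications table; the target is one H100 SXM.
vram_gb = {"H100 SXM": 80, "GeForce RTX 3090": 24, "GH200": 96}

need_3090 = units_needed(vram_gb["H100 SXM"], vram_gb["GeForce RTX 3090"])
need_gh200 = units_needed(vram_gb["H100 SXM"], vram_gb["GH200"])

print(describe("GeForce RTX 3090", need_3090))  # Need 3.33x GeForce RTX 3090
print(describe("GH200", need_gh200))            # GH200 has 1.20x more
```

Each ratio covers one metric in isolation, so matching VRAM with 3.33 cards says nothing about whether compute or bandwidth also match.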

Pricing

Price Type	Kunlun II	H100 SXM	RTX 6000 Ada Generation	GeForce GT 710	GeForce RTX 3090	GH200
CAPEX (Street Price)	—	$30,000	—	—	—	—
OPEX (per hour)	—	$3.50/hr	$0.33/hr	$0.07/hr	$0.11/hr	$1.49/hr
Price per TFLOPs (FP32-eq)	—	$30	—	—	—	—
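The price-per-TFLOPs row is presumably the street price divided by a throughput figure. The page does not state which figure it uses: dividing $30,000 by the listed 1,979 TFLOPs would give about $15, whereas 989.5 TFLOPs (half the listed value) gives about $30. The sketch below therefore assumes 989.5, and that choice is an inference from the numbers, not something the page confirms. `price_per_tflop` is a hypothetical name.

```python
def price_per_tflop(price_usd, tflops):
    """Street price divided by throughput; None when either input is missing."""
    if price_usd is None or not tflops:
        return None
    return price_usd / tflops


H100_STREET_PRICE_USD = 30_000  # CAPEX row above
H100_ASSUMED_TFLOPS = 989.5     # assumption: half of the listed 1,979 TFLOPs

h100_price_per_tflop = price_per_tflop(H100_STREET_PRICE_USD, H100_ASSUMED_TFLOPS)
print(f"${h100_price_per_tflop:.0f} per TFLOP")  # $30 per TFLOP
```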