The World's Most Powerful Data Center GPU
Modern data centers are key to solving some of the world’s most important scientific
and big data challenges using high performance computing (HPC) and artificial intelligence
(AI). The NVIDIA® Tesla® accelerated computing platform gives these modern data centers
the power to accelerate HPC and AI workloads. NVIDIA Pascal GPU-accelerated
servers deliver breakthrough performance with fewer servers, resulting in faster
scientific discoveries and insights at dramatically lower cost.
With over 400 GPU-optimized HPC applications across a broad range of domains, including
10 of the top 10 HPC applications and all deep learning frameworks, every modern
data center can save money with the Tesla platform.
Choose the Right NVIDIA Tesla Solution
NVIDIA® TESLA® P100 for PCIe
World’s most advanced data center accelerator for PCIe-based servers
HPC data centers need to support the ever-growing demands of scientists and researchers
while staying within a tight budget. The old approach of deploying lots of commodity
compute nodes requires huge interconnect overhead that substantially increases costs
without proportionally increasing performance.
NVIDIA Tesla P100 GPU accelerators are the most advanced ever built, powered by
the breakthrough NVIDIA Pascal™ architecture and designed to boost throughput and
save money for HPC and hyperscale data centers. The newest addition to this family,
Tesla P100 for PCIe, enables a single node to replace half a rack of commodity CPU
nodes by delivering lightning-fast performance across a broad range of HPC applications.
MASSIVE LEAP IN PERFORMANCE
SPECIFICATIONS
GPU Architecture: NVIDIA Pascal™
NVIDIA CUDA® Cores: 3584
Double-Precision Performance: 4.7 TeraFLOPS
Single-Precision Performance: 9.3 TeraFLOPS
Half-Precision Performance: 18.7 TeraFLOPS
GPU Memory: 16GB CoWoS HBM2 at 720 GB/s, or 12GB CoWoS HBM2 at 540 GB/s
System Interface: PCIe Gen3
Max Power Consumption: 250 W
ECC: Yes
Thermal Solution: Passive
Form Factor: PCIe Full Height/Length
Compute APIs: CUDA, DirectCompute, OpenCL™, OpenACC
TeraFLOPS measurements with NVIDIA GPU Boost™ technology
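As a sanity check, the peak TeraFLOPS figures follow from core count and clock: each CUDA core can retire one fused multiply-add (two floating-point operations) per cycle, FP16 throughput on Pascal GP100 is twice the FP32 rate, and FP64 is half. A minimal sketch, assuming a GPU Boost clock of roughly 1.3 GHz (the clock figure is our assumption; the datasheet lists only the resulting TFLOPS):

```python
# Peak-throughput sanity check for Tesla P100 for PCIe.
# ASSUMPTION: GPU Boost clock of ~1.30 GHz (not stated in this datasheet).
CUDA_CORES = 3584
BOOST_CLOCK_HZ = 1.30e9          # assumed boost clock
FLOPS_PER_CORE_PER_CYCLE = 2     # one fused multiply-add = 2 FLOPs

fp32_tflops = CUDA_CORES * BOOST_CLOCK_HZ * FLOPS_PER_CORE_PER_CYCLE / 1e12
fp16_tflops = fp32_tflops * 2    # GP100 executes FP16 at 2x the FP32 rate
fp64_tflops = fp32_tflops / 2    # FP64 units at a 1:2 ratio on GP100

print(f"FP32 ~{fp32_tflops:.1f} TFLOPS")  # close to the listed 9.3
print(f"FP16 ~{fp16_tflops:.1f} TFLOPS")  # close to the listed 18.7
print(f"FP64 ~{fp64_tflops:.1f} TFLOPS")  # close to the listed 4.7
```

The same arithmetic with a higher boost clock reproduces the SXM2 board's 10.6/21.2/5.3 TFLOPS figures below.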
NVIDIA® TESLA® P100 with NVLINK
Infinite compute power for the modern data center
Artificial intelligence for self-driving cars. Predicting our climate’s future.
A new drug to treat cancer. The world’s most important challenges require tremendous amounts
of computing to become reality. But today’s data centers rely on many interconnected
commodity compute nodes, limiting the performance needed to drive important HPC
and hyperscale workloads.
The NVIDIA Tesla P100 is the most advanced data center accelerator ever built, leveraging
the groundbreaking NVIDIA Pascal™ GPU architecture to deliver the world’s fastest
compute node. It’s powered by four innovative technologies with huge jumps in performance
for HPC and deep learning workloads.
The Tesla P100 also features NVIDIA NVLink™ technology that enables superior strong-scaling
performance for HPC and hyperscale applications. Up to eight Tesla P100 GPUs interconnected
in a single node can deliver the performance of racks of commodity CPU servers.
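One way to see why interconnect bandwidth drives strong-scaling performance is a back-of-the-envelope ring all-reduce model, a standard pattern for synchronizing gradients across GPUs: each of N GPUs moves roughly 2(N-1)/N times the buffer size over its slowest link. The bandwidth and buffer figures below are illustrative assumptions, not numbers from this datasheet:

```python
# Back-of-the-envelope ring all-reduce timing across 8 GPUs.
# ASSUMPTIONS (not from this datasheet): ~80 GB/s aggregate NVLink
# bandwidth per GPU vs ~16 GB/s for a PCIe 3.0 x16 link.
def ring_allreduce_seconds(data_bytes, n_gpus, link_gb_per_s):
    """A ring all-reduce moves ~2*(N-1)/N of the data through each GPU."""
    traffic = 2 * (n_gpus - 1) / n_gpus * data_bytes
    return traffic / (link_gb_per_s * 1e9)

grad_bytes = 250e6  # e.g. a 250 MB gradient buffer (illustrative size)
t_nvlink = ring_allreduce_seconds(grad_bytes, 8, 80)
t_pcie = ring_allreduce_seconds(grad_bytes, 8, 16)
print(f"NVLink: {t_nvlink*1e3:.1f} ms, PCIe: {t_pcie*1e3:.1f} ms "
      f"({t_pcie/t_nvlink:.0f}x faster over NVLink)")
```

Under these assumptions the exchange is bandwidth-bound, so the speedup tracks the bandwidth ratio of the two interconnects.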
TESLA P100 AND NVLINK DELIVER UP TO A 50X PERFORMANCE BOOST FOR DATA CENTER APPLICATIONS
SPECIFICATIONS
GPU Architecture: NVIDIA Pascal™
NVIDIA CUDA® Cores: 3584
Double-Precision Performance: 5.3 TeraFLOPS
Single-Precision Performance: 10.6 TeraFLOPS
Half-Precision Performance: 21.2 TeraFLOPS
GPU Memory: 16GB CoWoS HBM2
Memory Bandwidth: 732 GB/s
Interconnect: NVIDIA NVLink
Max Power Consumption: 300 W
ECC: Native support with no capacity or performance overhead
Thermal Solution: Passive
Form Factor: SXM2
Compute APIs: CUDA, DirectCompute, OpenCL™, OpenACC
TeraFLOPS measurements with NVIDIA GPU Boost™ technology
NVIDIA® TESLA® P40
Experience maximum inference throughput
In the new era of AI and intelligent machines, deep learning is shaping our world
like no other computing model in history. GPUs powered by the revolutionary NVIDIA
Pascal™ architecture provide the computational engine for the new era of artificial
intelligence, enabling amazing user experiences by accelerating deep learning applications
at scale.
The NVIDIA Tesla P40 is purpose-built to deliver maximum throughput for deep learning
deployment. With 47 TOPS (Tera-Operations Per Second) of INT8 inference performance
per GPU, a single server with 8 Tesla P40s delivers the performance of over 140 CPU
servers.
As models increase in accuracy and complexity, CPUs are no longer capable of delivering
an interactive user experience. The Tesla P40 delivers over 30X lower latency than
a CPU for real-time responsiveness in even the most complex models.
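The INT8 path behind these throughput numbers works by quantizing floating-point weights and activations to 8-bit integers, doing the heavy arithmetic in integer math, and rescaling the result once at the end. A minimal symmetric-quantization sketch of the idea (illustrative only; not the Tesla P40's or TensorRT's actual pipeline, and the weight/activation values are made up):

```python
# Minimal symmetric INT8 quantization sketch (illustrative only):
# a dot product computed in 8-bit integer math, rescaled at the end.
def quantize(xs):
    """Map floats onto int8 range [-127, 127] with a per-tensor scale."""
    scale = max(abs(x) for x in xs) / 127.0
    return [round(x / scale) for x in xs], scale

w = [0.8, -1.2, 0.3, 2.0]   # hypothetical weights
a = [1.5, 0.5, -0.7, 1.0]   # hypothetical activations

qw, sw = quantize(w)
qa, sa = quantize(a)

# Integer multiply-accumulate, then one floating-point rescale.
int_dot = sum(x * y for x, y in zip(qw, qa))
approx = int_dot * sw * sa
exact = sum(x * y for x, y in zip(w, a))
print(f"exact={exact:.4f}  int8 approx={approx:.4f}")
```

Because the inner loop is pure 8-bit integer multiply-accumulate, hardware can pack several such operations into one instruction, which is where the TOPS figures in the table below come from.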
SPECIFICATIONS
GPU Architecture: NVIDIA Pascal™
Single-Precision Performance: 12 TeraFLOPS*
Integer Operations (INT8): 47 TOPS* (Tera-Operations per Second)
GPU Memory: 24GB
Memory Bandwidth: 346 GB/s
System Interface: PCI Express 3.0 x16
Form Factor: 4.4" H x 10.5" L, Dual Slot, Full Height
Max Power Consumption: 250 W
Enhanced Programmability with Page Migration Engine: Yes
ECC: Yes
Server-Optimized for Data Center Deployment: Yes
Hardware-Accelerated Video Engine: 1x Decode Engine, 2x Encode Engine
* With Boost Clock enabled
NVIDIA® TESLA® P4
Ultra-efficient deep learning in scale-out servers
In the new era of AI and intelligent machines, deep learning is shaping our world
like no other computing model in history. Interactive speech, visual search, and
video recommendations are a few of many AI-based services that we use every day.
Accuracy and responsiveness are key to user adoption for these services. As deep
learning models increase in accuracy and complexity, CPUs are no longer capable
of delivering a responsive user experience.
The NVIDIA Tesla P4 is powered by the revolutionary NVIDIA Pascal™ architecture
and purpose-built to boost efficiency for scale-out servers running deep learning
workloads, enabling smart responsive AI-based services. It slashes inference latency
by 15X in any hyperscale infrastructure and provides an incredible 60X better energy
efficiency than CPUs. This unlocks a new wave of AI services previously impossible
due to latency limitations.
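The efficiency positioning can be put in concrete terms using only figures from this document: at its 50 W power setting the P4 delivers more INT8 throughput per watt than the larger P40. A quick calculation:

```python
# Inference throughput per watt, using the figures in this datasheet.
p4_tops, p4_watts = 22, 50      # Tesla P4 at its 50 W power setting
p40_tops, p40_watts = 47, 250   # Tesla P40

p4_eff = p4_tops / p4_watts     # TOPS per watt
p40_eff = p40_tops / p40_watts
print(f"P4:  {p4_eff:.2f} TOPS/W")   # 0.44
print(f"P40: {p40_eff:.2f} TOPS/W")  # 0.19
print(f"P4 is ~{p4_eff / p40_eff:.1f}x more power-efficient")
```

That is why the P40 targets maximum per-server throughput while the P4 targets density and power-constrained scale-out deployments.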
SPECIFICATIONS
GPU Architecture: NVIDIA Pascal™
Single-Precision Performance: 5.5 TeraFLOPS*
Integer Operations (INT8): 22 TOPS* (Tera-Operations per Second)
GPU Memory: 8GB
Memory Bandwidth: 192 GB/s
System Interface: Low-Profile PCI Express Form Factor
Max Power Consumption: 50 W / 75 W
Enhanced Programmability with Page Migration Engine: Yes
ECC: Yes
Server-Optimized for Data Center Deployment: Yes
Hardware-Accelerated Video Engine: 1x Decode Engine, 2x Encode Engine
* With Boost Clock enabled