Bring accelerated performance to every enterprise workload with NVIDIA A30 Tensor Core GPUs. With NVIDIA Ampere architecture Tensor Cores and Multi-Instance GPU (MIG), the A30 delivers speedups securely across diverse workloads, including AI inference at scale and high-performance computing (HPC) applications. By combining fast memory bandwidth and low power consumption in a PCIe form factor that is optimal for mainstream servers, the A30 enables an elastic data center and delivers maximum value for enterprises.
Ships in 7 days after payment. All sales final. No returns or cancellations. For volume pricing, consult a live chat agent or call our toll-free number.
The NVIDIA Ampere architecture is part of the unified NVIDIA EGX™ platform, incorporating building blocks across hardware, networking, software, libraries, and optimized AI models and applications from the NVIDIA NGC™ catalog. Representing the most powerful end-to-end AI and HPC platform for data centers, it allows researchers to rapidly deliver real-world results and deploy solutions into production at scale.
Training AI models for next-level challenges such as conversational AI requires massive compute power and scalability.
NVIDIA A30 Tensor Cores with Tensor Float (TF32) provide up to 10X higher performance over the NVIDIA T4 with zero code changes and an additional 2X boost with automatic mixed precision and FP16, delivering a combined 20X throughput increase. When combined with NVIDIA® NVLink®, PCIe Gen4, NVIDIA networking, and the NVIDIA Magnum IO™ SDK, it’s possible to scale to thousands of GPUs.
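The combined figure is simply the product of the two quoted factors. A minimal sketch of that arithmetic follows; the helper name `combined_speedup` is hypothetical, and the inputs are the "up to" marketing figures from the paragraph above, not measured results.

```python
def combined_speedup(*factors: float) -> float:
    """Multiply independent speedup factors into one overall factor.

    Hypothetical helper for illustration only; it assumes the
    factors compound multiplicatively, as the text implies.
    """
    total = 1.0
    for factor in factors:
        total *= factor
    return total


tf32_vs_t4 = 10.0  # "up to 10X" from TF32 Tensor Cores, per the text
amp_fp16 = 2.0     # "additional 2X" from automatic mixed precision and FP16

total = combined_speedup(tf32_vs_t4, amp_fp16)
print(f"combined: up to {total:.0f}X")
```

Because both inputs are upper bounds, the result is also an upper bound; real workloads may see less.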
Tensor Cores and MIG enable A30 to be used for workloads dynamically throughout the day. It can be used for production inference at peak demand, and part of the GPU can be repurposed to rapidly re-train those very same models during off-peak hours.
NVIDIA set multiple performance records in MLPerf, the industry-wide benchmark for AI training.
FP64 | 5.2 teraFLOPS |
FP64 Tensor Core | 10.3 teraFLOPS |
FP32 | 10.3 teraFLOPS |
TF32 Tensor Core | 82 teraFLOPS | 165 teraFLOPS* |
BFLOAT16 Tensor Core | 165 teraFLOPS | 330 teraFLOPS* |
FP16 Tensor Core | 165 teraFLOPS | 330 teraFLOPS* |
INT8 Tensor Core | 330 TOPS | 661 TOPS* |
INT4 Tensor Core | 661 TOPS | 1321 TOPS* |
Media engines | 1 optical flow accelerator (OFA); 1 JPEG decoder (NVJPEG); 4 video decoders (NVDEC) |
GPU memory | 24GB HBM2 |
GPU memory bandwidth | 933GB/s |
Interconnect | PCIe Gen4: 64GB/s; third-gen NVLink: 200GB/s** |
Form factor | Dual-slot, full-height, full-length (FHFL) |
Max thermal design power (TDP) | 165W |
Multi-Instance GPU (MIG) | 4 GPU instances @ 6GB each; 2 GPU instances @ 12GB each; 1 GPU instance @ 24GB |
Virtual GPU (vGPU) software support | NVIDIA AI Enterprise; NVIDIA Virtual Compute Server |
* With sparsity
** NVLink Bridge for up to two GPUs
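As a quick sanity check on the table, the sketch below confirms that each MIG layout fills the card's memory and that the starred Tensor Core figures are roughly double the unstarred ones. It rests on two assumptions: total memory is 24GB, as implied by the single-instance MIG row, and the starred column is the with-sparsity rate. All names in the code are illustrative.

```python
# Assumption: total GPU memory is 24GB, inferred from the
# "1 GPU instance @ 24GB" entry in the MIG row.
TOTAL_MEMORY_GB = 24

# (instance count, memory per instance in GB), from the MIG row.
MIG_LAYOUTS = [(4, 6), (2, 12), (1, 24)]

# (unstarred, starred) Tensor Core rates from the table, in
# teraFLOPS or TOPS. Assumption: the starred value is with sparsity.
TENSOR_CORE_RATES = {
    "TF32": (82, 165),
    "BFLOAT16": (165, 330),
    "FP16": (165, 330),
    "INT8": (330, 661),
    "INT4": (661, 1321),
}


def layout_memory_gb(count: int, per_instance_gb: int) -> int:
    """Total memory consumed by a MIG layout."""
    return count * per_instance_gb


def sparsity_gain(dense: float, sparse: float) -> float:
    """Ratio of the starred rate to the unstarred rate."""
    return sparse / dense


for count, size in MIG_LAYOUTS:
    used = layout_memory_gb(count, size)
    print(f"{count} x {size}GB = {used}GB (fits: {used <= TOTAL_MEMORY_GB})")

for name, (dense, sparse) in TENSOR_CORE_RATES.items():
    print(f"{name}: {sparsity_gain(dense, sparse):.2f}x")
```

The ratios are not exactly 2 because the published figures are rounded.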