The Amazon Elastic Compute Cloud (Amazon EC2) accelerator portfolio offers the broadest selection of accelerators to power artificial intelligence (AI), machine learning (ML), graphics, and high-performance computing (HPC) workloads. We’re pleased to announce the expansion of this portfolio with three new instance types featuring NVIDIA’s latest GPUs: Amazon EC2 P5e instances powered by NVIDIA H200 GPUs, Amazon EC2 G6 instances featuring NVIDIA L4 GPUs, and Amazon EC2 G6e instances powered by NVIDIA L40S GPUs. All three instance types will be available in 2024, and we can’t wait to see what you can do with them.
AWS and NVIDIA have partnered for more than 13 years and have pioneered large-scale, high-performance, and cost-effective GPU-based solutions for developers and enterprises across the spectrum. We have combined NVIDIA’s powerful GPUs with differentiated AWS technologies such as the AWS Nitro System, 3,200 Gbps of Elastic Fabric Adapter (EFA) v2 networking, hundreds of GB/s of throughput with Amazon FSx for Lustre, and exascale computing with Amazon EC2 UltraClusters to deliver the most efficient infrastructure for AI/ML, graphics, and HPC. Combined with other managed services such as Amazon Bedrock, Amazon SageMaker, and Amazon Elastic Kubernetes Service (Amazon EKS), these instances provide developers with the industry’s best platform for building and deploying generative AI, HPC, and graphics applications.
High-performance and cost-effective GPU-based instances for AI, HPC, and graphics workloads
To power the development, training, and inference of the largest large language models (LLMs), EC2 P5e instances will feature NVIDIA’s latest H200 GPUs, which offer 141 GB of HBM3e GPU memory, 1.7x larger and 1.4x faster than that of H100 GPUs. This boost in GPU memory, along with up to 3,200 Gbps of EFA networking enabled by the AWS Nitro System, will allow you to continue building, training, and deploying your cutting-edge models on AWS.
EC2 G6e instances, featuring NVIDIA L40S GPUs, are built to provide developers with a widely available option for training and inference of publicly available LLMs, as well as to support the growing adoption of small language models (SLMs). They are also optimal for digital twin applications that use NVIDIA Omniverse to design and simulate 3D tools and applications, and to create virtual worlds and advanced workflows for industrial digitalization.
EC2 G6 instances, featuring NVIDIA L4 GPUs, will provide a low-cost, energy-efficient option for deploying ML models for natural language processing, language translation, video and image analysis, speech recognition, and personalization, as well as graphics workloads such as real-time, cinema-quality rendering and game streaming.
About the Author
Chetan Kapoor is the Director of Product Management for the Amazon EC2 Accelerated Computing Portfolio.