re:Invent Recap 2024

by Steven Tan, Senior Hyperscale Platforms Engineer at AC3

Another year, another re:Invent, with plenty of announcements to satisfy the needs of a growing industry. This year's focus was clearly on improving the Generative Artificial Intelligence (GenAI) capabilities within AWS, but AWS also delivered well outside the GenAI space, announcing a multitude of service and feature improvements across the board.

Generative AI

AWS have been innovating continuously in the GenAI space over the past year, listening to customer feedback on how to improve the developer experience when building GenAI applications. In response, AWS have revamped Amazon SageMaker to provide a single, unified interface and experience.

Amazon SageMaker Unified Studio

As part of this year’s re:Invent, AWS introduced multiple new capabilities with the announcement of the next generation of Amazon SageMaker, called Amazon SageMaker Unified Studio. The next generation of SageMaker is intended to act as a unified platform for all things data, analytics and AI. It brings various tools together under a single umbrella to provide a seamless experience for engineers building AI capabilities into their services and offerings.

Amazon SageMaker Lakehouse

In addition to SageMaker Unified Studio, Amazon SageMaker Lakehouse was also announced. It helps data engineers unify data across Amazon S3 data lakes, Amazon Redshift data warehouses and third-party data sources.

Amazon Bedrock Model Distillation

Amazon Bedrock Model Distillation allows you to use smaller, faster and more cost-effective models that deliver use-case-specific accuracy comparable to the largest and most capable models in Amazon Bedrock. Model distillation is the process of transferring knowledge from a more capable model to a smaller one, producing a faster and cheaper model that performs on par with the original for a specific use case. Distilled models in Amazon Bedrock are up to 500% faster and up to 75% less expensive than their original models, with less than 2% accuracy loss for RAG use cases.
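To put those headline numbers in perspective, here is a back-of-envelope sketch that converts the best-case figures ("up to 500% faster, up to 75% less expensive") into latency and cost for a hypothetical workload. The input latency and price are made-up illustrations, not published Bedrock pricing, and "500% faster" is read here as a 5x speedup.

```python
# Back-of-envelope comparison of a distilled model vs. its teacher model,
# using the best-case figures quoted above. All input numbers are
# illustrative assumptions, not real Bedrock metrics or prices.

def distilled_estimate(orig_latency_ms: float, orig_cost_per_1k: float,
                       speedup: float = 5.0, cost_reduction: float = 0.75):
    """Return (latency_ms, cost_per_1k_requests) for the distilled model."""
    return orig_latency_ms / speedup, orig_cost_per_1k * (1 - cost_reduction)

# A teacher model at 1200 ms / $2.00 per 1k requests would, at best, distil
# down to 240 ms / $0.50 per 1k requests.
latency_ms, cost = distilled_estimate(1200.0, 2.00)
print(latency_ms, cost)  # 240.0 0.5
```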

Amazon Bedrock Multi-Agent Collaboration

Multi-agent collaboration within Amazon Bedrock allows you to build, deploy and manage multiple AI agents that work together on complex, multi-step tasks requiring specialised skills. Handling such tasks typically means coordinating several specialised agents, each covering a different aspect of the process. Amazon Bedrock now offers a managed solution that handles the collaboration, communication and task delegation behind the scenes, helping agents work together to achieve higher task success rates, accuracy and productivity.

Amazon Nova

Amazon Nova is a new generation of foundation models built by Amazon on its Inferentia and Trainium chips, able to process text, images and video as prompts. Users can take advantage of these capabilities to understand videos, charts and documents, or to generate videos and other multimedia content.

Amazon Nova offers this new generation of models in two flavours:

  • Understanding models: accept text, image or video inputs and generate text output
  • Creative content generation models: accept text and image inputs and generate image or video output with precise control over style and content

These models are available on Amazon Bedrock as:

Image 1: Amazon Bedrock

As a quick test to show its capabilities, I uploaded the photo in the “EC2 I8g Instances” section of this blog post and asked Bedrock to describe it to me.

Image 2: EC2 I8G Instances

Compute

In the compute space, AWS have introduced new EC2 instance types to assist with various workload types, from machine learning to storage.

Trn2 Instances and Trn2 UltraServers

The new EC2 Trn2 instances and UltraServers are the newest generation of Trainium instance types offered by AWS, powered by the AWS Trainium2 chips, currently the most powerful compute option for ML training. Compared to the previous generation (Trn1), the Trainium2 chips are 4x faster, with 4x more memory bandwidth and 3x more memory capacity. The Trn2 UltraServers are a completely new compute offering, allowing you to scale up to 64 Trainium2 chips connected with a high-bandwidth, low-latency NeuronLink interconnect.

EC2 I8g Instances

The I8g instances are a new storage-optimised instance type, taking advantage of the third generation of AWS Nitro SSDs and AWS Graviton4 processors. This new instance type offers up to 96 vCPUs, 768 GiB of memory and 22.5 TB of NVMe storage, delivering up to 60% better compute performance and 2x larger caches compared to the I4g instance type. The I8g instances are well suited to I/O-intensive workloads that require low-latency access to data.

Image 3: Steven with a Graviton4

EC2 P5en Instances

P5en EC2 instances are powered by NVIDIA H200 Tensor Core GPUs and a custom 4th generation Intel Xeon Scalable processor with an all-core turbo frequency of 3.2 GHz, utilising the latest Elastic Fabric Adapter (EFAv3) on Nitro v5. These processors offer 50% higher memory bandwidth and up to 4x the throughput between CPU and GPU with PCIe Gen5, and show up to 35% lower latency compared to the P5 instances, which use the previous generation of EFA and Nitro. P5en instances also increase local storage performance by up to 2x and Amazon EBS bandwidth by up to 25% compared with the P5 instances. They are well positioned to improve latency for model training, fine-tuning and inference for complex large language models and multimodal foundation models, as well as memory-intensive HPC applications.

Database

Amazon Aurora DSQL

Amazon Aurora DSQL is a new serverless, active-active, multi-region distributed database service that is PostgreSQL compatible. Aurora DSQL was designed from the ground up to run in either a single-region or multi-region configuration, splitting the various roles of a database service into individual microservices that scale up and down independently and on demand. Thanks to its serverless design, Aurora DSQL eliminates the operational burden of patching, upgrades, scheduled maintenance downtime and service failovers during scaling events.

In a single-region configuration, Aurora DSQL commits all write transactions to a distributed transaction log across three Availability Zones, providing 99.99% availability for your database. In a multi-region configuration, Aurora DSQL commits all write transactions to the distributed transaction log within your local region, then synchronously replicates the writes to the other linked regions, providing strongly consistent reads and writes from any linked cluster. A third region acts as a witness region, receiving the data written to the linked clusters to provide multi-region durability and availability. In this configuration, DSQL is designed for 99.999% availability.

Amazon MemoryDB Multi-Region

AWS announced the general availability of the Amazon MemoryDB Multi-Region capabilities, allowing developers to take advantage of an active-active, multi-region database with 99.999% availability, microsecond read and single-digit millisecond write latencies across multiple AWS regions. MemoryDB Multi-Region is currently available for Valkey, an open-source drop-in replacement for Redis.

Storage

In the storage space, many changes were announced, mostly related to S3 buckets and new capabilities that extend the way you can interact with objects.

Queryable Object Metadata for Amazon S3 Buckets

Amazon S3 buckets allow users to store objects at large scale, but finding those objects can prove challenging after a certain point. To help you find objects by their metadata, AWS have introduced queryable object metadata. This allows Iceberg-compatible tools such as Amazon Athena, Amazon Redshift, Amazon QuickSight and Apache Spark to easily and efficiently query object metadata at any scale, helping you find the data you need.

The metadata schema contains elements including, but not limited to, the bucket name, object key, creation/modification time, storage class, encryption status, tags and user metadata. Additional application-specific information can be stored in a separate table and joined with the metadata table as part of your query.
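As a rough illustration of that join, the sketch below builds the kind of SQL you might hand to an Athena-style engine. The table and column names (`object_key`, `owner_team`, and the two table names) are assumptions for illustration only; check the S3 Metadata schema documentation for the exact names in your account.

```python
# Sketch of an SQL query joining an S3 Metadata table with a separate
# application-specific table. Table and column names are hypothetical.

def build_metadata_query(metadata_table: str, app_table: str) -> str:
    """Join object metadata with an application table on the object key."""
    return (
        f"SELECT m.bucket, m.key, m.storage_class, a.owner_team "
        f"FROM {metadata_table} m "
        f"JOIN {app_table} a ON m.key = a.object_key "
        f"WHERE m.storage_class = 'GLACIER'"
    )

sql = build_metadata_query("my_bucket_metadata", "app_object_owners")
print(sql)
```

The resulting string could then be submitted through a query engine such as Athena; the point is that object metadata becomes just another table you can filter and join against.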

Amazon S3 Tables

Amazon S3 Tables is a new type of S3 bucket, alongside the general purpose and directory buckets. S3 Tables is optimised to store tabular data in the Apache Iceberg format, allowing queries from engines like Amazon Athena, Amazon EMR and Apache Spark. This new bucket type lets you treat the bucket like an analytics warehouse that can store Iceberg tables with various schemas, with the same durability, availability, scalability and performance as S3 itself.

Similarly to the standard S3 bucket characteristics you are already used to, each table bucket resides in a specific AWS region, is referenced by an ARN and allows configuration of a resource policy. A table bucket must have a unique name within the AWS account and region it is deployed in, and uses namespaces to logically group the tables within it.
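Since every table bucket is addressed by an ARN, a small helper can pull the region, account and bucket name back out of one. The ARN layout assumed below (`arn:aws:s3tables:<region>:<account>:bucket/<name>`) should be verified against the S3 Tables documentation before relying on it.

```python
# Minimal parser for an S3 table bucket ARN. The assumed layout is
#   arn:aws:s3tables:<region>:<account>:bucket/<name>
# -- an assumption based on the usual AWS ARN shape, not a spec quote.

def parse_table_bucket_arn(arn: str) -> dict:
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "s3tables":
        raise ValueError(f"not a table bucket ARN: {arn}")
    resource = parts[5]  # e.g. "bucket/analytics-tables"
    return {
        "region": parts[3],
        "account": parts[4],
        "bucket": resource.split("/", 1)[1],
    }

info = parse_table_bucket_arn(
    "arn:aws:s3tables:ap-southeast-2:123456789012:bucket/analytics-tables")
print(info)
```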

AWS Data Transfer Terminals

AWS Data Transfer Terminals are a new way to transfer data into AWS via a physical location. They aim to provide a secure, upload-ready physical location where you can bring your data and upload it to any AWS endpoint, including but not limited to Amazon S3 and Amazon EFS. This allows you to bring your data to the facility and transfer it to AWS quickly, without waiting for an AWS Snow device or being limited by your internet bandwidth.

The pricing model measures only the time that a port is provisioned for use within the AWS Data Transfer Terminal, regardless of whether data is passing through it. There is no per-GB charge for data transferred via the terminal.
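The time-based model can be sketched in a couple of lines: cost depends only on the hours a port is reserved, never on the volume moved. The hourly rate below is a made-up placeholder, not a published AWS price.

```python
# The Data Transfer Terminal pricing model above is purely time-based.
# hourly_rate is a hypothetical figure for illustration only.

def terminal_cost(hours_reserved: float, hourly_rate: float) -> float:
    """Cost of a port reservation; note there is no per-GB component."""
    return hours_reserved * hourly_rate

# A 4-hour booking at an assumed $300/hour costs the same whether you move
# 1 GB or 100 TB in that window.
print(terminal_cost(4, 300.0))  # 1200.0
```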

Storage Browser for Amazon S3

Storage Browser for Amazon S3 allows developers to build web applications that interact with files stored in S3, via a pre-built AWS Amplify UI React component. Developers using React or a React-based framework can use the component to let users browse, read, write and delete files inside an S3 bucket.

Conclusion

Similarly to last year, this re:Invent had a huge focus on GenAI, bringing us a lot of exciting new features and capabilities to experiment and build with. I look forward to building with some of these new capabilities and blogging about them in the new year!