Deploy a full stack voice AI agent with Amazon Nova Sonic

AI-powered speech solutions are transforming contact centers by enabling natural conversations between customers and AI agents, shortening wait times, and dramatically reducing operational costs—all without sacrificing the human-like interaction customers expect. With the recent launch of Amazon Nova Sonic in Amazon Bedrock, you can now build sophisticated conversational AI agents that communicate naturally through voice, … Read more

Manage multi-tenant Amazon Bedrock costs using application inference profiles

Successful generative AI software as a service (SaaS) systems require a balance between service scalability and cost management. This becomes critical when building a multi-tenant generative AI service designed to serve a large, diverse customer base while maintaining rigorous cost controls and comprehensive usage monitoring. Traditional cost management approaches for such systems often reveal limitations. … Read more

Evaluating generative AI models with Amazon Nova LLM-as-a-Judge on Amazon SageMaker AI

Evaluating the performance of large language models (LLMs) goes beyond statistical metrics like perplexity or bilingual evaluation understudy (BLEU) scores. For most real-world generative AI scenarios, it’s crucial to understand whether a model is producing better outputs than a baseline or an earlier iteration. This is especially important for applications such as summarization, content generation, … Read more

Building cost-effective RAG applications with Amazon Bedrock Knowledge Bases and Amazon S3 Vectors

Vector embeddings have become essential for modern Retrieval Augmented Generation (RAG) applications, but organizations face significant cost challenges as they scale. As knowledge bases grow and require more granular embeddings, many vector databases that rely on high-performance storage such as SSDs or in-memory solutions become prohibitively expensive. This cost barrier often forces organizations to limit … Read more

Implementing on-demand deployment with customized Amazon Nova models on Amazon Bedrock

Amazon Bedrock offers model customization capabilities for customers to tailor versions of foundation models (FMs) to their specific needs through features such as fine-tuning and distillation. Today, we’re announcing the launch of on-demand deployment for customized models ready to be deployed on Amazon Bedrock. On-demand deployment for customized models provides an additional deployment option that … Read more

Building enterprise-scale RAG applications with Amazon S3 Vectors and DeepSeek R1 on Amazon SageMaker AI

Organizations are adopting large language models (LLMs), such as DeepSeek R1, to transform business processes, enhance customer experiences, and drive innovation at unprecedented speed. However, standalone LLMs have key limitations such as hallucinations, outdated knowledge, and no access to proprietary data. Retrieval Augmented Generation (RAG) addresses these gaps by combining semantic search with generative AI, … Read more

Accenture scales video analysis with Amazon Nova and Amazon Bedrock Agents

This post was written with Ilan Geller, Kamal Mannar, Debasmita Ghosh, and Nakul Aggarwal of Accenture. Video highlights offer a powerful way to boost audience engagement and extend content value for content publishers. These short, high-impact clips capture key moments that drive viewer retention, amplify reach across social media, reinforce brand identity, and open new … Read more

Deploy conversational agents with Vonage and Amazon Nova Sonic

This post is co-written with Mark Berkeland, Oscar Rodriguez and Marina Gerzon from Vonage. Voice-based technologies are transforming the way businesses engage with customers across customer support, virtual assistants, and intelligent agents. However, creating real-time, expressive, and highly responsive voice interfaces still requires navigating a complex stack of communication protocols, AI models, and media infrastructure. … Read more

Enabling customers to deliver production-ready AI agents at scale

AI agents will change how we all work and live. Our AWS CEO, Matt Garman, shared a vision of a technological shift as transformative as the advent of the internet. I’m energized by this vision because I’ve witnessed firsthand how these intelligent agent systems are already beginning to solve complex problems, automate workflows, and create … Read more

Amazon Bedrock Knowledge Bases now supports Amazon OpenSearch Service Managed Cluster as vector store

Amazon Bedrock Knowledge Bases has extended its vector store options by enabling support for Amazon OpenSearch Service managed clusters, further strengthening its capabilities as a fully managed Retrieval Augmented Generation (RAG) solution. This enhancement builds on the core functionality of Amazon Bedrock Knowledge Bases , which is designed to seamlessly connect foundation models (FMs) with … Read more

How PayU built a secure enterprise AI assistant using Amazon Bedrock

This is a guest post co-written with Rahul Ghosh, Sandeep Kumar Veerlapati, Rahmat Khan, and Mudit Chopra from PayU. PayU offers a full-stack digital financial services system that serves the financial needs of merchants, banks, and consumers through technology. As a Central Bank-regulated financial institution in India, we recently observed a surge in our employees’ … Read more

Supercharge generative AI workflows with NVIDIA DGX Cloud on AWS and Amazon Bedrock Custom Model Import

This post is co-written with Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA. DGX Cloud on Amazon Web Services (AWS) represents a significant leap forward in democratizing access to high-performance AI infrastructure. By combining NVIDIA GPU expertise with AWS scalable cloud services, organizations can accelerate their time-to-train, reduce operational complexity, and unlock … Read more

Accelerate generative AI inference with NVIDIA Dynamo and Amazon EKS

This post is co-written with Kshitiz Gupta, Wenhan Tan, Arun Raman, Jiahong Liu, and Eiluth Triana Isaza from NVIDIA. As large language models (LLMs) and generative AI applications become increasingly prevalent, the demand for efficient, scalable, and low-latency inference solutions has grown. Traditional inference systems often struggle to meet these demands, especially in distributed, multi-node … Read more

AWS doubles investment in AWS Generative AI Innovation Center, marking two years of customer success

When we launched the AWS Generative AI Innovation Center in 2023, we had one clear goal: help customers turn AI potential into real business value. We’ve already guided thousands of customers across industries from financial services to healthcare—including Formula 1, FOX, GovTech Singapore, Itaú Unibanco, Nasdaq, NFL, RyanAir, and S&P Global—from AI experimentation to full-scale … Read more

Build AI-driven policy creation for vehicle data collection and automation using Amazon Bedrock

Vehicle data is critical for original equipment manufacturers (OEMs) to drive continuous product innovation and performance improvements and to support new value-added services. Similarly, the increasing digitalization of vehicle architectures and adoption of software-configurable functions allow OEMs to add new features and capabilities efficiently. Sonatus’s Collector AI and Automator AI products address these two aspects … Read more

How Rapid7 automates vulnerability risk scores with ML pipelines using Amazon SageMaker AI

This post is cowritten with Jimmy Cancilla from Rapid7. Organizations are managing increasingly distributed systems, which span on-premises infrastructure, cloud services, and edge devices. As systems become interconnected and exchange data, the potential pathways for exploitation multiply, and vulnerability management becomes critical to managing risk. Vulnerability management (VM) is the process of identifying, classifying, prioritizing, … Read more

Build secure RAG applications with AWS serverless data lakes

Data is your generative AI differentiator, and successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. Traditional data architectures often struggle to meet the unique demands of generative such as applications. An effective generative AI data strategy requires several key components like seamless integration of diverse data sources, … Read more

Advanced fine-tuning methods on Amazon SageMaker AI

This post provides the theoretical foundation and practical insights needed to navigate the complexities of LLM development on Amazon SageMaker AI, helping organizations make optimal choices for their specific use cases, resource constraints, and business objectives. We also address the three fundamental aspects of LLM development: the core lifecycle stages, the spectrum of fine-tuning methodologies, … Read more

Streamline machine learning workflows with SkyPilot on Amazon SageMaker HyperPod

This post is co-written with Zhanghao Wu, co-creator of SkyPilot. The rapid advancement of generative AI and foundation models (FMs) has significantly increased computational resource requirements for machine learning (ML) workloads. Modern ML pipelines require efficient systems for distributing workloads across accelerated compute resources, while making sure developer productivity remains high. Organizations need infrastructure solutions … Read more

Intelligent document processing at scale with generative AI and Amazon Bedrock Data Automation

Extracting information from unstructured documents at scale is a recurring business task. Common use cases include creating product feature tables from descriptions, extracting metadata from documents, and analyzing legal contracts, customer reviews, news articles, and more. A classic approach to extracting information from text is named entity recognition (NER). NER identifies entities from predefined categories, … Read more

Build a conversational data assistant, Part 2 – Embedding generative business intelligence with Amazon Q in QuickSight

In Part 1 of this series, we explored how Amazon’s Worldwide Returns & ReCommerce (WWRR) organization built the Returns & ReCommerce Data Assist (RRDA)—a generative AI solution that transforms natural language questions into validated SQL queries using Amazon Bedrock Agents. Although this capability improves data access for technical users, the WWRR organization’s journey toward truly … Read more

Build a conversational data assistant, Part 1: Text-to-SQL with Amazon Bedrock Agents

What if you could replace hours of data analysis with a minute-long conversation? Large language models can transform how we bridge the gap between business questions and actionable data insights. For most organizations, this gap remains stubbornly wide, with business teams trapped in endless cycles—decoding metric definitions and hunting for the correct data sources to … Read more

Implement user-level access control for multi-tenant ML platforms on Amazon SageMaker AI

Managing access control in enterprise machine learning (ML) environments presents significant challenges, particularly when multiple teams share Amazon SageMaker AI resources within a single Amazon Web Services (AWS) account. Although Amazon SageMaker Studio provides user-level execution roles, this approach becomes unwieldy as organizations scale and team sizes grow. Refer to the Operating model whitepaper for … Read more

Long-running execution flows now supported in Amazon Bedrock Flows in public preview

Today, we announce the public preview of long-running execution (asynchronous) flow support within Amazon Bedrock Flows. With Amazon Bedrock Flows, you can link foundation models (FMs), Amazon Bedrock Prompt Management, Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, and other AWS services together to build and scale predefined generative AI workflows. As customers … Read more

Fraud detection empowered by federated learning with the Flower framework on Amazon SageMaker AI

Fraud detection remains a significant challenge in the financial industry, requiring advanced machine learning (ML) techniques to detect fraudulent patterns while maintaining compliance with strict privacy regulations. Traditional ML models often rely on centralized data aggregation, which raises concerns about data security and regulatory constraints. Fraud cost businesses over $485.6 billion in 2023 alone, according … Read more

Building intelligent AI voice agents with Pipecat and Amazon Bedrock – Part 2

Voice AI is changing the way we use technology, allowing for more natural and intuitive conversations. Meanwhile, advanced AI agents can now understand complex questions and act autonomously on our behalf. In Part 1 of this series, you learned how you can use the combination of Amazon Bedrock and Pipecat, an open source framework for … Read more

Uphold ethical standards in fashion using multimodal toxicity detection with Amazon Bedrock Guardrails

The global fashion industry is estimated to be valued at $1.84 trillion in 2025, accounting for approximately 1.63% of the world’s GDP (Statista, 2025). With such massive amounts of generated capital, so too comes the enormous potential for toxic content and misuse. In the fashion industry, teams are frequently innovating quickly, often utilizing AI. Sharing … Read more

New capabilities in Amazon SageMaker AI continue to transform how organizations develop AI models

As AI models become increasingly sophisticated and specialized, the ability to quickly train and customize models can mean the difference between industry leadership and falling behind. That is why hundreds of thousands of customers use the fully managed infrastructure, tools, and workflows of Amazon SageMaker AI to scale and advance AI model development. Since launching … Read more

Accelerate foundation model development with one-click observability in Amazon SageMaker HyperPod

Amazon SageMaker HyperPod now provides a comprehensive, out-of-the-box dashboard that delivers insights into foundation model (FM) development tasks and cluster resources. This unified observability solution automatically publishes key metrics to Amazon Managed Service for Prometheus and visualizes them in Amazon Managed Grafana dashboards, optimized specifically for FM development with deep coverage of hardware health, resource … Read more