Building a voice-driven AWS assistant with Amazon Nova Sonic

As cloud infrastructure becomes increasingly complex, the need for intuitive and efficient management interfaces has never been greater. Traditional command-line interfaces (CLI) and web consoles, while powerful, can create barriers to quick decision-making and operational efficiency. What if you could speak to your AWS infrastructure and get immediate, intelligent responses? In this post, we explore … Read more

How Harmonic Security improved their data-leakage detection system with low-latency fine-tuned models using Amazon SageMaker, Amazon Bedrock, and Amazon Nova Pro

This post was written with Bryan Woolgar-O’Neil, Jamie Cockrill and Adrian Cunliffe from Harmonic Security. Organizations face increasing challenges protecting sensitive data while supporting third-party generative AI tools. Harmonic Security, a cybersecurity company, developed an AI governance and control layer that spots sensitive data in line as employees use AI, giving security teams the power … Read more

How Swisscom builds enterprise agentic AI for customer support and sales using Amazon Bedrock AgentCore

This post was written with Arun Sittampalam and Maxime Darcot from Swisscom. As we navigate the constantly shifting AI ecosystem, enterprises face challenges in translating AI’s potential into scalable, production-ready solutions. Swisscom, Switzerland’s leading telecommunications provider, with an estimated $19B revenue (2025) and over $37B market capitalization as of June 2025, exemplifies how organizations can … Read more

Scaling MLflow for enterprise AI: What’s New in SageMaker AI with MLflow

Today we’re announcing Amazon SageMaker AI with MLflow, now including a serverless capability that dynamically manages infrastructure provisioning, scaling, and operations for artificial intelligence and machine learning (AI/ML) development tasks. It scales resources up during intensive experimentation and down to zero when not in use, reducing operational overhead. It introduces enterprise-scale features including seamless access … Read more

Amazon Bedrock AgentCore Observability with Langfuse

The rise of artificial intelligence (AI) agents marks a change in software development and how applications make decisions and interact with users. While traditional systems follow predictable paths, AI agents engage in complex reasoning that remains hidden from view. This invisibility creates a challenge for organizations: how can they trust what they can’t see? This … Read more

Implement automated smoke testing using Amazon Nova Act headless mode

Automated smoke testing using Amazon Nova Act headless mode helps development teams validate core functionality in continuous integration and continuous delivery (CI/CD) pipelines. Development teams often deploy code several times daily, so fast testing helps maintain application quality. Traditional end-to-end testing can take hours to complete, creating delays in your CI/CD pipeline. Smoke testing is … Read more

Real-world reasoning: How Amazon Nova Lite 2.0 handles complex customer support scenarios

Artificial intelligence (AI) reasoning capabilities determine whether models can handle complex, real-world tasks beyond simple pattern matching. With strong reasoning, models can identify problems from ambiguous descriptions, apply policies under competing constraints, adapt tone to sensitive situations, and provide complete solutions that address root causes. Without robust reasoning, AI systems fail when faced with nuanced … Read more

Create AI-powered chat assistants for your enterprise with Amazon Quick Suite

Teams need instant access to enterprise data and intelligent guidance on how to use it. Instead, they get scattered information across multiple systems. This results in employees spending valuable time searching for answers instead of making decisions. In this post, we show how to build chat agents in Amazon Quick Suite to address this problem. … Read more

How AWS delivers generative AI to the public sector in weeks, not years

When critical services depend on quick action, from the safety of vulnerable children to environmental protection, you need working AI solutions in weeks, not years. Amazon recently announced an investment of up to $50 billion in expanded AI and supercomputing infrastructure for US government agencies, demonstrating both the urgency and commitment from Amazon Web Services … Read more

S&P Global Data integration expands Amazon Quick Research capabilities

Today, we are pleased to announce a new integration between Amazon Quick Research and S&P Global. This integration brings both S&P Global Energy news, research, and insights and S&P Global Market Intelligence data to Quick Research customers in one deep research agent. The S&P Global integration extends the capabilities of Quick Research so that business … Read more

Streamline AI agent tool interactions: Connect API Gateway to AgentCore Gateway with MCP

AgentCore Gateway now supports API Gateway. As organizations explore the possibilities of agentic applications, they continue to navigate challenges of using enterprise data as context in invocation requests to large language models (LLMs) in a manner that is secure and aligned with enterprise policies. To help standardize and secure those interactions, many organizations are using the … Read more

Create an intelligent insurance underwriter agent powered by Amazon Nova 2 Lite and Amazon Quick Suite

Insurance underwriting requires analyzing multiple data sources, evaluating risks, and making decisions that meet regulatory requirements. Underwriters face three core challenges: siloed data scattered across Customer Relationship Management (CRM) systems, document repositories, and transactional databases; regulatory requirements for explainable, auditable decisions that traditional black box AI can’t satisfy; and the need for consistent, automated underwriting … Read more

How Myriad Genetics achieved fast, accurate, and cost-efficient document processing using the AWS open-source Generative AI Intelligent Document Processing Accelerator

This post was written with Martyna Shallenberg and Brode Mccrady from Myriad Genetics. Healthcare organizations face challenges in processing and managing high volumes of complex medical documentation while maintaining quality in patient care. These organizations need solutions to process documents effectively to meet growing demands. Myriad Genetics, a provider of genetic testing and precision medicine solutions … Read more

How CBRE powers unified property management search and digital assistant using Amazon Bedrock

This post was written with Lokesha Thimmegowda, Muppirala Venkata Krishna Kumar, and Maraka Vishwadev of CBRE. CBRE is the world’s largest commercial real estate services and investment firm. The company serves clients in more than 100 countries and offers services ranging from capital markets and leasing advisory to investment management, project management and facilities management. … Read more

Managed Tiered KV Cache and Intelligent Routing for Amazon SageMaker HyperPod

Modern AI applications demand fast, cost-effective responses from large language models, especially when handling long documents or extended conversations. However, LLM inference can become prohibitively slow and expensive as context length increases, with latency growing exponentially and costs mounting with each interaction. LLM inference requires recalculating attention mechanisms for the previous tokens when generating each … Read more

Apply fine-grained access control with Bedrock AgentCore Gateway interceptors

As enterprises rapidly adopt AI agents to automate workflows and enhance productivity, they face a critical scaling challenge: managing secure access to thousands of tools across their organization. Modern AI deployments no longer involve a handful of agents calling a few APIs—instead, enterprises are building unified AI platforms where hundreds of agents, consumer AI applications, … Read more

How Condé Nast accelerated contract processing and rights analysis with Amazon Bedrock

This post is co-written with Bob Boiko, Christopher Donnellan, and Sarat Tatavarthi from Condé Nast. For over a century, Condé Nast has stood at the forefront of global media, shaping culture and conversation through its prestigious portfolio of brands. Founded in 1909, the company has evolved from a traditional publisher into a modern media powerhouse. … Read more

Building AI-Powered Voice Applications: Amazon Nova Sonic Telephony Integration Guide

Organizations are increasingly seeking to enhance customer experiences through natural, responsive voice interactions across their telephony systems. Amazon Nova Sonic addresses this need as a speech-to-speech generative AI model that delivers real-time voice conversations with low latency and natural turn-taking. It understands speech across different accents and speaking styles, responds with expressive voices in multiple … Read more

University of California Los Angeles delivers an immersive theater experience with AWS generative AI services

This post was co-written with Andrew Browning, Anthony Doolan, Jerome Ronquillo, Jeff Burke, Chiheb Boussema, and Naisha Agarwal from UCLA. The University of California, Los Angeles (UCLA) is home to 16 Nobel Laureates and has been ranked the #1 public university in the United States for 8 consecutive years. The Office of Advanced Research Computing … Read more

Optimizing Mobileye’s REM™ with AWS Graviton: A focus on ML inference and Triton integration

This post is written by Chaim Rand, Principal Engineer, Pini Reisman, Software Senior Principal Engineer, and Eliyah Weinberg, Performance and Technology Innovation Engineer, at Mobileye. The Mobileye team would like to thank Sunita Nadampalli and Guy Almog from AWS for their contributions to this solution and this post. Mobileye is driving the global evolution toward … Read more

Evaluate models with the Amazon Nova evaluation container using Amazon SageMaker AI

This blog post introduces the new Amazon Nova model evaluation features in Amazon SageMaker AI. This release adds custom metrics support, LLM-based preference testing, log probability capture, metadata analysis, and multi-node scaling for large evaluations. The new features include: Custom metrics use the bring your own metrics (BYOM) functions to control evaluation criteria for your … Read more

Beyond the technology: Workforce changes for AI

Workplaces are increasingly integrating AI tools into daily operations, with AI assistants supporting teams, predictive analytics informing strategies, and automation streamlining workflows. AI has moved from experimental technology to standard business practice, changing how work gets done. Organizations need to understand what AI can do and how it affects their workforce to implement it successfully. … Read more

Enhanced performance for Amazon Bedrock Custom Model Import

You can now achieve significant performance improvements when using Amazon Bedrock Custom Model Import, with reduced end-to-end latency, faster time-to-first-token, and improved throughput through advanced PyTorch compilation and CUDA graph optimizations. With Amazon Bedrock Custom Model Import, you can bring your own foundation models to Amazon Bedrock for deployment and inference at scale. These … Read more

Amazon SageMaker AI introduces EAGLE based adaptive speculative decoding to accelerate generative AI inference

Generative AI models continue to expand in scale and capability, increasing the demand for faster and more efficient inference. Applications need low latency and consistent performance without compromising output quality. Amazon SageMaker AI introduces new enhancements to its inference optimization toolkit that bring EAGLE based adaptive speculative decoding to more model architectures. These updates make … Read more

Train custom computer vision defect detection model using Amazon SageMaker

On October 10, 2024, Amazon announced the discontinuation of the Amazon Lookout for Vision service, with a scheduled shutdown date of October 31, 2025 (see the blog post Exploring alternatives and seamlessly migrating data from Amazon Lookout for Vision). As part of our transition guidance for customers, we recommend the use of Amazon SageMaker AI tools … Read more

Practical implementation considerations to close the AI value gap

Artificial Intelligence (AI) is changing how businesses operate. Gartner® predicts at least 15% of day-to-day work decisions will be made autonomously through agentic AI by 2028. And 92% of companies are boosting their AI spending, according to McKinsey. But here’s the problem: most companies are yet to realize a positive impact of AI on their … Read more

Introducing bidirectional streaming for real-time inference on Amazon SageMaker AI

In 2025, generative AI has evolved from text generation to multi-modal use cases ranging from audio transcription and translation to voice agents that require real-time data streaming. Today’s applications demand something more: continuous, real-time dialogue between users and models—the ability for data to flow both ways, simultaneously, over a single persistent connection. Imagine a speech … Read more

Warner Bros. Discovery achieves 60% cost savings and faster ML inference with AWS Graviton

This post is written by Nukul Sharma, Machine Learning Engineering Manager, and Karthik Dasani, Staff Machine Learning Engineer, at Warner Bros. Discovery. Warner Bros. Discovery (WBD) is a leading global media and entertainment company that creates and distributes the world’s most differentiated and complete portfolio of content and brands across television, film and streaming. With iconic … Read more

Physical AI in practice: Technical foundations that fuel human-machine interactions

In our previous post, Transforming the physical world with AI: the next frontier in intelligent automation, we explored how the field of physical AI is redefining a wide range of industries including construction, manufacturing, healthcare, and agriculture. Now, we turn our attention to the complete development lifecycle behind this technology – the process of creating intelligent … Read more

HyperPod now supports Multi-Instance GPU to maximize GPU utilization for generative AI tasks

We are excited to announce the general availability of GPU partitioning with Amazon SageMaker HyperPod, using NVIDIA Multi-Instance GPU (MIG). With this capability you can run multiple tasks concurrently on a single GPU, minimizing wasted compute and memory resources that result from dedicating entire hardware (for example, entire GPUs) to tasks that can under-utilize the resources. By … Read more