Huntington Bank: Redacting sensitive data from 400M+ documents with AWS

When your document repository contains hundreds of millions of files accumulated over nearly a decade, how do you systematically find and redact sensitive customer data without taking years to complete? This was the challenge facing The Huntington National Bank (Huntington), a top 10 bank in the United States. Since 2015, … Read more

AI-powered BI with Snowflake and Amazon Quick

One dashboard shows 42,000 active movie view counts while another shows 38,500. Your chat agent references a third number entirely. Data teams spend hours reconciling numbers instead of answering strategic questions, and trust in analytics erodes. This is a pattern that we see across many organizations. Teams spend more effort reconciling numbers than actually using … Read more

How Loka Built a Natural, Low-Latency Voice Agent with Amazon Nova 2 Sonic

Loka transformed customer voice interactions by building a conversational AI agent with Amazon Nova 2 Sonic that keeps customers engaged with natural, responsive experiences. Their AWS-based solution achieves high speech reasoning accuracy on Big Bench Audio while delivering significantly lower costs and faster response times than traditional voice AI pipelines. In this post, we demonstrate … Read more

Build a protein research copilot with Amazon Bedrock AgentCore

Protein researchers face a time-consuming challenge: manually searching through thousands of peptide sequences to find structurally similar candidates is slow, error-prone, and requires deep domain expertise to interpret results. Building a protein research copilot can transform how researchers search for structurally similar peptides across large datasets — enabling natural language queries, automated embedding generation, and … Read more

Shared infrastructure, isolated tenants: Pool model multi-tenancy with Amazon Bedrock AgentCore

Building multi-tenant AI applications presents new architectural challenges. You need complete tenant isolation between customers, different service tiers with different capabilities, granular cost tracking, and observability per tenant. Without these, you risk exposing customer data, failing to provide appropriate quality of service to your customers, or running up unforeseen costs. In this post, you will … Read more

Building pay-per-intelligence for AI agents: How Ampersend uses Amazon Bedrock AgentCore Payments

This post was co-written with Kevin Jones from Ampersend (Edge & Node) and Chethan Shriyan from the Amazon Bedrock AgentCore Payments team. Ampersend and Amazon Bedrock AgentCore Payments are addressing one of the hardest problems in agentic AI: how autonomous agents can pay for services without developers building bespoke billing integrations, credential management, and payment … Read more

Embed the world: Multimodal AI for searchable aerial imagery at scale

Turning a library of aerial imagery into a natural-language-searchable knowledge base is a problem that touches every industry that relies on geospatial data — insurance, real estate, government, infrastructure, and agriculture. The traditional path requires either manual tile-by-tile inspection or training a bespoke computer vision model for each new question. Multimodal embeddings, large language model … Read more

Running ComfyUI workflows on Amazon SageMaker AI processing jobs

With ComfyUI workflows on Amazon SageMaker AI processing jobs, you can automate content generation at scale. For enterprises, every delay or misstep in creating compelling multimedia assets can mean lost sales, faded brand relevance, or missed marketing deadlines. When a product launch deadline looms or a seasonal promotion needs urgent assets, waiting for designers to … Read more

Accelerate campaign workflow with insights from Adobe Marketing Agent for Amazon Quick

Amazon Quick and Adobe Marketing Agent help marketing teams access campaign insights within governed conversations in seconds. Marketers can ask questions about campaign performance, audiences, journeys, campaign conflicts, and content performance in natural language. Amazon Quick provides the chat experience and action orchestration. Adobe provides marketing-domain analysis to the approved data sources behind those questions. … Read more

Monitor and debug generative AI inference with SageMaker detailed metrics and Insights dashboard on CloudWatch

Monitoring and troubleshooting generative AI inference endpoints operating at scale is challenging. When your large language model (LLM) endpoint’s P99 latency spikes, you must determine in minutes whether the root cause is GPU memory pressure, a saturated KV cache, unbalanced traffic across Availability Zones, or an auto scaling policy that hasn’t triggered. The shift from … Read more

Amazon Bedrock AgentCore harness is now generally available: Go from idea to production-grade agent in minutes

A year ago, Simon Willison wrote one of the cleanest definitions of an agent that has stuck around: “An LLM agent runs tools in a loop to achieve a goal.” That definition stuck because it describes what every production agent actually does. Kiro, Amazon Q Developer, Quick Agents, Codex, Claude Code: under the hood, they … Read more

Amazon SageMaker AI Async Inference now supports inline request payloads

Today, we’re announcing inline payload support for Amazon SageMaker AI Async Inference. Customers can now send inference payloads directly in the request body of the InvokeEndpointAsync API, removing the need to upload input data to Amazon Simple Storage Service (Amazon S3) before each invocation. For payloads up to 128,000 bytes, this removes an entire network … Read more
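The size threshold in the excerpt above can be illustrated with a minimal sketch. It only decides whether a payload is small enough to be sent inline. The helper names `fits_inline` and `choose_delivery` are hypothetical, the limit is assumed to be inclusive, and the sketch deliberately makes no AWS API calls, since the excerpt does not show the request parameters.

```python
# Size limit quoted in the excerpt, in bytes. Treating it as inclusive
# ("up to 128,000 bytes") is an assumption.
INLINE_PAYLOAD_LIMIT_BYTES = 128_000


def fits_inline(payload: bytes, limit: int = INLINE_PAYLOAD_LIMIT_BYTES) -> bool:
    """Return True when the serialized payload is within the inline limit."""
    return len(payload) <= limit


def choose_delivery(payload: bytes) -> str:
    """Hypothetical routing helper: 'inline' for small payloads, 's3' otherwise."""
    return "inline" if fits_inline(payload) else "s3"
```

A caller could use `choose_delivery` to keep the existing Amazon S3 upload path as a fallback for larger inputs.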

Get back hours every day with autonomous agents in Amazon Quick

What if you came back from a full day of meetings and the busywork was already done? Stalled deals followed up on. Compliance changes summarized. Meeting prep written. Not because you multi-tasked, but because something was working in the background while you focused on other urgent priorities. Teams are already using Amazon Quick — an AI assistant that connects to your most-used apps … Read more

Context intelligence for your data and AI agents at scale

Agents are only as intelligent as the context they can reason over. Today, that context is scattered across data lakes, data warehouses, lakehouses, databases, and streams, and in institutional knowledge that has never been written down. You want to trust the decisions made by your AI agents, but that can’t happen until agents have context. Imagine what becomes possible … Read more

New in Amazon Bedrock AgentCore: Build agents with broader knowledge and continuous learning

The models powering today’s agents are remarkably capable. They can reason across complex problems, plan multi-step workflows, and generate nuanced responses. But most agents are operating well below that potential. The gap isn’t intelligence. It’s access to the right context and feedback. A customer service agent tasked with answering a question about your company’s refund … Read more

Safeguard your agentic AI applications with the Amazon Bedrock Guardrails InvokeGuardrailChecks API

Today, we’re announcing a new API with Amazon Bedrock Guardrails. With this API, you can apply individual safeguards, also referred to as safety checks, at any point in your agentic AI applications without creating guardrail resources. The new InvokeGuardrailChecks API gives you the flexibility to invoke supported safeguards at any turn in the agentic loop … Read more

Introducing container caching in Amazon SageMaker AI for faster model scaling

Today, we’re excited to announce container image caching for Amazon SageMaker AI inference, the next major advancement in our faster scaling optimization journey. This improves end-to-end latency by up to 2x for generative AI models during scale-out events. Over the years, Amazon SageMaker AI has continued to reduce latency across these scaling stages: detecting … Read more

Parallelize speculative decoding with P-EAGLE on Amazon SageMaker AI

As large language models (LLMs) grow in size and complexity, maximizing inference throughput while minimizing latency remains a critical challenge for enterprise production deployments. Speculative decoding is one effective strategy to address this: a lightweight draft model guesses future tokens, which are then verified by the target LLM in a single forward pass. While state-of-the-art frameworks like Extrapolation Algorithm for Greater … Read more
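The draft-then-verify idea described above can be sketched with toy models. This is a simplified greedy variant of plain speculative decoding, not P-EAGLE and not any SageMaker API. Every name here (`speculative_step`, `generate`, `toy_target`, `toy_draft`) is hypothetical, and each "model" is just a function that maps a token list to the next token. A real target model would score all drafted positions in one batched forward pass; the loop below only simulates that check.

```python
from typing import Callable, List

# A "model" here is any function mapping a token context to the next token.
NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken,
                     target: NextToken, k: int) -> List[int]:
    """Draft k tokens, keep the prefix the target agrees with, add one target token."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        token = draft(ctx)
        proposed.append(token)
        ctx.append(token)

    # 2. The target checks each proposal (simulated position by position here).
    accepted: List[int] = []
    ctx = list(prefix)
    for token in proposed:
        expected = target(ctx)
        if expected != token:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3. All drafts accepted: the target contributes one extra token.
    accepted.append(target(ctx))
    return accepted


def generate(prefix: List[int], draft: NextToken, target: NextToken,
             k: int, max_new_tokens: int) -> List[int]:
    """Repeat speculative steps until max_new_tokens have been produced."""
    out = list(prefix)
    while len(out) - len(prefix) < max_new_tokens:
        out.extend(speculative_step(out, draft, target, k))
    return out[: len(prefix) + max_new_tokens]


def toy_target(ctx: List[int]) -> int:
    """Toy target model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    """Toy draft model: agrees with the target except after token 4."""
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

Because every accepted token is either confirmed or supplied by the target, the greedy output matches what the target alone would produce; the draft only changes how many target checks are needed per generated token.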

Introducing Gemma 4 models on Amazon Bedrock

Today, we are announcing the availability of the Gemma 4 family on Amazon Bedrock. Built by Google DeepMind and released under the Apache 2.0 license, Gemma 4 is a family of open-weight models designed with a focus on intelligence-per-parameter across a broad range of deployment scenarios. The family includes three instruction-tuned variants: Gemma 4 31B, … Read more

Build context-rich research agents with Deep Agents and Bedrock AgentCore

A common challenge in AI-powered research workflows is depth versus context. If your agent reads ten web pages, its context window (the amount of text a large language model (LLM) can process at once) gets filled with raw content. If it also runs data analysis code, chart-generation logic competes with strategic reasoning for limited space. … Read more

Building Supercharger: How Rocket Close optimized title operations with agentic AI

Rocket Close is a Detroit-based title agency and appraisal management company within Rocket Companies that provides title insurance, property valuation, and settlement services. As demand for mortgages and loans grew, title operations became a bottleneck in the homebuying process. Time-intensive, state-specific title examinations, combined with manual research and fragmented systems, slowed throughput and made it … Read more

Build a meeting prep and follow-up assistant with Amazon Quick and Cisco Webex MCP servers

Amazon Quick and Cisco Webex MCP servers can turn meeting prep and follow-up into a single conversational workflow. Instead of switching between Webex meetings, Vidcast videos, transcripts, recordings, and message spaces, users ask one assistant to gather the context they need. This post shows how to build a custom meeting prep and follow-up assistant using … Read more

From PDFs to insights: Architecting an intelligent document processing pipeline with AWS generative AI services

Organizations process millions of documents daily, from insurance claims and invoices to legal contracts and medical records. While traditional optical character recognition (OCR) solutions extract text, they can’t understand context, relationships, or meaning embedded within complex documents. This limitation creates bottlenecks that require manual intervention, increasing processing time and costs while introducing potential errors. Amazon … Read more

Built from the inside out: How AWS Professional Services became a frontier team first

AWS Professional Services (AWS ProServe) compressed engagement timelines from months to days, not by adding artificial intelligence (AI) tools to an existing process, but by fundamentally rebuilding how we deliver from the inside out. The shift mirrors what my colleague Swami Sivasubramanian outlined in How Frontier Teams Are Reinventing AI-Native Development: real productivity gains come … Read more

Extract Data with On-demand and Batch Pipelines Dynamically

Many companies have large volumes of paper or electronic documents that contain untapped business intelligence. With the advancement of generative AI, various large language models can be used to accurately extract relevant data from these documents. This post demonstrates an intelligent document processing pipeline that consists of both on-demand inference and batch inference options on … Read more

Evaluate AI agents systematically with Agent-EvalKit

Teams building AI agents typically evaluate them the way they evaluate any other software: by checking whether the output matches expectations. But agents that autonomously choose tools and sequence operations across multiple sources produce behavior that output-level testing cannot fully characterize. An agent might deliver a well-structured, actionable response while hallucinating, fabricating facts because its … Read more

Spot trends faster, sort smarter: Unlocking Sparklines and Custom Sort in Amazon Quick

Amazon Quick Sight, the business intelligence capability of Amazon Quick, delivers a unified BI experience, from modern interactive dashboards and natural language querying to pixel-perfect reports, machine learning insights, and embedded analytics at scale. Amazon Quick brings together AI-powered agents for business insights, research, and automation in one integrated experience, helping teams work smarter and … Read more