From developer desks to the whole organization: Running Claude Cowork in Amazon Bedrock

Today, we’re excited to announce Claude Cowork in Amazon Bedrock. You can now run Cowork and Claude Code Desktop through Amazon Bedrock, directly or using an LLM gateway. From startups to global enterprises across every industry, organizations build with Claude Code in Amazon Bedrock to boost developer productivity and accelerate delivery. With Amazon Bedrock you … Read more

End-to-end lineage with DVC and Amazon SageMaker AI MLflow apps

Production machine learning (ML) teams struggle to trace the full lineage of a model through the data and the code that trained it, the exact dataset version it consumed, and the experiment metrics that justified its deployment. Without this traceability, questions like “which data trained the model currently in production?” or “can we reproduce the … Read more

Accelerate Generative AI Inference on Amazon SageMaker AI with G7e Instances

As the demand for generative AI continues to grow, developers and enterprises seek more flexible, cost-effective, and powerful accelerators to meet their needs. Today, we are thrilled to announce the availability of G7e instances powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on Amazon SageMaker AI. You can provision nodes with 1, 2, … Read more

ToolSimulator: scalable tool testing for AI agents

You can use ToolSimulator, an LLM-powered tool simulation framework within Strands Evals, to thoroughly and safely test AI agents that rely on external tools, at scale. Instead of risking live API calls that expose personally identifiable information (PII), trigger unintended actions, or settling for static mocks that break with multi-turn workflows, you can use ToolSimulator’s large … Read more

Omnichannel ordering with Amazon Bedrock AgentCore and Amazon Nova 2 Sonic

Introduction Building a voice-enabled ordering system that works across mobile apps, websites, and voice interfaces (an omnichannel approach) presents real challenges. You need to process bidirectional audio streams, maintain conversation context across multiple turns, integrate backend services without tight coupling, and scale to handle peak traffic. In this post, we’ll show you how to build … Read more

Introducing granular cost attribution for Amazon Bedrock

As AI inference grows into a significant share of cloud spend, understanding who and what are driving costs is essential for chargebacks, cost optimization, and financial planning. Today, we’re announcing granular cost attribution for Amazon Bedrock inference. Amazon Bedrock now automatically attributes inference costs to the IAM principal that made the call. An IAM principal … Read more

Optimize video semantic search intent with Amazon Nova Model Distillation on Amazon Bedrock

Optimizing models for video semantic search requires balancing accuracy, cost, and latency. Faster, smaller models lack routing intelligence, while larger, accurate models add significant latency overhead. In Part 1 of this series, we showed how to build a multimodal video semantic search system on AWS with intelligent intent routing using the Anthropic Claude Haiku model … Read more

Power video semantic search with Amazon Nova Multimodal Embeddings

Video semantic search is unlocking new value across industries. The demand for video-first experiences is reshaping how organizations deliver content, and customers expect fast, accurate access to specific moments within video. For example, sports broadcasters need to surface the exact moment a player scored to deliver highlight clips to fans instantly. Studios need to find … Read more

Nova Forge SDK series part 2: Practical guide to fine-tune Nova models using data mixing capabilities

This hands-on guide walks through every step of fine-tuning an Amazon Nova model with the Amazon Nova Forge SDK, from data preparation to training with data mixing to evaluation, giving you a repeatable playbook you can adapt to your own use case. This is the second part in our Nova Forge SDK series, building on … Read more

From hours to minutes: How Agentic AI gave marketers time back for what matters

Your marketing team loses hours to page assembly, coordination emails, and review cycles. These manual workflows keep teams from their most important work: identifying what problems customers face, crafting messages that resonate, and building campaigns that drive meaningful engagement. In this post, we share how AWS Marketing’s Technology, AI, and Analytics (TAA) team worked with … Read more

Cost-efficient custom text-to-SQL using Amazon Nova Micro and Amazon Bedrock on-demand inference

Text-to-SQL generation remains a persistent challenge in enterprise AI applications, particularly when working with custom SQL dialects or domain-specific database schemas. While foundation models (FMs) demonstrate strong performance on standard SQL, achieving production-grade accuracy for specialized dialects requires fine-tuning. However, fine-tuning introduces an operational trade-off: hosting custom models on persistent infrastructure incurs continuous costs, even during … Read more

Transform retail with AWS generative AI services

Online retailers face a persistent challenge: shoppers struggle to determine the fit and look when ordering online, leading to increased returns and decreased purchase confidence. The cost? Lost revenue, operational overhead, and customer frustration. Meanwhile, consumers increasingly expect immersive, interactive shopping experiences that bridge the gap between online and in-store retail. Retailers implementing virtual try-on … Read more

How Automated Reasoning checks in Amazon Bedrock transform generative AI compliance

Compliance teams in regulated industries spend weeks on manual reviews, pay for outside consultants, and still face audit gaps when AI outputs lack formal proof. Automated Reasoning checks in Amazon Bedrock Guardrails address this by replacing probabilistic AI validation with mathematical verification, turning AI-generated decisions into provably correct, auditable results. In this post, you’ll learn … Read more

Create rich, custom tooltips in Amazon Quick Sight

Amazon Quick Sight, the business intelligence (BI) capability of Amazon Quick, is a unified BI service. It provides modern interactive dashboards, natural language querying, pixel-perfect reports, machine learning (ML) insights, and embedded analytics at scale. Amazon Quick brings together AI agents for business insights, research, and automation in one integrated experience, helping you work smarter … Read more

Accelerating decode-heavy LLM inference with speculative decoding on AWS Trainium and vLLM

Practical benchmarks showing faster inter-token latency when deploying Qwen3 models with vLLM, Kubernetes, and AWS AI Chips. Speculative decoding on AWS Trainium can accelerate token generation by up to 3x for decode-heavy workloads, helping reduce the cost per output token and improving throughput without sacrificing output quality. If you build AI writing assistants, coding agents, … Read more

Rede Mater Dei de Saúde: Monitoring AI agents in the revenue cycle with Amazon Bedrock AgentCore

This post is cowritten by Renata Salvador Grande, Gabriel Bueno and Paulo Laurentys at Rede Mater Dei de Saúde. The growing adoption of multi-agent AI systems is redefining critical operations in healthcare. In large hospital networks, where thousands of decisions directly impact cash flow, service delivery times, and the risk of claim denials, the ability … Read more

Navigating the generative AI journey: The Path-to-Value framework from AWS

Generative AI is reshaping how organizations approach productivity, customer experiences, and operational capabilities. Across industries, teams are experimenting with generative AI to unlock new ways of working. Many of these efforts produce compelling proofs of concept (POC) that demonstrate technical feasibility. The real challenge begins after those early wins. Although POCs frequently demonstrate technical feasibility, … Read more

Use-case based deployments on SageMaker JumpStart

Amazon SageMaker JumpStart provides pretrained models for a wide range of problem types to help you get started with AI workloads. SageMaker JumpStart offers access to solutions for top use cases that can be deployed to SageMaker AI Managed Inference endpoints or SageMaker HyperPod clusters. Through pre-set deployment options, customers can quickly move from model … Read more

Best practices to run inference on Amazon SageMaker HyperPod

Deploying and scaling foundation models for generative AI inference presents challenges for organizations. Teams often struggle with complex infrastructure setup, unpredictable traffic patterns that lead to over-provisioning or performance bottlenecks, and the operational overhead of managing GPU resources efficiently. These pain points result in delayed time-to-market, suboptimal model performance, and inflated costs that can make … Read more

How Guidesly built AI-generated trip reports for outdoor guides on AWS

This is guest post by David Lord, Taylor Lord, Shiva Prasad, Anup Banasavalli Hiriyanagowda, Nikhil Chandra from Guidesly. Guidesly is reshaping how outdoor recreation is booked, run, and experienced. Founded in 2019, it began as a way to connect anglers, hunters, divers, and outdoor recreation enthusiasts with trusted guides, dive shops, and charters. It has … Read more

Spring AI SDK for Amazon Bedrock AgentCore is now Generally Available

Agentic AI is transforming how organizations use generative AI, moving beyond prompt-response interactions to autonomous systems that can plan, execute, and complete complex multi-step tasks. While early proof of concepts in Agentic AI spaces excite business stakeholders, scaling them to production requires addressing scalability, governance, and security challenges. Amazon Bedrock AgentCore is an Agentic AI … Read more

How to build effective reward functions with AWS Lambda for Amazon Nova model customization

Building effective reward functions can help you customize Amazon Nova models to your specific needs, with AWS Lambda providing the scalable, cost-effective foundation. Lambda’s serverless architecture lets you focus on defining quality criteria while it handles the computational infrastructure. Amazon Nova offers multiple customization approaches, with Reinforcement fine-tuning (RFT) standing out for its ability to teach … Read more

Understanding Amazon Bedrock model lifecycle

Amazon Bedrock regularly releases new foundation model (FM) versions with better capabilities, accuracy, and safety. Understanding the model lifecycle is essential for effective planning and management of AI applications built on Amazon Bedrock. Before migrating your applications, you can test these models through the Amazon Bedrock console or API to evaluate their performance and compatibility. … Read more

The future of managing agents at scale: AWS Agent Registry now in preview

Now available through Amazon Bedrock AgentCore, use AWS Agent Registry to discover, share, and reuse agents, tools, and agent skills across your organization. As enterprises scale to hundreds or thousands of agents, platform teams face three critical challenges: visibility (knowing what agents exist across the organization), control (governing who can publish and what becomes discoverable … Read more

Embed a live AI browser agent in your React app with Amazon Bedrock AgentCore

When you build AI-powered applications, your users must understand and trust AI agents that navigate websites and interact with web content on their behalf. When an agent interacts with web content autonomously, your users require visibility into those actions to maintain confidence and control, which they don’t currently have. The Amazon Bedrock AgentCore Browser BrowserLiveView … Read more

Introducing stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime

Stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime now enable interactive, multi-turn agent workflows that were previously impossible with stateless implementations. Developers building AI agents often struggle when their workflows must pause mid-execution to ask users for clarification, request large language model (LLM)-generated content, or provide real-time progress updates during long-running operations, stateless MCP … Read more

Customize Amazon Nova models with Amazon Bedrock fine-tuning

Today, we’re sharing how Amazon Bedrock makes it straightforward to customize Amazon Nova models for your specific business needs. As customers scale their AI deployments, they need models that reflect proprietary knowledge and workflows — whether that means maintaining a consistent brand voice in customer communications, handling complex industry-specific workflows or accurately classifying intents in … Read more

Human-in-the-loop constructs for agentic workflows in healthcare and life sciences

In healthcare and life sciences, AI agents help organizations process clinical data, submit regulatory filings, automate medical coding, and accelerate drug development and commercialization. However, the sensitive nature of healthcare data and regulatory requirements like Good Practice (GxP) compliance require human oversight at key decision points. This is where human-in-the-loop (HITL) constructs become essential. In … Read more

Building intelligent audio search with Amazon Nova Embeddings: A deep dive into semantic audio understanding

If you’re looking to enhance your content understanding and search capabilities, audio embeddings offer a powerful solution. In this post, you’ll learn how to use Amazon Nova Multimodal Embeddings to transform your audio content to searchable, intelligent data that captures acoustic features like tone, emotion, musical characteristics, and environmental sounds. Finding specific content in these … Read more

Reinforcement fine-tuning on Amazon Bedrock: Best practices

You can use reinforcement Fine-Tuning (RFT) in Amazon Bedrock to customize Amazon Nova and supported open source models by defining what “good” looks like—no large labeled datasets required. By learning from reward signals rather than static examples, RFT delivers up to 66% accuracy gains over base models at reduced customization cost and complexity. This post … Read more