Build an intelligent photo search using Amazon Rekognition, Amazon Neptune, and Amazon Bedrock

Managing large photo collections presents significant challenges for organizations and individuals. Traditional approaches rely on manual tagging, basic metadata, and folder-based organization, which can become impractical when dealing with thousands of images containing multiple people and complex relationships. Intelligent photo search systems address these challenges by combining computer vision, graph databases, and natural language processing … Read more

Train CodeFu-7B with veRL and Ray on Amazon SageMaker Training jobs

The rapid advancement of artificial intelligence (AI) has created unprecedented demand for specialized models capable of complex reasoning tasks, particularly in competitive programming where models must generate functional code through algorithmic reasoning rather than pattern memorization. Reinforcement learning (RL) enables models to learn through trial and error by receiving rewards based on actual code execution, … Read more

Generate structured output from LLMs with Dottxt Outlines in AWS

This post is cowritten with Remi Louf, CEO and technical founder of Dottxt. Structured output in AI applications refers to AI-generated responses conforming to formats that are predefined, validated, and often strictly entered. This can include the schema for the output, or ways specific fields in the output should be mapped. Structured outputs are essential … Read more

Global cross-Region inference for latest Anthropic Claude Opus, Sonnet and Haiku models on Amazon Bedrock in Thailand, Malaysia, Singapore, Indonesia, and Taiwan

Organizations in Thailand, Malaysia, Singapore, Indonesia, and Taiwan can now access Anthropic Claude Opus 4.6, Sonnet 4.6, and Claude Haiku 4.5 through Global cross-Region inference (CRIS) on Amazon Bedrock, delivering foundation models through a globally distributed inference architecture designed for scale. Global CRIS offers three key advantages: higher quotas, cost efficiency, and intelligent request routing … Read more

Introducing Amazon Bedrock global cross-Region inference for Anthropic’s Claude models in the Middle East Regions (UAE and Bahrain)

We’re excited to announce the availability of Anthropic’s Claude Opus 4.6, Claude Sonnet 4.6, Claude Opus 4.5, Claude Sonnet 4.5, and Claude Haiku 4.5 through Amazon Bedrock global cross-Region inference for customers operating in the Middle East. This launch helps organizations in the Middle East access Anthropic’s latest Claude models on Amazon Bedrock while … Read more

Scaling data annotation using vision-language models to power physical AI systems

Critical labor shortages are constraining growth across manufacturing, logistics, construction, and agriculture. The problem is particularly acute in construction: nearly 500,000 positions remain unfilled in the United States, with 40% of the current workforce approaching retirement within the decade. These workforce limitations result in delayed projects, escalating costs, and deferred development plans. To address these … Read more

How Sonrai uses Amazon SageMaker AI to accelerate precision medicine trials

In precision medicine, researchers developing diagnostic tests for early disease detection face a critical challenge: datasets containing thousands of potential biomarkers but only hundreds of patient samples. This curse of dimensionality can determine the success or failure of breakthrough discoveries. Modern bioinformatics use multiple omic modalities—genomics, lipidomics, proteomics, and metabolomics—to develop early disease detection tests. … Read more

Accelerating AI model production at Hexagon with Amazon SageMaker HyperPod

This blog post was co-authored with Johannes Maunz, Tobias Bösch Borgards, Aleksander Cisłak, and Bartłomiej Gralewicz from Hexagon. Hexagon is the global leader in measurement technologies and provides the confidence that vital industries rely on to build, navigate, and innovate. From microns to Mars, Hexagon’s solutions drive productivity, quality, safety, and sustainability across aerospace, agriculture, … Read more

Agentic AI with multi-model framework using Hugging Face smolagents on AWS

This post is cowritten by Jeff Boudier, Simon Pagezy, and Florent Gbelidji from Hugging Face. Agentic AI systems represent an evolution from conversational AI to autonomous agents capable of complex reasoning, tool usage, and code execution. Enterprise applications benefit from strategic deployment approaches tailored to specific needs. These needs include managed endpoints, which deliver auto-scaling capabilities, foundation … Read more

Amazon SageMaker AI in 2025, a year in review part 1: Flexible Training Plans and improvements to price performance for inference workloads

In 2025, Amazon SageMaker AI saw dramatic improvements to core infrastructure offerings along four dimensions: capacity, price performance, observability, and usability. In this series of posts, we discuss these various improvements and their benefits. In Part 1, we discuss capacity improvements with the launch of Flexible Training Plans. We also describe improvements to price performance … Read more

Amazon SageMaker AI in 2025, a year in review part 2: Improved observability and enhanced features for SageMaker AI model customization and hosting

In 2025, Amazon SageMaker AI made several improvements designed to help you train, tune, and host generative AI workloads. In Part 1 of this series, we discussed Flexible Training Plans and price performance improvements made to inference components. In this post, we discuss enhancements made to observability, model customization, and model hosting. These improvements facilitate … Read more

Integrate external tools with Amazon Quick Agents using Model Context Protocol (MCP)

Amazon Quick supports Model Context Protocol (MCP) integrations for action execution, data access, and AI agent integration. You can expose your application’s capabilities as MCP tools by hosting your own MCP server and configuring an MCP integration in Amazon Quick. Amazon Quick acts as an MCP client and connects to your MCP server endpoint to access … Read more

Build AI workflows on Amazon EKS with Union.ai and Flyte

As artificial intelligence and machine learning (AI/ML) workflows grow in scale and complexity, it becomes harder for practitioners to organize and deploy their models. AI projects often struggle to move from pilot to production. They often fail not because models are bad, but because infrastructure and processes are fragmented and brittle, and the original … Read more

Amazon Quick Suite now supports key pair authentication to Snowflake data source

Modern enterprises face significant challenges connecting business intelligence platforms to cloud data warehouses while maintaining automation. Password-based authentication introduces security vulnerabilities, operational friction, and compliance gaps, which is especially critical as Snowflake is deprecating username and password authentication. Amazon Quick Sight (a capability of Amazon Quick Suite) now supports key pair authentication for Snowflake integrations, using asymmetric cryptography where RSA … Read more

Build unified intelligence with Amazon Bedrock AgentCore

Building cohesive and unified customer intelligence across your organization starts with reducing the friction your sales representatives face when toggling between Salesforce, support tickets, and Amazon Redshift. A sales representative preparing for a customer meeting might spend hours clicking through several different dashboards (product recommendations, engagement metrics, revenue analytics, and so on) before developing a complete picture … Read more

Evaluating AI agents: Real-world lessons from building agentic systems at Amazon

The generative AI industry has undergone a significant transformation from using large language model (LLM)-driven applications to agentic AI systems, marking a fundamental shift in how AI capabilities are architected and deployed. While early generative AI applications primarily relied on LLMs to directly generate text and respond to prompts, the industry has evolved from those … Read more

Supercharge regulated workloads with Claude Code and Amazon Bedrock

The release of Anthropic Claude Sonnet 4.5 in the AWS GovCloud (US) Region introduces a straightforward on-ramp for AI-assisted development for workloads with regulatory compliance requirements. In this post, we explore how to combine Claude Sonnet 4.5 on Amazon Bedrock in AWS GovCloud (US) with Claude Code, an agentic coding assistant released by Anthropic. This … Read more

Customize AI agent browsing with proxies, profiles, and extensions in Amazon Bedrock AgentCore Browser

AI agents that browse the web need more than basic page navigation. Our customers tell us they need agents that maintain session state across interactions, route traffic through corporate proxy infrastructure, and run with custom browser configurations. AgentCore Browser provides a secure, isolated browser environment for your agents to interact with web applications. Until now, … Read more

AI meets HR: Transforming talent acquisition with Amazon Bedrock

Organizations face significant challenges in making their recruitment processes more efficient while maintaining fair hiring practices. By using AI to transform their recruitment and talent acquisition processes, organizations can overcome these challenges. AWS offers a suite of AI services that can be used to significantly enhance the efficiency, effectiveness, and fairness of hiring practices. With … Read more

Build long-running MCP servers on Amazon Bedrock AgentCore with Strands Agents integration

AI agents are rapidly evolving from mere chat interfaces into sophisticated autonomous workers that handle complex, time-intensive tasks. As organizations deploy agents to train machine learning (ML) models, process large datasets, and run extended simulations, the Model Context Protocol (MCP) has emerged as a standard for agent-server integrations. But a critical challenge remains: these operations … Read more

NVIDIA Nemotron 3 Nano 30B MoE model is now available in Amazon SageMaker JumpStart

Today we’re excited to announce that the NVIDIA Nemotron 3 Nano 30B model with 3B active parameters is now generally available in the Amazon SageMaker JumpStart model catalog. You can accelerate innovation and deliver tangible business value with Nemotron 3 Nano on Amazon Web Services (AWS) without having to manage model deployment complexities. You can … Read more

Mastering Amazon Bedrock throttling and service availability: A comprehensive guide

In production generative AI applications, we encounter errors from time to time, and the most common are requests failing with 429 ThrottlingException and 503 ServiceUnavailableException errors. In a business application, these errors can originate in multiple layers of the application architecture. Most of these errors are retriable … Read more

Swann provides Generative AI to millions of IoT Devices using Amazon Bedrock

If you’re managing Internet of Things (IoT) devices at scale, alert fatigue is probably undermining your system’s effectiveness. This post shows you how to implement intelligent notification filtering using Amazon Bedrock and its gen-AI capabilities. You’ll learn model selection strategies, cost optimization techniques, and architectural patterns for deploying gen-AI at IoT scale, based on Swann … Read more

How LinqAlpha assesses investment theses using Devil’s Advocate on Amazon Bedrock

This is a guest post by Suyeol Yun, Jaeseon Ha, Subeen Pang and Jacob (Chanyeol) Choi at LinqAlpha, in partnership with AWS. LinqAlpha is a Boston-based multi-agent AI system built specifically for institutional investors. Over 170 hedge funds and asset managers worldwide use LinqAlpha to streamline their investment research for public equities and other liquid … Read more

How Amazon uses Amazon Nova models to automate operational readiness testing for new fulfillment centers

Amazon is a global ecommerce and technology company that operates a vast network of fulfillment centers to store, process, and ship products to customers worldwide. The Amazon Global Engineering Services (GES) team is responsible for facilitating operational readiness across the company’s rapidly expanding network of fulfillment centers. When launching new fulfillment centers, Amazon must verify … Read more

Iberdrola enhances IT operations using Amazon Bedrock AgentCore

Iberdrola, one of the world’s largest utility companies, has embraced cutting-edge AI technology to revolutionize its IT operations in ServiceNow. By using different agentic architectures, Iberdrola has transformed the way thousands of change requests and incident tickets are managed, streamlining processes and enhancing productivity across departments. Through its partnership with AWS, Iberdrola implemented those agents … Read more

Building real-time voice assistants with Amazon Nova Sonic compared to cascading architectures

Voice AI agents are reshaping how we interact with technology. From customer service and healthcare assistance to home automation and personal productivity, these intelligent virtual assistants are rapidly gaining popularity across industries. Their natural language capabilities, constant availability, and increasing sophistication make them valuable tools for businesses seeking efficiency and individuals desiring seamless digital experiences. … Read more

Automated Reasoning checks rewriting chatbot reference implementation

Today, we are publishing a new open source sample chatbot that shows how to use feedback from Automated Reasoning checks to iterate on the generated content, ask clarifying questions, and prove the correctness of an answer. The chatbot implementation also produces an audit log that includes mathematically verifiable explanations for the answer validity and a … Read more

Scale LLM fine-tuning with Hugging Face and Amazon SageMaker AI

Enterprises are increasingly shifting from relying solely on large, general-purpose language models to developing specialized large language models (LLMs) fine-tuned on their own proprietary data. Although foundation models (FMs) offer impressive general capabilities, they often fall short when applied to the complexities of enterprise environments—where accuracy, security, compliance, and domain-specific knowledge are non-negotiable. To meet … Read more