Stop hand-tuning kernels: How Neuron Agentic Development accelerates AWS Trainium optimizations

As frontier AI models grow in scale and complexity, developers face a common challenge across every hardware platform: how to extract the maximum performance and efficiency from the silicon their models run on. Whether delivering real-time experiences for world models, supporting deeper reasoning in agentic workflows, or reducing inference costs at scale, the gap … Read more

Build an AI-Powered Equipment Repair Assistant Using Amazon Bedrock AgentCore

Managing equipment repairs for heavy farm machinery often requires technicians to diagnose issues without the right parts, leading to multiple site visits, extended downtime, and substantial financial losses, especially during harvest season. In this post, you build an AI-powered equipment repair assistant using Amazon Bedrock AgentCore that helps farmers and field technicians diagnose equipment problems, … Read more

Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI

Physical AI is moving from research into production. Robots are increasingly trained in high-fidelity simulation before being deployed to factories, warehouses, and logistics centers, because training in the real world is slow, expensive, and often unsafe, while GPU-accelerated simulation can compress months of learning into hours. This shifts the challenge to compute. Reinforcement learning (RL) … Read more

Hands-free first notice of loss: Using Strands Agents and Amazon Bedrock AgentCore Browser Tool for intelligent claims intake

Turning multimodal first notice of loss (FNOL) evidence into tagged, decision-ready intake so adjusters start with context instead of raw artifacts. Manual FNOL processing consumes significant expert time on repetitive tasks because unstructured, multimodal evidence must be interpreted through portals designed for human interaction. Photos captured in the field, walkaround videos, scanned documents, and dictated … Read more

Build an agentic incident triage assistant with Amazon Quick and New Relic

Incident triage is time-sensitive because site reliability engineers (SREs) and support engineers often need to collect evidence, assess user impact, and create follow-up work across separate tools. With Amazon Quick and New Relic, you can coordinate those investigation and handoff steps in a single conversational workflow. This post shows engineering teams how to apply that … Read more

Unlocking AI flexibility in Europe: A guide to cross-region inference for EU data processing and model access

With access to the latest generative AI models and high-performance accelerated compute in high global demand, AWS customers need tools to take advantage of model availability and capacity across multiple AWS Regions, while still meeting their security and privacy requirements. Cross-Region inference (CRIS) on Amazon Bedrock meets these needs by automatically routing requests across multiple … Read more

Better decisions at scale: How mathematical optimization delivers where intuition fails

The science of optimal decisions — and how leading organizations are applying it. Every enterprise faces decisions that are too complex for intuition or manual decision-making alone. Which delivery routes minimize cost while meeting next-day promises? How should hundreds of robots sequence movements across a factory floor without collision? How do you staff a 24/7 … Read more

End-to-end encrypted ML inference with Amazon SageMaker AI and FHE

Machine learning (ML) inference often requires processing sensitive data—medical records, proprietary business information, or personal communications. What if you could run ML inference in the cloud while hiding your data from the cloud itself? More specifically, what if you could enforce that your data stayed encrypted throughout the entire ML inference process? This post will … Read more

Amazon Quick ARNs: Cross-account migration and namespace permissions

You migrate dashboards from development to production, but the permissions don’t carry over. You share a dashboard with your Finance team, but they keep getting “access denied.” You set up namespaces for multi-tenant isolation, and the same username works in one namespace but not another. These are real tasks that Amazon Quick administrators tackle regularly, … Read more

NVIDIA Nemotron 3 Ultra now available on Amazon SageMaker JumpStart

Today, we are excited to announce the day-zero availability of NVIDIA Nemotron 3 Ultra on Amazon SageMaker JumpStart. With this launch, you can now deploy the Nemotron 3 Ultra model using a one-click deployment experience. Nemotron 3 Ultra is an open model built for frontier reasoning and orchestration in long-running autonomous agents, delivering 5x faster … Read more

How to build self-driving AI operations on Amazon Bedrock at scale

Amazon Bedrock powers generative AI for more than 100,000 organizations worldwide—from startups to global enterprises across every industry. It provides the proven infrastructure and comprehensive capabilities to confidently build applications and agents that work in production with the flexibility, enterprise security, and proven scalability you need to innovate boldly and deliver AI that drives real … Read more

Fundamental’s Large Tabular Model NEXUS is now available on Amazon SageMaker JumpStart

Today, we’re announcing support for Fundamental’s NEXUS model on Amazon SageMaker AI. With this launch, you can deploy a foundation model (FM) purpose-built for tabular data prediction. This model helps your enterprise generate accurate, deterministic predictions from structured data in days instead of months. In this post, we show you how to get started with … Read more

Reducing container cold start times using SOCI index on DLAMI and DLC

Deep Learning AMI and AWS Deep Learning Containers are now enabled with support for SOCI snapshotter and index. Seekable OCI (SOCI) is a technology that enables efficient container image management through selective file downloading. It uses a layer-based indexing system to map file locations within container images, allowing containers to start with only the necessary … Read more

Improve your agent’s tool-calling accuracy with SFT and DPO on Amazon SageMaker AI

AI agents can autonomously handle complex, multi-step tasks, but their effectiveness depends on calling the right tools to retrieve information or take action. When an agent picks the wrong tool, formats parameters incorrectly, or breaks a workflow chain, task completion times grow, error rates rise, support costs increase, and user experiences degrade. As more organizations … Read more

The art and science of hyperparameter optimization on Amazon Nova Forge

Large language models (LLMs) deliver strong results on general tasks, but they often struggle with specialized work that requires understanding proprietary data, internal processes, or domain-specific terminology. Amazon Nova Forge addresses this by enabling you to build your own frontier models using Amazon Nova. You can start development from early model checkpoints, blend proprietary data … Read more

Object detection with Amazon Nova 2 Lite

Traditional computer vision solutions can require significant upfront investment. Setting up data pipelines, model training infrastructure, compute resources, and a dedicated data science team is often prohibitive for small companies or teams. Amazon Nova 2 Lite, available through Amazon Bedrock, provides an appealing alternative solution. This multimodal foundation model detects objects through natural language prompts … Read more

How Baz improved its AI Agent Code Review accuracy using Amazon Bedrock AgentCore

Code review was always manual and ineffective because of the inherent disconnect between code and product. Developers could review whether code compiled and worked, but not whether it fulfilled all functional and design requirements. In the past, QA teams spent hours manually clicking through preview environments to ensure features behaved as expected, and even more … Read more

Building a secure auth code flow setup using AgentCore Gateway with MCP clients

In modern development workflows, developers increasingly rely on agentic coding assistants such as Kiro Integrated Development Environment (IDE) to interact with remote tools and services. However, organizations require robust authentication mechanisms to provide secure, identity-verified access between these agentic coding assistants and enterprise Model Context Protocol (MCP) servers. Amazon Bedrock AgentCore is a fully managed … Read more

Reference your own AWS Secrets Manager secrets in Amazon Bedrock AgentCore Identity

AI agents are only as powerful as the tools they can access. Whether retrieving customer data from a CRM, posting updates to Slack, or querying a GitHub repository, agents need to call external APIs, and that means securely passing credentials at runtime. Getting that right, without hardcoding secrets in code or exposing them in agent … Read more

Transforming rare cancer research with Amazon Quick: Integrating biomedical databases for breakthrough discoveries

Rare cancer research generates heterogeneous data across genomic sequencing pipelines, clinical trial registries, biomarker repositories, and peer-reviewed literature. Integrating these sources for a single investigation typically requires custom ETL pipelines, manual schema reconciliation, and iterative querying across disconnected systems—a process that can take weeks before any analysis begins. Amazon Quick Research addresses this integration challenge … Read more

OpenAI models and Codex on Amazon Bedrock are now generally available

GPT-5.5, GPT-5.4, and Codex are now generally available on Amazon Bedrock. Deploy them in production applications and agents today, on Bedrock’s high-performance inference engine. Key takeaways: GPT-5.5, the most advanced frontier model from OpenAI, is generally available on Amazon Bedrock. Pricing matches OpenAI first-party rates. Codex on Amazon Bedrock is generally available with pay-per-token pricing. Inference runs through Bedrock, and … Read more

Extending MCP support for Amazon Bedrock AgentCore Gateway

While deploying Model Context Protocol (MCP) servers in production, enterprises need fine-grained access control across servers, observability into which teams use which tools, security guarantees against data exfiltration, and centralized credential management, all at scale. Amazon Bedrock AgentCore Gateway sits between MCP servers and the clients that consume them, centralizing credential management, observability, and secure … Read more

Secure AI agents with Policy and Lambda interceptors in Amazon Bedrock AgentCore gateway

Securing AI agent behavior is a key customer challenge in building agentic solutions. As enterprises rapidly adopt AI agents to automate workflows, they face a scaling challenge in managing secure access to tools across the organization. Modern unified enterprise AI platforms have hundreds of agents serving users across the organization. These agents need to access … Read more

Enable safe agentic payments with built-in guardrails using Amazon Bedrock AgentCore payments

Agents increasingly take actions on behalf of their end users, whether that’s selecting tools, browsing the web, or calling MCP servers autonomously to achieve a goal. When the tools, MCP endpoints, or web resources an agent reaches are paid, the agent gets stuck without the ability to transact. Amazon Bedrock AgentCore payments, announced in preview … Read more

AgentOps: Operationalize agentic AI at scale with Amazon Bedrock AgentCore

When you build agentic AI solutions, you face unique operational challenges. Agents make unpredictable decisions, costs spiral unexpectedly, and debugging non-deterministic failures seems impossible. Agentic AI applications don’t just execute predetermined workflows. They reason, adapt, and make autonomous decisions, and DevOps practices need to be adapted. That’s where AgentOps comes in, the operational discipline for … Read more

Accelerate LLM model loading and increase context windows with GPUDirect on Amazon FSx for Lustre and TurboQuant

If you’re iterating on deploying large language models (LLMs) on AWS GPU instances, you’ve probably noticed that the larger the model to be loaded into GPU High Bandwidth Memory (HBM), the longer the painful wait until the GPUs are ready for inference. As models grow to hundreds of billions of parameters and GPU environments grow ever … Read more

Amazon Quick integration with time-series databases for market intelligence using MCP

Model Context Protocol (MCP) integration in Amazon Quick transforms how financial analysts access time-series market intelligence, removing the need for complex database queries. As a financial analyst, you navigate millions of stock trades flowing through markets every second, searching for patterns that drive trading decisions. Financial institutions often use time series databases to analyze high-frequency … Read more

Comprehensive observability for Amazon SageMaker AI LLM inference: From GPU utilization to LLM quality

Deploying large language models (LLMs) at scale on Amazon SageMaker AI Inference makes observability a critical pillar of any production machine learning (ML) strategy. Unlike conventional software that returns deterministic outputs, LLMs generate variable, free-form responses that are difficult to validate with standard metrics. LLM output quality can change over time as input distributions shift, … Read more