Migrate MLflow tracking servers to Amazon SageMaker AI with serverless MLflow

Operating a self-managed MLflow tracking server comes with administrative overhead, including server maintenance and resource scaling. As teams scale their ML experimentation, efficiently managing resources during peak usage and idle periods is a challenge. Organizations running MLflow on Amazon EC2 or on-premises can optimize costs and engineering resources by using Amazon SageMaker AI with serverless … Read more

Build an AI-powered website assistant with Amazon Bedrock

Businesses face a growing challenge: customers need answers fast, but support teams are overwhelmed. Support documentation like product manuals and knowledge base articles typically requires users to search through hundreds of pages, and support agents often run 20–30 customer queries per day to locate specific information. This post demonstrates how to solve this challenge by … Read more

Programmatically creating an IDP solution with Amazon Bedrock Data Automation

Intelligent Document Processing (IDP) transforms how organizations handle unstructured document data, enabling automatic extraction of valuable information from invoices, contracts, and reports. Today, we explore how to programmatically create an IDP solution that uses Strands SDK, Amazon Bedrock AgentCore, Amazon Bedrock Knowledge Base, and Bedrock Data Automation (BDA). This solution is provided through a Jupyter notebook that enables users … Read more

AI agent-driven browser automation for enterprise workflow management

Enterprise organizations increasingly rely on web-based applications for critical business processes, yet many workflows remain manually intensive, creating operational inefficiencies and compliance risks. Despite significant technology investments, knowledge workers routinely navigate between eight and twelve different web applications during standard workflows, constantly switching contexts and manually transferring information between systems. Data entry and validation tasks … Read more

Agentic QA automation using Amazon Bedrock AgentCore Browser and Amazon Nova Act

Quality assurance (QA) testing has long been the backbone of software development, but traditional QA approaches haven’t kept pace with modern development cycles and complex UIs. Most organizations still rely on a hybrid approach combining manual testing with script-based automation frameworks like Selenium, Cypress, and Playwright, yet teams spend a significant amount of their time maintaining existing … Read more

Optimizing LLM inference on Amazon SageMaker AI with BentoML’s LLM-Optimizer

The rise of powerful large language models (LLMs) that can be consumed via API calls has made it remarkably straightforward to integrate artificial intelligence (AI) capabilities into applications. Yet despite this convenience, a significant number of enterprises are choosing to self-host their own models—accepting the complexity of infrastructure management, the cost of GPUs in the … Read more

AWS AI League: Model customization and agentic showdown

Building intelligent agents to handle complex, real-world tasks can be daunting. Additionally, rather than relying solely on large, pre-trained foundation models, organizations often need to fine-tune and customize smaller, more specialized models to outperform them for their specific use cases. The AWS AI League provides an innovative program to help enterprises overcome the challenges of building … Read more

Accelerate Enterprise AI Development using Weights & Biases and Amazon Bedrock AgentCore

This post is co-written by Thomas Capelle and Ray Strickland from Weights & Biases (W&B). Generative artificial intelligence (AI) adoption is accelerating across enterprises, evolving from simple foundation model interactions to sophisticated agentic workflows. As organizations transition from proof-of-concepts to production deployments, they require robust tools for development, evaluation, and monitoring of AI applications at … Read more

How dLocal automated compliance reviews using Amazon Quick Automate

dLocal, Uruguay’s first unicorn, has established itself as a pioneer in cross-border payments since its founding in 2016. Today, the company operates in over 40 emerging countries, connecting more than two billion consumers with global technology leaders. Operating at this scale requires strict and consistent compliance processes. Each month, thousands of merchant ecommerce websites are … Read more

Advancing ADHD diagnosis: How Qbtech built a mobile AI assessment model using Amazon SageMaker AI

This post is cowritten with Dr. Mikkel Hansen from Qbtech. The assessment and diagnosis of attention deficit hyperactivity disorder (ADHD) has traditionally relied on clinical observations and behavioral evaluations. While these methods are valuable, the process can be complex and time-intensive. Qbtech, founded in 2002 in Stockholm, Sweden, enhances ADHD diagnosis by integrating objective measurements … Read more

Accelerating your marketing ideation with generative AI – Part 1: From idea to generation with the Amazon Nova foundation models

Marketing teams face increasing pressure to create engaging campaigns quickly while maintaining brand consistency and creative quality. Traditional marketing campaign creation processes often involve multiple iterations between creative teams, stakeholders, and external agencies, leading to extended timelines and increased costs. The advent and availability of generative models (especially image and video generation ones) have opened … Read more

Introducing Visa Intelligent Commerce on AWS: Enabling agentic commerce with Amazon Bedrock AgentCore

This post is cowritten with Sangeetha Bharath and Seemal Zaman from Visa. Across every industry, agentic AI is redefining how work gets done by shifting digital experiences from manual, user-driven interactions to autonomous, outcome-driven workflows. Unlike traditional AI systems that merely answer questions or provide suggestions, agentic AI introduces intelligent agents capable of reasoning, acting, … Read more

Move Beyond Chain-of-Thought with Chain-of-Draft on Amazon Bedrock

As organizations scale their generative AI implementations, the critical challenge of balancing quality, cost, and latency becomes increasingly complex. With inference costs dominating 70–90% of large language model (LLM) operational expenses, and verbose prompting strategies inflating token volume by 3–5x, organizations are actively seeking more efficient approaches to model interaction. Traditional prompting methods, while effective, … Read more

Deploy Mistral AI’s Voxtral on Amazon SageMaker AI

Mistral AI’s Voxtral models combine text and audio processing capabilities in a single framework. The Voxtral family includes two distinct variants designed for different use cases and resource requirements. The Voxtral-Mini-3B-2507 is a compact 3-billion-parameter model optimized for efficient audio transcription and basic multimodal understanding, making it ideal for applications where speed and resource efficiency … Read more

Enhance document analytics with Strands AI Agents for the GenAI IDP Accelerator

Extracting structured information from unstructured data is a critical first step to unlocking business value. Our Generative AI Intelligent Document Processing (GenAI IDP) Accelerator has been at the forefront of this transformation, already having processed tens of millions of documents for hundreds of customers. Although organizations can use intelligent document processing (IDP) solutions to digitize … Read more

Build a multimodal generative AI assistant for root cause diagnosis in predictive maintenance using Amazon Bedrock

Predictive maintenance is a strategy that uses data from equipment sensors and advanced analytics to predict when a machine is likely to fail, ensuring maintenance can be performed proactively to prevent breakdowns. This enables industries to reduce unexpected failures, improve operational efficiency, and extend the lifespan of critical equipment. It is applicable across a wide range of components, … Read more

Introducing SOCI indexing for Amazon SageMaker Studio: Faster container startup times for AI/ML workloads

Today, we are excited to introduce a new feature for SageMaker Studio: SOCI (Seekable Open Container Initiative) indexing. SOCI supports lazy loading of container images, where only the necessary parts of an image are downloaded initially rather than the entire container. SageMaker Studio serves as a web Integrated Development Environment (IDE) for end-to-end machine learning (ML) development, … Read more

Build and deploy scalable AI agents with NVIDIA NeMo, Amazon Bedrock AgentCore, and Strands Agents

This post is co-written with Ranjit Rajan, Abdullahi Olaoye, and Abhishek Sawarkar from NVIDIA. AI’s next frontier isn’t merely smarter chat-based assistants; it’s autonomous agents that reason, plan, and execute across entire systems. But to accomplish this, enterprise developers need to move from prototypes to production-ready AI agents that scale securely. This challenge grows as … Read more

Bi-directional streaming for real-time agent interactions now available in Amazon Bedrock AgentCore Runtime

Building natural voice conversations with AI agents requires complex infrastructure and lots of code from engineering teams. Text-based agent interactions follow a turn-based pattern: a user sends a complete request, waits for the agent to process it, and receives a full response before continuing. Bi-directional streaming removes this constraint by establishing a persistent connection that … Read more

Tracking and managing assets used in AI development with Amazon SageMaker AI

Building custom foundation models requires coordinating multiple assets across the development lifecycle such as data assets, compute infrastructure, model architecture and frameworks, lineage, and production deployments. Data scientists create and refine training datasets, develop custom evaluators to assess model quality and safety, and iterate through fine-tuning configurations to optimize performance. As these workflows scale across … Read more

Track machine learning experiments with MLflow on Amazon SageMaker using Snowflake integration

A user can conduct machine learning (ML) data experiments in data environments, such as Snowflake, using the Snowpark library. However, tracking these experiments across diverse environments can be challenging due to the difficulty in maintaining a central repository to monitor experiment metadata, parameters, hyperparameters, models, results, and other pertinent information. In this post, we demonstrate … Read more

Governance by design: The essential guide for successful AI scaling

Picture this: Your enterprise has just deployed its first generative AI application. The initial results are promising, but as you plan to scale across departments, critical questions emerge. How will you enforce consistent security, prevent model bias, and maintain control as AI applications multiply? It turns out you’re not alone. A McKinsey survey spanning 750+ … Read more

How Tata Power CoE built a scalable AI-powered solar panel inspection solution with Amazon SageMaker AI and Amazon Bedrock

This post is co-written with Vikram Bansal from Tata Power, and Gaurav Kankaria and Omkar Dhavalikar from Oneture. The global adoption of solar energy is rapidly increasing as organizations and individuals transition to renewable energy sources. India is on the brink of a solar energy revolution, with a national goal to empower 10 million households with … Read more

Unlocking video understanding with TwelveLabs Marengo on Amazon Bedrock

Media and entertainment, advertising, education, and enterprise training content combines visual, audio, and motion elements to tell stories and convey information, making it far more complex than text where individual words have clear meanings. This creates unique challenges for AI systems that need to understand video content. Video content is multidimensional, combining visual elements (scenes, … Read more

Checkpointless training on Amazon SageMaker HyperPod: Production-scale training with faster fault recovery

Foundation model training has reached an inflection point where traditional checkpoint-based recovery methods are becoming a bottleneck to efficiency and cost-effectiveness. As models grow to trillions of parameters and training clusters expand to thousands of AI accelerators, even minor disruptions can result in significant costs and delays. In this post, we introduce checkpointless training on … Read more

Adaptive infrastructure for foundation model training with elastic training on SageMaker HyperPod

Modern AI infrastructure serves multiple concurrent workloads on the same cluster, from foundation model (FM) pre-training and fine-tuning to production inference and evaluation. In this shared environment, the demand for AI accelerators fluctuates continuously as inference workloads scale with traffic patterns, and experiments complete and release resources. Despite this dynamic availability of AI accelerators, traditional … Read more

Customize agent workflows with advanced orchestration techniques using Strands Agents

Large Language Model (LLM) agents have revolutionized how we approach complex, multi-step tasks by combining the reasoning capabilities of foundation models with specialized tools and domain expertise. While single-agent systems using frameworks like ReAct work well for straightforward tasks, real-world challenges often require multiple specialized agents working in coordination. Think about planning a business trip: … Read more

Operationalize generative AI workloads and scale to hundreds of use cases with Amazon Bedrock – Part 1: GenAIOps

Enterprise organizations are rapidly moving beyond generative AI experiments to production deployments and complex agentic AI solutions, facing new challenges in scaling, security, governance, and operational efficiency. This blog post series introduces generative AI operations (GenAIOps), the application of DevOps principles to generative AI solutions, and demonstrates how to implement it for applications powered by … Read more

Applying data loading best practices for ML training with Amazon S3 clients

Amazon Simple Storage Service (Amazon S3) is a highly elastic service that automatically scales with application demand, offering the high throughput performance required for modern ML workloads. High-performance client connectors such as the Amazon S3 Connector for PyTorch and Mountpoint for Amazon S3 provide native S3 integration in training pipelines without dealing directly with the … Read more