AWSML – Page 13 – Kamal Reader

Build modern serverless solutions following best practices using Amazon Q Developer CLI and MCP

July 28, 2025 by kamal

Building modern serverless applications on AWS requires navigating best practices to manage the integration between multiple services, such as AWS Lambda, Amazon API Gateway, Amazon DynamoDB, and Amazon EventBridge. Security considerations, performance optimization, and implementing a comprehensive monitoring system add further requirements for building a serverless architecture while adhering to AWS best practices. Amazon Q Developer CLI with Model Context Protocol … Read more

Build an intelligent eDiscovery solution using Amazon Bedrock Agents

July 25, 2025 by kamal

Legal teams spend the bulk of their time manually reviewing documents during eDiscovery. This process involves analyzing electronically stored information across emails, contracts, financial records, and collaboration systems for legal proceedings. This manual approach creates significant bottlenecks: attorneys must identify privileged communications, assess legal risks, extract contractual obligations, and maintain regulatory compliance across thousands of documents … Read more

How PerformLine uses prompt engineering on Amazon Bedrock to detect compliance violations

July 25, 2025 by kamal

This post is co-written with Bogdan Arsenie and Nick Mattei from PerformLine. PerformLine operates within the marketing compliance industry, a specialized subset of the broader compliance software market, which includes various compliance solutions like anti-money laundering (AML), know your customer (KYC), and others. Specifically, marketing compliance refers to adhering to regulations and guidelines set by … Read more

Boost cold-start recommendations with vLLM on AWS Trainium

July 24, 2025 by kamal

Cold start in recommendation systems goes beyond just new user or new item problems—it’s the complete absence of personalized signals at launch. When someone first arrives, or when fresh content appears, there’s no behavioral history to tell the engine what they care about, so everyone ends up in broad generic segments. That not only dampens … Read more

Benchmarking Amazon Nova: A comprehensive analysis through MT-Bench and Arena-Hard-Auto

July 24, 2025 by kamal

Large language models (LLMs) have rapidly evolved, becoming integral to applications ranging from conversational AI to complex reasoning tasks. However, as models grow in size and capability, effectively evaluating their performance has become increasingly challenging. Traditional benchmarking metrics like perplexity and BLEU scores often fail to capture the nuances of real-world interactions, making human-aligned evaluation … Read more

Customize Amazon Nova in Amazon SageMaker AI using Direct Preference Optimization

July 23, 2025 by kamal

At the AWS Summit in New York City, we introduced a comprehensive suite of model customization capabilities for Amazon Nova foundation models. Available as ready-to-use recipes on Amazon SageMaker AI, you can use them to adapt Nova Micro, Nova Lite, and Nova Pro across the model training lifecycle, including pre-training, supervised fine-tuning, and alignment. In this … Read more

Multi-tenant RAG implementation with Amazon Bedrock and Amazon OpenSearch Service for SaaS using JWT

July 23, 2025 by kamal

In recent years, the emergence of large language models (LLMs) has accelerated AI adoption across various industries. However, to further augment LLMs’ capabilities and effectively use up-to-date information and domain-specific knowledge, integration with external data sources is essential. Retrieval Augmented Generation (RAG) has gained attention as an effective approach to address this challenge. RAG is … Read more

Enhance generative AI solutions using Amazon Q index with Model Context Protocol – Part 1

July 23, 2025 by kamal

Today’s enterprises increasingly rely on AI-driven applications to enhance decision-making, streamline workflows, and deliver improved customer experiences. Achieving these outcomes demands secure, timely, and accurate access to authoritative data—especially when such data resides across diverse repositories and applications within strict enterprise security boundaries. Interoperable technologies powered by open standards like the Model Context Protocol (MCP) … Read more

Beyond accelerators: Lessons from building foundation models on AWS with Japan’s GENIAC program

July 22, 2025 by kamal

In 2024, the Ministry of Economy, Trade and Industry (METI) launched the Generative AI Accelerator Challenge (GENIAC)—a Japanese national program to boost generative AI by providing companies with funding, mentorship, and massive compute resources for foundation model (FM) development. AWS was selected as the cloud provider for GENIAC’s second cycle (cycle 2). It provided infrastructure … Read more

Streamline deep learning environments with Amazon Q Developer and MCP

July 22, 2025 by kamal

Data science teams working with artificial intelligence and machine learning (AI/ML) face a growing challenge as models become more complex. While Amazon Deep Learning Containers (DLCs) offer robust baseline environments out-of-the-box, customizing them for specific projects often requires significant time and expertise. In this post, we explore how to use Amazon Q Developer and Model … Read more

Build an AI-powered automated summarization system with Amazon Bedrock and Amazon Transcribe using Terraform

July 21, 2025 by kamal

Extracting meaningful insights from unstructured data presents significant challenges for many organizations. Meeting recordings, customer interactions, and interviews contain invaluable business intelligence that remains largely inaccessible due to the prohibitive time and resource costs of manual review. Organizations frequently struggle to efficiently capture and use key information from these interactions, resulting in not only productivity … Read more

Kyruus builds a generative AI provider matching solution on AWS

July 21, 2025 by kamal

This post was written with Zach Heath of Kyruus Health. When health plan members need care, they shouldn’t need a dictionary. Yet millions face this exact challenge—describing symptoms in everyday language while healthcare references clinical terminology and complex specialty classifications. This disconnect forces members to become amateur medical translators, attempting to convert phrases like “my … Read more

Use generative AI in Amazon Bedrock for enhanced recommendation generation in equipment maintenance

July 21, 2025 by kamal

In the manufacturing world, valuable insights from service reports often remain underutilized in document storage systems. This post explores how Amazon Web Services (AWS) customers can build a solution that automates the digitisation and extraction of crucial information from many reports using generative AI. The solution uses Amazon Nova Pro on Amazon Bedrock and Amazon … Read more

Build real-time travel recommendations using AI agents on Amazon Bedrock

July 18, 2025 by kamal

Generative AI is transforming how businesses deliver personalized experiences across industries, including travel and hospitality. Travel agents are enhancing their services by offering personalized holiday packages, carefully curated for each customer’s unique preferences, including accessibility needs, dietary restrictions, and activity interests. Meeting these expectations requires a solution that combines comprehensive travel knowledge with real-time pricing and … Read more

Deploy a full stack voice AI agent with Amazon Nova Sonic

July 18, 2025 by kamal

AI-powered speech solutions are transforming contact centers by enabling natural conversations between customers and AI agents, shortening wait times, and dramatically reducing operational costs—all without sacrificing the human-like interaction customers expect. With the recent launch of Amazon Nova Sonic in Amazon Bedrock, you can now build sophisticated conversational AI agents that communicate naturally through voice, … Read more

Manage multi-tenant Amazon Bedrock costs using application inference profiles

July 18, 2025 by kamal

Successful generative AI software as a service (SaaS) systems require a balance between service scalability and cost management. This becomes critical when building a multi-tenant generative AI service designed to serve a large, diverse customer base while maintaining rigorous cost controls and comprehensive usage monitoring. Traditional cost management approaches for such systems often reveal limitations. … Read more

Evaluating generative AI models with Amazon Nova LLM-as-a-Judge on Amazon SageMaker AI

July 17, 2025 by kamal

Evaluating the performance of large language models (LLMs) goes beyond statistical metrics like perplexity or bilingual evaluation understudy (BLEU) scores. For most real-world generative AI scenarios, it’s crucial to understand whether a model is producing better outputs than a baseline or an earlier iteration. This is especially important for applications such as summarization, content generation, … Read more

Building cost-effective RAG applications with Amazon Bedrock Knowledge Bases and Amazon S3 Vectors

July 17, 2025 by kamal

Vector embeddings have become essential for modern Retrieval Augmented Generation (RAG) applications, but organizations face significant cost challenges as they scale. As knowledge bases grow and require more granular embeddings, many vector databases that rely on high-performance storage such as SSDs or in-memory solutions become prohibitively expensive. This cost barrier often forces organizations to limit … Read more

Implementing on-demand deployment with customized Amazon Nova models on Amazon Bedrock

July 17, 2025 by kamal

Amazon Bedrock offers model customization capabilities for customers to tailor versions of foundation models (FMs) to their specific needs through features such as fine-tuning and distillation. Today, we’re announcing the launch of on-demand deployment for customized models ready to be deployed on Amazon Bedrock. On-demand deployment for customized models provides an additional deployment option that … Read more

Building enterprise-scale RAG applications with Amazon S3 Vectors and DeepSeek R1 on Amazon SageMaker AI

July 17, 2025 by kamal

Organizations are adopting large language models (LLMs), such as DeepSeek R1, to transform business processes, enhance customer experiences, and drive innovation at unprecedented speed. However, standalone LLMs have key limitations such as hallucinations, outdated knowledge, and no access to proprietary data. Retrieval Augmented Generation (RAG) addresses these gaps by combining semantic search with generative AI, … Read more

Accenture scales video analysis with Amazon Nova and Amazon Bedrock Agents

July 16, 2025 by kamal

This post was written with Ilan Geller, Kamal Mannar, Debasmita Ghosh, and Nakul Aggarwal of Accenture. Video highlights offer a powerful way to boost audience engagement and extend content value for content publishers. These short, high-impact clips capture key moments that drive viewer retention, amplify reach across social media, reinforce brand identity, and open new … Read more

Deploy conversational agents with Vonage and Amazon Nova Sonic

July 16, 2025 by kamal

This post is co-written with Mark Berkeland, Oscar Rodriguez and Marina Gerzon from Vonage. Voice-based technologies are transforming the way businesses engage with customers across customer support, virtual assistants, and intelligent agents. However, creating real-time, expressive, and highly responsive voice interfaces still requires navigating a complex stack of communication protocols, AI models, and media infrastructure. … Read more

Enabling customers to deliver production-ready AI agents at scale

July 16, 2025 by kamal

AI agents will change how we all work and live. Our AWS CEO, Matt Garman, shared a vision of a technological shift as transformative as the advent of the internet. I’m energized by this vision because I’ve witnessed firsthand how these intelligent agent systems are already beginning to solve complex problems, automate workflows, and create … Read more

Amazon Bedrock Knowledge Bases now supports Amazon OpenSearch Service Managed Cluster as vector store

July 15, 2025 by kamal

Amazon Bedrock Knowledge Bases has extended its vector store options by enabling support for Amazon OpenSearch Service managed clusters, further strengthening its capabilities as a fully managed Retrieval Augmented Generation (RAG) solution. This enhancement builds on the core functionality of Amazon Bedrock Knowledge Bases, which is designed to seamlessly connect foundation models (FMs) with … Read more

Monitor agents built on Amazon Bedrock with Datadog LLM Observability

July 15, 2025 by kamal

This post was co-written with Mohammad Jama, Yun Kim, and Barry Eom from Datadog. The emergence of generative AI agents in recent years has transformed the AI landscape, driven by advances in large language models (LLMs) and natural language processing (NLP). The focus is shifting from simple AI assistants to agentic AI systems that can … Read more

How PayU built a secure enterprise AI assistant using Amazon Bedrock

July 15, 2025 by kamal

This is a guest post co-written with Rahul Ghosh, Sandeep Kumar Veerlapati, Rahmat Khan, and Mudit Chopra from PayU. PayU offers a full-stack digital financial services system that serves the financial needs of merchants, banks, and consumers through technology. As a Central Bank-regulated financial institution in India, we recently observed a surge in our employees’ … Read more

Supercharge generative AI workflows with NVIDIA DGX Cloud on AWS and Amazon Bedrock Custom Model Import

July 15, 2025 by kamal

This post is co-written with Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA. DGX Cloud on Amazon Web Services (AWS) represents a significant leap forward in democratizing access to high-performance AI infrastructure. By combining NVIDIA GPU expertise with AWS scalable cloud services, organizations can accelerate their time-to-train, reduce operational complexity, and unlock … Read more

Accelerate generative AI inference with NVIDIA Dynamo and Amazon EKS

July 15, 2025 by kamal

This post is co-written with Kshitiz Gupta, Wenhan Tan, Arun Raman, Jiahong Liu, and Eiluth Triana Isaza from NVIDIA. As large language models (LLMs) and generative AI applications become increasingly prevalent, the demand for efficient, scalable, and low-latency inference solutions has grown. Traditional inference systems often struggle to meet these demands, especially in distributed, multi-node … Read more

AWS doubles investment in AWS Generative AI Innovation Center, marking two years of customer success

July 15, 2025 by kamal

When we launched the AWS Generative AI Innovation Center in 2023, we had one clear goal: help customers turn AI potential into real business value. We’ve already guided thousands of customers across industries from financial services to healthcare—including Formula 1, FOX, GovTech Singapore, Itaú Unibanco, Nasdaq, NFL, RyanAir, and S&P Global—from AI experimentation to full-scale … Read more

Build AI-driven policy creation for vehicle data collection and automation using Amazon Bedrock

July 14, 2025 by kamal

Vehicle data is critical for original equipment manufacturers (OEMs) to drive continuous product innovation and performance improvements and to support new value-added services. Similarly, the increasing digitalization of vehicle architectures and adoption of software-configurable functions allow OEMs to add new features and capabilities efficiently. Sonatus’s Collector AI and Automator AI products address these two aspects … Read more