Protect sensitive data in RAG applications with Amazon Bedrock

Retrieval Augmented Generation (RAG) applications have become increasingly popular due to their ability to enhance generative AI tasks with contextually relevant information. Implementing RAG-based applications requires careful attention to security, particularly when handling sensitive data. The protection of personally identifiable information (PII), protected health information (PHI), and confidential business data is crucial because this information … Read more

Supercharge your LLM performance with Amazon SageMaker Large Model Inference container v15

Today, we’re excited to announce the launch of Amazon SageMaker Large Model Inference (LMI) container v15, powered by vLLM 0.8.4 with support for the vLLM V1 engine. This version now supports the latest open-source models, such as Meta’s Llama 4 models Scout and Maverick, Google’s Gemma 3, Alibaba’s Qwen, Mistral AI, DeepSeek-R, and many more. … Read more

Accuracy evaluation framework for Amazon Q Business – Part 2

In the first post of this series, we introduced a comprehensive evaluation framework for Amazon Q Business, a fully managed Retrieval Augmented Generation (RAG) solution that uses your company’s proprietary data without the complexity of managing large language models (LLMs). The first post focused on selecting appropriate use cases, preparing data, and implementing metrics to … Read more

Use Amazon Bedrock Intelligent Prompt Routing for cost and latency benefits

In December, we announced the preview availability for Amazon Bedrock Intelligent Prompt Routing, which provides a single serverless endpoint to efficiently route requests between different foundation models within the same model family. To do this, Amazon Bedrock Intelligent Prompt Routing dynamically predicts the response quality of each model for a request and routes the request to … Read more

How Infosys improved accessibility for Event Knowledge using Amazon Nova Pro, Amazon Bedrock and Amazon Elemental Media Services

This post is co-written with Saibal Samaddar, Tanushree Halder, and Lokesh Joshi from Infosys Consulting. Critical insights and expertise are concentrated among thought leaders and experts across the globe. Language barriers often hinder the distribution and comprehension of this knowledge during crucial encounters. Workshops, conferences, and training sessions serve as platforms for collaboration and knowledge … Read more

Amazon Bedrock Prompt Optimization Drives LLM Applications Innovation for Yuewen Group

Yuewen Group is a global leader in online literature and IP operations. Through its overseas platform WebNovel, it has attracted about 260 million users in over 200 countries and regions, promoting Chinese web literature globally. The company also adapts quality web novels into films, animations for international markets, expanding the global influence of Chinese culture. … Read more

Build a location-aware agent using Amazon Bedrock Agents and Foursquare APIs

This post is co-written with Vikram Gundeti and Nate Folkert from Foursquare. Personalization is key to creating memorable experiences. Whether it’s recommending the perfect movie or suggesting a new restaurant, tailoring suggestions to individual preferences can make all the difference. But when it comes to food and activities, there’s more to consider than just personal … Read more

Build an automated generative AI solution evaluation pipeline with Amazon Nova

Large language models (LLMs) have become integral to numerous applications across industries, ranging from enhanced customer interactions to automated business processes. Deploying these models in real-world scenarios presents significant challenges, particularly in ensuring accuracy, fairness, relevance, and mitigating hallucinations. Thorough evaluation of the performance and outputs of these models is therefore critical to maintaining trust … Read more

Build a FinOps agent using Amazon Bedrock with multi-agent capability and Amazon Nova as the foundation model

AI agents are revolutionizing how businesses enhance their operational capabilities and enterprise applications. By enabling natural language interactions, these agents provide customers with a streamlined, personalized experience. Amazon Bedrock Agents uses the capabilities of foundation models (FMs), combining them with APIs and data to process user requests, gather information, and execute specific tasks effectively. The … Read more

Stream ingest data from Kafka to Amazon Bedrock Knowledge Bases using custom connectors

Retrieval Augmented Generation (RAG) enhances AI responses by combining the generative AI model’s capabilities with information from external data sources, rather than relying solely on the model’s built-in knowledge. In this post, we showcase the custom data connector capability in Amazon Bedrock Knowledge Bases that makes it straightforward to build RAG workflows with custom input … Read more

Add Zoom as a data accessor to your Amazon Q index

For many organizations, vast amounts of enterprise knowledge are scattered across diverse data sources and applications. Organizations across industries seek to use this cross-application enterprise data from within their preferred systems while adhering to their established security and governance standards. This post demonstrates how Zoom users can access their Amazon Q Business enterprise data directly … Read more

The future of quality assurance: Shift-left testing with QyrusAI and Amazon Bedrock

This post is co-written with Ameet Deshpande and Vatsal Saglani from Qyrus. As businesses embrace accelerated development cycles to stay competitive, maintaining rigorous quality standards can pose a significant challenge. Traditional testing methods, which occur late in the development cycle, often result in delays, increased costs, and compromised quality. Shift-left testing, which emphasizes earlier testing … Read more

Automate video insights for contextual advertising using Amazon Bedrock Data Automation

Contextual advertising, a strategy that matches ads with relevant digital content, has transformed digital marketing by delivering personalized experiences to viewers. However, implementing this approach for streaming video-on-demand (VOD) content poses significant challenges, particularly in ad placement and relevance. Traditional methods rely heavily on manual content analysis. For example, a content analyst might spend hours … Read more

How Salesforce achieves high-performance model deployment with Amazon SageMaker AI

This post is a joint collaboration between Salesforce and AWS and is being cross-published on both the Salesforce Engineering Blog and the AWS Machine Learning Blog. The Salesforce AI Model Serving team is working to push the boundaries of natural language processing and AI capabilities for enterprise applications. Their key focus areas include optimizing large … Read more

Automate Amazon EKS troubleshooting using an Amazon Bedrock agentic workflow

As organizations scale their Amazon Elastic Kubernetes Service (Amazon EKS) deployments, platform administrators face increasing challenges in efficiently managing multi-tenant clusters. Tasks such as investigating pod failures, addressing resource constraints, and resolving misconfiguration can consume significant time and effort. Instead of spending valuable engineering hours manually parsing logs, tracking metrics, and implementing fixes, teams should … Read more

Host concurrent LLMs with LoRAX

Businesses are increasingly seeking domain-adapted and specialized foundation models (FMs) to meet specific needs in areas such as document summarization, industry-specific adaptations, and technical code generation and advisory. The increased usage of generative AI models has offered tailored experiences with minimal technical expertise, and organizations are increasingly using these powerful models to drive innovation and … Read more

Build a computer vision-based asset inventory application with low or no training

Keeping an up-to-date asset inventory with real devices deployed in the field can be a challenging and time-consuming task. Many electricity providers use manufacturer’s labels as key information to link their physical assets within asset inventory systems. Computer vision can be a viable solution to speed up operator inspections and reduce human errors by automatically … Read more

Clario enhances the quality of the clinical trial documentation process with Amazon Bedrock

This post is co-written with Kim Nguyen and Shyam Banuprakash from Clario. Clario is a leading provider of endpoint data solutions to the clinical trials industry, generating high-quality clinical evidence for life sciences companies seeking to bring new therapies to patients. Since Clario’s founding more than 50 years ago, the company’s endpoint data solutions have … Read more

Optimizing Mixtral 8x7B on Amazon SageMaker with AWS Inferentia2

Organizations are constantly seeking ways to harness the power of advanced large language models (LLMs) to enable a wide range of applications such as text generation, summarizationquestion answering, and many others. As these models grow more powerful and capable, deploying them in production environments while optimizing performance and cost-efficiency becomes more challenging. Amazon Web Services … Read more

Elevate business productivity with Amazon Q and Amazon Connect

Modern banking faces dual challenges: delivering rapid loan processing while maintaining robust security against sophisticated fraud. Amazon Q Business provides AI-driven analysis of regulatory requirements and lending patterns. Additionally, you can now report fraud from the same interface with a custom plugin capability that can integrate with Amazon Connect. This fusion of technology transforms traditional … Read more

Build multi-agent systems with LangGraph and Amazon Bedrock

Large language models (LLMs) have raised the bar for human-computer interaction where the expectation from users is that they can communicate with their applications through natural language. Beyond simple language understanding, real-world applications require managing complex workflows, connecting to external data, and coordinating multiple AI capabilities. Imagine scheduling a doctor’s appointment where an AI agent … Read more

Dynamic text-to-SQL for enterprise workloads with Amazon Bedrock Agents

Generative AI enables us to accomplish more in less time. Text-to-SQL empowers people to explore data and draw insights using natural language, without requiring specialized database knowledge. Amazon Web Services (AWS) has helped many customers connect this text-to-SQL capability with their own data, which means more employees can generate insights. In this process, we discovered … Read more

Building an AIOps chatbot with Amazon Q Business custom plugins

Many organizations rely on multiple third-party applications and services for different aspects of their operations, such as scheduling, HR management, financial data, customer relationship management (CRM) systems, and more. However, these systems often exist in silos, requiring users to manually navigate different interfaces, switch between environments, and perform repetitive tasks, which can be time-consuming and … Read more

How TransPerfect Improved Translation Quality and Efficiency Using Amazon Bedrock

This post is co-written with Keith Brazil, Julien Didier, and Bryan Rand from TransPerfect. TransPerfect, a global leader in language and technology solutions, serves a diverse array of industries. Founded in 1992, TransPerfect has grown into an enterprise with over 10,000 employees in more than 140 cities on six continents. The company offers a broad … Read more

Reduce ML training costs with Amazon SageMaker HyperPod

Training a frontier model is highly compute-intensive, requiring a distributed system of hundreds, or thousands, of accelerated instances running for several weeks or months to complete a single job. For example, pre-training the Llama 3 70B model with 15 trillion training tokens took 6.5 million H100 GPU hours. On 256 Amazon EC2 P5 instances (p5.48xlarge, … Read more

Model customization, RAG, or both: A case study with Amazon Nova

As businesses and developers increasingly seek to optimize their language models for specific tasks, the decision between model customization and Retrieval Augmented Generation (RAG) becomes critical. In this post, we seek to address this growing need by offering clear, actionable guidelines and best practices on when to use each approach, helping you make informed decisions … Read more

Generate user-personalized communication with Amazon Personalize and Amazon Bedrock

Today, businesses are using AI and generative models to improve productivity in their teams and provide better experiences to their customers. Personalized outbound communication can be a powerful tool to increase user engagement and conversion. For instance, as a marketing manager for a video-on-demand company, you might want to send personalized email messages tailored to … Read more

Automating regulatory compliance: A multi-agent solution using Amazon Bedrock and CrewAI

Financial institutions today face an increasingly complex regulatory world that demands robust, efficient compliance mechanisms. Although organizations traditionally invest countless hours reviewing regulations such as the Anti-Money Laundering (AML) rules and the Bank Secrecy Act (BSA), modern AI solutions offer a transformative approach to this challenge. By using Amazon Bedrock Knowledge Bases alongside CrewAI—an open … Read more

Pixtral Large is now available in Amazon Bedrock

Today, we are excited to announce that Mistral AI’s Pixtral Large foundation model (FM) is generally available in Amazon Bedrock. With this launch, you can now access Mistral’s frontier-class multimodal model to build, experiment, and responsibly scale your generative AI ideas on AWS. AWS is the first major cloud provider to deliver Pixtral Large as … Read more