Accelerate Mixtral 8x7B pre-training with expert parallelism on Amazon SageMaker

Mixture of Experts (MoE) architectures for large language models (LLMs) have recently gained popularity due to their ability to increase model capacity and computational efficiency compared to fully dense models. By utilizing sparse expert subnetworks that process different subsets of tokens, MoE models can effectively increase the number of parameters while requiring less computation per … Read more

Generating fashion product descriptions by fine-tuning a vision-language model with SageMaker and Amazon Bedrock

In the world of online retail, creating high-quality product descriptions for millions of products is a crucial, but time-consuming task. Using machine learning (ML) and natural language processing (NLP) to automate product description generation has the potential to save manual effort and transform the way ecommerce platforms operate. One of the main advantages of high-quality … Read more

Create a multimodal assistant with advanced RAG and Amazon Bedrock

Retrieval Augmented Generation (RAG) models have emerged as a promising approach to enhance the capabilities of language models by incorporating external knowledge from large text corpora. However, despite their impressive performance in various natural language processing tasks, RAG models still face several limitations that need to be addressed. Naive RAG models face limitations such as … Read more

How 20 Minutes empowers journalists and boosts audience engagement with generative AI on Amazon Bedrock

This post is co-written with Aurélien Capdecomme and Bertrand d’Aure from 20 Minutes. With 19 million monthly readers, 20 Minutes is a major player in the French media landscape. The media organization delivers useful, relevant, and accessible information to an audience that consists primarily of young and active urban readers. Every month, nearly 8.3 million 25–49-year-olds choose … Read more

Efficient and cost-effective multi-tenant LoRA serving with Amazon SageMaker

In the rapidly evolving landscape of artificial intelligence (AI), the rise of generative AI models has ushered in a new era of personalized and intelligent experiences. Organizations are increasingly using the power of these language models to drive innovation and enhance their services, from natural language processing to content generation and beyond. Using generative AI … Read more

How LotteON built a personalized recommendation system using Amazon SageMaker and MLOps

This post is co-written with HyeKyung Yang, Jieun Lim, and SeungBum Shim from LotteON. LotteON aims to be a platform that not only sells products, but also provides a personalized recommendation experience tailored to your preferred lifestyle. LotteON operates various specialty stores, including fashion, beauty, luxury, and kids, and strives to provide a personalized shopping … Read more

Build a serverless exam generator application from your own lecture content using Amazon Bedrock

Crafting new questions for exams and quizzes can be tedious and time-consuming for educators. The time required varies based on factors like subject matter, question types, experience level, and class level. Multiple-choice questions require substantial time to generate quality distractors and ensure a single unambiguous answer, and composing effective true-false questions demands careful effort to … Read more

Accelerate NLP inference with ONNX Runtime on AWS Graviton processors

ONNX is an open source machine learning (ML) framework that provides interoperability across a wide range of frameworks, operating systems, and hardware platforms. ONNX Runtime is the runtime engine used for model inference and training with ONNX. AWS Graviton3 processors are optimized for ML workloads, including support for bfloat16, Scalable Vector Extension (SVE), and Matrix … Read more

Learn how Amazon Ads created a generative AI-powered image generation capability using Amazon SageMaker

Amazon Ads helps advertisers and brands achieve their business goals by developing innovative solutions that reach millions of Amazon customers at every stage of their journey. At Amazon Ads, we believe that what makes advertising effective is delivering relevant ads in the right context and at the right moment within the consumer buying journey. With that … Read more

RAG architecture with Voyage AI embedding models on Amazon SageMaker JumpStart and Anthropic Claude 3 models

This post is a guest post co-written with Tengyu Ma and Wen Phan from Voyage AI. Organizations today have access to vast amounts of data, much of it proprietary, which holds the potential to unlock valuable insights when used effectively in generative artificial intelligence (AI) applications. Retrieval Augmented Generation (RAG) is a powerful technique designed … Read more

Incorporate offline and online human – machine workflows into your generative AI applications on AWS

Recent advances in artificial intelligence have led to the emergence of generative AI that can produce human-like novel content such as images, text, and audio. These models are pre-trained on massive datasets and, to sometimes fine-tuned with smaller sets of more task specific data. An important aspect of developing effective generative AI application is Reinforcement … Read more

Build generative AI applications with Amazon Titan Text Premier, Amazon Bedrock, and AWS CDK

Amazon Titan Text Premier, the latest addition to the Amazon Titan family of large language models (LLMs), is now generally available in Amazon Bedrock. Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading artificial intelligence (AI) companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and … Read more

AWS DeepRacer enables builders of all skill levels to upskill and get started with machine learning

In today’s technological landscape, artificial intelligence (AI) and machine learning (ML) are becoming increasingly accessible, enabling builders of all skill levels to harness their power. As more companies adopt AI solutions, there’s a growing need to upskill both technical and non-technical teams in responsibly expanding AI usage. Getting hands-on experience is crucial for understanding and … Read more

Transform customer engagement with no-code LLM fine-tuning using Amazon SageMaker Canvas and SageMaker JumpStart

Fine-tuning large language models (LLMs) creates tailored customer experiences that align with a brand’s unique voice. Amazon SageMaker Canvas and Amazon SageMaker JumpStart democratize this process, offering no-code solutions and pre-trained models that enable businesses to fine-tune LLMs without deep technical expertise, helping organizations move faster with fewer technical resources. SageMaker Canvas provides an intuitive … Read more

How LotteON built dynamic A/B testing for their personalized recommendation system

This post is co-written with HyeKyung Yang, Jieun Lim, and SeungBum Shim from LotteON. LotteON is transforming itself into an online shopping platform that provides customers with an unprecedented shopping experience based on its in-store and online shopping expertise. Rather than simply selling the product, they create and let customers experience the product through their … Read more

Unleashing the power of generative AI: Verisk’s journey to an Instant Insight Engine for enhanced customer support

This post is co-written with Tom Famularo, Abhay Shah and Nicolette Kontor from Verisk. Verisk (Nasdaq: VRSK) is a leading data analytics and technology partner for the global insurance industry. Through advanced analytics, software, research, and industry expertise across over 20 countries, Verisk helps build resilience for individuals, communities, and businesses. The company is committed … Read more

Establishing an AI/ML center of excellence

The rapid advancements in artificial intelligence and machine learning (AI/ML) have made these technologies a transformative force across industries. According to a McKinsey study, across the financial services industry (FSI), generative AI is projected to deliver over $400 billion (5%) of industry revenue in productivity benefits. As maintained by Gartner, more than 80% of enterprises … Read more

Build a Hugging Face text classification model in Amazon SageMaker JumpStart

Amazon SageMaker JumpStart provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and machine learning (ML) practitioners get started on training and deploying ML models quickly. You can use these algorithms and models for both supervised and unsupervised learning. They can process various types of input data, including … Read more

How Dialog Axiata used Amazon SageMaker to scale ML models in production with AI Factory and reduced customer churn within 3 months

The telecommunications industry is more competitive than ever before. With customers able to easily switch between providers, reducing customer churn is a crucial priority for telecom companies who want to stay ahead. To address this challenge, Dialog Axiata has pioneered a cutting-edge solution called the Home Broadband (HBB) Churn Prediction Model. This post explores the … Read more

Amazon SageMaker now integrates with Amazon DataZone to streamline machine learning governance

Amazon SageMaker is a fully managed machine learning (ML) service that provides a range of tools and features for building, training, and deploying ML models. Amazon DataZone is a data management service that makes it faster and easier for customers to catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources. Today, … Read more

Boost employee productivity with automated meeting summaries using Amazon Transcribe, Amazon SageMaker, and LLMs from Hugging Face

The prevalence of virtual business meetings in the corporate world, largely accelerated by the COVID-19 pandemic, is here to stay. Based on a survey conducted by American Express in 2023, 41% of business meetings are expected to take place in hybrid or virtual format by 2024. Attending multiple meetings daily and keeping track of all … Read more

How Veritone uses Amazon Bedrock, Amazon Rekognition, Amazon Transcribe, and information retrieval to update their video search pipeline

This post is co-written with Tim Camara, Senior Product Manager at Veritone. Veritone is an artificial intelligence (AI) company based in Irvine, California. Founded in 2014, Veritone empowers people with AI-powered software and solutions for various applications, including media processing, analytics, advertising, and more. It offers solutions for media transcription, facial recognition, content summarization, object … Read more

Information extraction with LLMs using Amazon SageMaker JumpStart

Large language models (LLMs) have unlocked new possibilities for extracting information from unstructured text data. Although much of the current excitement is around LLMs for generative AI tasks, many of the key use cases that you might want to solve have not fundamentally changed. Tasks such as routing support tickets, recognizing customers intents from a … Read more

AWS Inferentia and AWS Trainium deliver lowest cost to deploy Llama 3 models in Amazon SageMaker JumpStart

Today, we’re excited to announce the availability of Meta Llama 3 inference on AWS Trainium and AWS Inferentia based instances in Amazon SageMaker JumpStart. The Meta Llama 3 models are a collection of pre-trained and fine-tuned generative text models. Amazon Elastic Compute Cloud (Amazon EC2) Trn1 and Inf2 instances, powered by AWS Trainium and AWS … Read more

Revolutionize Customer Satisfaction with tailored reward models for your business on Amazon SageMaker

As more powerful large language models (LLMs) are used to perform a variety of tasks with greater accuracy, the number of applications and services that are being built with generative artificial intelligence (AI) is also growing. With great power comes responsibility, and organizations want to make sure that these LLMs produce responses that align with … Read more

Amazon Personalize launches new recipes supporting larger item catalogs with lower latency

Personalized customer experiences are essential for engaging today’s users. However, delivering truly personalized experiences that adapt to changes in user behavior can be both challenging and time-consuming. Amazon Personalize makes it straightforward to personalize your website, app, emails, and more, using the same machine learning (ML) technology used by Amazon, without requiring ML expertise. With … Read more

Get started with Amazon Titan Text Embeddings V2: A new state-of-the-art embeddings model on Amazon Bedrock

Embeddings are integral to various natural language processing (NLP) applications, and their quality is crucial for optimal performance. They are commonly used in knowledge bases to represent textual data as dense vectors, enabling efficient similarity search and retrieval. In Retrieval Augmented Generation (RAG), embeddings are used to retrieve relevant passages from a corpus to provide … Read more