Our Favorite Word Is the One That Makes Us the Dumbest

Yesterday, someone reminded me of something I once said… “I don’t believe in love.” Which might sound terrible. Or ridiculously nihilistic. Sometimes I have an intellectual deer-in-the-headlights moment when confronted with how my opinions and outlook on life have changed over time. We were slightly different people a year ago, and we certainly were five or … Read more

Speed up your AI inference workloads with new NVIDIA-powered capabilities in Amazon SageMaker

This post is co-written with Abhishek Sawarkar, Eliuth Triana, Jiahong Liu, and Kshitiz Gupta from NVIDIA. At re:Invent 2024, we are excited to announce new capabilities to speed up your AI inference workloads with NVIDIA accelerated computing and software offerings on Amazon SageMaker. These advancements build upon our collaboration with NVIDIA, which includes adding support … Read more

Unlock cost savings with the new scale down to zero feature in SageMaker Inference

Today at AWS re:Invent 2024, we are excited to announce a new feature for Amazon SageMaker inference endpoints: the ability to scale them down to zero instances. This long-awaited capability is a game changer for our customers running AI and machine learning (ML) inference in the cloud. Previously, SageMaker inference endpoints … Read more
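
For a sense of how this would typically be wired up: SageMaker autoscaling is driven by the Application Auto Scaling service, and scale-to-zero applies to endpoints that serve models through inference components. The sketch below is illustrative only; the resource name and policy numbers are assumptions, not values from the announcement.

```python
import boto3

# A minimal sketch of scale-to-zero, assuming an endpoint that serves a
# model through an inference component. The component name below is a
# placeholder, not a real resource.
aas = boto3.client("application-autoscaling")

resource_id = "inference-component/my-llm-component"  # hypothetical name

# Register the inference component's copy count as a scalable target,
# with a minimum of zero copies so idle capacity can be fully released.
aas.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:inference-component:DesiredCopyCount",
    MinCapacity=0,
    MaxCapacity=4,
)

# Track invocations per copy; with no traffic, the policy can scale the
# copy count back down to zero. A production setup would likely also need
# a policy to scale out *from* zero when new requests arrive.
aas.put_scaling_policy(
    PolicyName="scale-on-usage",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:inference-component:DesiredCopyCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerInferenceComponentInvocationsPerCopy"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```

The essential detail is MinCapacity=0; as the excerpt suggests, provisioned endpoints previously could not drop below a minimum of one instance, so idle endpoints kept accruing cost.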

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI models for inference. This innovation lets you scale your models faster, with up to a 56% reduction in latency when scaling a new model copy and up … Read more

Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – part 1

The generative AI landscape has been rapidly evolving, with large language models (LLMs) at the forefront of this transformation. These models have grown exponentially in size and complexity, with some now containing hundreds of billions of parameters and requiring hundreds of gigabytes of memory. As LLMs continue to expand, AI engineers face increasing challenges in … Read more
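
To make “hundreds of gigabytes” concrete, here is a quick back-of-the-envelope calculation; the parameter counts and network bandwidth are illustrative assumptions, not figures from the post.

```python
# Back-of-the-envelope: why loading LLM weights dominates autoscaling time.
# Parameter counts and bandwidth below are illustrative assumptions.

BYTES_PER_PARAM_FP16 = 2  # 16-bit weights

def weight_size_gb(params_billions: float) -> float:
    """Approximate size of the raw weights in gigabytes."""
    return params_billions * 1e9 * BYTES_PER_PARAM_FP16 / 1e9

def naive_load_seconds(size_gb: float, gbps: float) -> float:
    """Time to stream the weights over a single link of `gbps` gigabits/s."""
    return size_gb * 8 / gbps

for params in (70, 180, 405):
    size = weight_size_gb(params)
    secs = naive_load_seconds(size, gbps=25)
    print(f"{params:>4}B params ≈ {size:,.0f} GB ≈ {secs/60:.1f} min over 25 Gbps")

# Even before any GPU initialization, a 405B-parameter model is ~810 GB of
# FP16 weights -- several minutes of pure transfer time per new instance,
# which is why faster model loading matters so much for autoscaling.
```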

Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – Part 2

In Part 1 of this series, we introduced Amazon SageMaker Fast Model Loader, a new capability in Amazon SageMaker that significantly reduces the time required to deploy and scale large language models (LLMs) for inference. We discussed how this innovation addresses one of the major bottlenecks in LLM deployment: the time required to load massive models … Read more

Agile Is Rigor Mortis as Software’s State Religion

It hasn’t been uncommon in the software engineering world to hear proclamations of the death of Agile from people with a vested interest in the methodology, who claim that unless we show greater belief it will be lost, and that where it fails it merely hasn’t been implemented properly, reminiscent of other claims following … Read more

GOG’s preservation program lets you keep playing games after they’re delisted

GOG has announced that even if games in its recently launched preservation program are delisted from its store, it will maintain compatibility with those games and offer players “a seamless experience and tech support for those titles.” The first games covered are Warcraft I and II, scheduled for delisting on December 13th. That … Read more

Forget Inflation! In the Future, AI Will Naturally Collude to Raise Prices

Imagine asking ChatGPT to help set prices for your business. Now, imagine it quietly coordinating with your competitors’ AIs to drive up prices, all without being told to do so and without you ever seeing it. This isn’t science fiction; it’s happening right now in research labs, and one of the top papers on AImodels.fyi shows how … Read more
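
The research this excerpt alludes to typically pits independent reinforcement-learning pricing agents against each other in a simulated market. Below is a toy sketch in that spirit, not the paper’s actual code: two Q-learning agents repeatedly set prices in a simple duopoly, each observing only last round’s prices. Every parameter (price grid, demand, learning rates) is an illustrative assumption.

```python
import random
from itertools import product

# Toy sketch of algorithmic price collusion: two independent Q-learners
# in a repeated Bertrand-style duopoly. All parameters are illustrative.

PRICES = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
COST = 1.0                 # marginal cost; 1.0 is the competitive benchmark
ALPHA, GAMMA = 0.15, 0.95  # learning rate and discount factor
STATES = list(product(PRICES, PRICES))  # (my last price, rival's last price)

def profit(p_me: float, p_rival: float) -> float:
    """Cheaper firm takes the (linearly falling) market; ties split it."""
    demand = max(0.0, 3.0 - p_me)
    if p_me < p_rival:
        return (p_me - COST) * demand
    if p_me > p_rival:
        return 0.0
    return (p_me - COST) * demand / 2

def make_q():
    return {s: {p: 0.0 for p in PRICES} for s in STATES}

def choose(q, s, eps):
    return random.choice(PRICES) if random.random() < eps else max(q[s], key=q[s].get)

q1, q2 = make_q(), make_q()
state = (random.choice(PRICES), random.choice(PRICES))

for t in range(500_000):
    eps = max(0.01, 0.5 * (0.99999 ** t))       # slowly decaying exploration
    p1 = choose(q1, state, eps)
    p2 = choose(q2, (state[1], state[0]), eps)  # rival sees the mirrored state
    # Standard Q-learning update on each agent's own reward.
    for q, s, a, r, ns in (
        (q1, state, p1, profit(p1, p2), (p1, p2)),
        (q2, (state[1], state[0]), p2, profit(p2, p1), (p2, p1)),
    ):
        best_next = max(q[ns].values())
        q[s][a] += ALPHA * (r + GAMMA * best_next - q[s][a])
    state = (p1, p2)

print(f"final prices: {state} (competitive benchmark: {COST})")
# In published experiments of this shape, the learned policies often
# sustain prices well above the competitive level, enforced implicitly:
# undercutting triggers lower rival prices in the rounds that follow.
```

The memory of last round’s prices is what makes tacit collusion learnable here: it lets each agent effectively punish undercutting, even though neither agent communicates or is instructed to coordinate.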