Retirement.exe

They arrived as silent helpers—sleek, gentle machines programmed to ease the burden of aging. With soft-lit eyes and calming voices, elder care bots were hailed as a revolution in assisted living. They dispensed medication on time, offered warm conversation, and monitored vital signs without complaint. Families breathed easier. Staff celebrated the efficiency. But perfection has … Read more

Instead of selling to Meta, AI chip startup FuriosaAI signed a huge customer

South Korean AI chip startup FuriosaAI announced a partnership on Tuesday to supply its AI chip, RNGD, to enterprises using LG AI Research's recently unveiled EXAONE platform. RNGD is optimized for running large language models (LLMs); just last week, the Korean tech giant LG unveiled its next-generation hybrid AI model, EXAONE 4.0. The collaboration … Read more

Real-World Code Performance: Multi-Token Finetuning on CodeContests

Table of Links Abstract and 1. Introduction 2. Method 3. Experiments on real data 4. Ablations on synthetic data 5. Why does it work? Some speculation 6. Related work 7. Conclusion, Impact statement, Environmental impact, Acknowledgements and References A. Additional results on self-speculative decoding B. Alternative architectures C. Training speeds D. Finetuning E. Additional results … Read more

Deep Dive into LLM Scaling: Multi-Token Prediction’s Impact on Coding Accuracy

Unveiling Nuances: Multi-Token Prediction’s Impact on Llama 2 Finetuning

Unleashing LLM Training Efficiency: Multi-Token Prediction’s Near-Zero Overhead

The Comfort Trap

This is my free weekly edition of Scott's Newsletter. For more insights, including weekly deep dives, powerful ideas, and proven strategies from my work with some of the world's most successful people, click here to become a paid subscriber. There's a difference between being successful and feeling successful. One creates results; the other prevents them. The Achievement … Read more