Unlocking Generative Power: Multi-Token Prediction for Next-Gen LLMs

Table of Links

Abstract and 1. Introduction
2. Method
3. Experiments on real data
4. Ablations on synthetic data
5. Why does it work? Some speculation
6. Related work
7. Conclusion, Impact statement, Environmental impact, Acknowledgements and References

A. Additional results on self-speculative decoding
B. Alternative architectures
C. Training speeds
D. Finetuning
E. Additional results …

Defining the Frontier: Multi-Token Prediction’s Place in LLM Evolution
