Sophia: Scalable Stochastic 2nd-Order Optimizer for Language Model Pre-Training
April 7, 2024