FrontierMath: A benchmark for evaluating advanced mathematical reasoning in AI
November 9, 2024