30% drop in o1-preview accuracy when Putnam problems are slightly varied
January 1, 2025