The Training Technique That Teaches AI to Think, Not Memorize
Table of Links

Abstract and 1. Introduction
    1.1 Syllogisms composition
    1.2 Hardness of long compositions
    1.3 Hardness of global reasoning
    1.4 Our contributions
2. Results on the local reasoning barrier
    2.1 Defining locality and auto-regressive locality
    2.2 Transformers require low locality: formal results
    2.3 Agnostic scratchpads cannot break the locality
3. Scratchpads to break the locality
    3.1 …