HyperTransformer: D Dependence On Parameters and Ablation Studies
:::info
This paper is available on arXiv under CC 4.0 license.

Authors:

(1) Andrey Zhmoginov, Google Research & {azhmogin,sandler,mxv}@google.com;

(2) Mark Sandler, Google Research & {azhmogin,sandler,mxv}@google.com;

(3) Max Vladymyrov, Google Research & {azhmogin,sandler,mxv}@google.com.
:::

Table of Links

- Abstract and Introduction
- Problem Setup and Related Work
- HyperTransformer
- Experiments
- Conclusion and References
- A Example of a Self-Attention …