:::info
Author:
(1) Mingda Chen.
:::
Table of Links
2.1 Self-Supervised Language Pretraining
2.2 Naturally-Occurring Data Structures
2.3 Sentence Variational Autoencoder
3 IMPROVING SELF-SUPERVISION FOR LANGUAGE PRETRAINING
3.1 Improving Language Representation Learning via Sentence Ordering Prediction
3.2 Improving In-Context Few-Shot Learning via Self-Supervised Training
4 LEARNING SEMANTIC KNOWLEDGE FROM WIKIPEDIA
4.1 Learning Entity Representations from Hyperlinks
4.2 Learning Discourse-Aware Sentence Representations from Document Structures
4.3 Learning Concept Hierarchies from Document Categories
5 DISENTANGLING LATENT REPRESENTATIONS FOR INTERPRETABILITY AND CONTROLLABILITY
5.1 Disentangling Semantics and Syntax in Sentence Representations
5.2 Controllable Paraphrase Generation with a Syntactic Exemplar
6 TAILORING TEXTUAL RESOURCES FOR EVALUATION TASKS
6.1 Long-Form Data-to-Text Generation
6.2 Long-Form Text Summarization
6.3 Story Generation with Constraints
APPENDIX A – APPENDIX TO CHAPTER 3
APPENDIX B – APPENDIX TO CHAPTER 6
CHAPTER 5 – DISENTANGLING LATENT REPRESENTATIONS FOR INTERPRETABILITY AND CONTROLLABILITY
In this chapter, we describe our contributions to disentangling latent representations using naturally-occurring structures of paired data. In Section 5.1, we present a multi-task, latent-variable model that disentangles semantics and syntax in sentence representations. The model leverages the fact that the semantics of a paraphrase pair are shared while the syntax varies. In Section 5.2, we extend this framework to controlling the syntax of generated text. In this controlled generation setting, we propose to use a sentential exemplar to specify the desired syntax. A minimal sketch of the shared-semantics intuition follows below.
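To make the paraphrase-pair intuition concrete, here is a minimal, hypothetical sketch (a toy PyTorch encoder/decoder, not the thesis implementation): each sentence is encoded into a semantic latent and a syntactic latent, and each member of a paraphrase pair is reconstructed from its own syntactic latent but its partner's semantic latent, so only meaning can flow through the swapped variable. All module and function names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DisentangledEncoder(nn.Module):
    """Toy encoder producing a semantic and a syntactic latent per sentence."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, lat_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_semantic = nn.Linear(hid_dim, lat_dim)
        self.to_syntactic = nn.Linear(hid_dim, lat_dim)

    def forward(self, tokens):                      # tokens: (batch, seq_len)
        _, h = self.rnn(self.embed(tokens))         # h: (1, batch, hid_dim)
        h = h.squeeze(0)
        return self.to_semantic(h), self.to_syntactic(h)

class Decoder(nn.Module):
    """Toy non-autoregressive decoder fusing the two latents."""
    def __init__(self, vocab_size, lat_dim=64, hid_dim=256):
        super().__init__()
        self.proj = nn.Linear(2 * lat_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, semantic, syntactic, seq_len):
        h = torch.tanh(self.proj(torch.cat([semantic, syntactic], dim=-1)))
        # Broadcast the fused latent over time steps (purely illustrative).
        return self.out(h).unsqueeze(1).expand(-1, seq_len, -1)

def paraphrase_swap_loss(enc, dec, x1, x2):
    """Reconstruct each sentence from its partner's semantic latent
    and its own syntactic latent."""
    sem1, syn1 = enc(x1)
    sem2, syn2 = enc(x2)
    ce = nn.CrossEntropyLoss()
    logits1 = dec(sem2, syn1, x1.size(1))   # meaning from x2, syntax from x1
    logits2 = dec(sem1, syn2, x2.size(1))   # meaning from x1, syntax from x2
    return (ce(logits1.reshape(-1, logits1.size(-1)), x1.reshape(-1)) +
            ce(logits2.reshape(-1, logits2.size(-1)), x2.reshape(-1)))
```

Because reconstruction of each sentence can only access its partner's semantic latent, that latent is pressured to carry the shared meaning, while sentence-specific syntax must be carried by the other variable; the actual models in Sections 5.1 and 5.2 add further losses and a proper latent-variable formulation on top of this idea.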
The material in this chapter is adapted from Chen et al. (2019d) and Chen et al. (2019c).
:::info
This paper is available on arxiv under CC 4.0 license.
:::