Tailoring Textual Resources for Evaluation Tasks: Summary

:::info
Author:

(1) Mingda Chen.

:::

Table of Links

Abstract

Acknowledgements

1 INTRODUCTION

1.1 Overview

1.2 Contributions

2 BACKGROUND

2.1 Self-Supervised Language Pretraining

2.2 Naturally-Occurring Data Structures

2.3 Sentence Variational Autoencoder

2.4 Summary

3 IMPROVING SELF-SUPERVISION FOR LANGUAGE PRETRAINING

3.1 Improving Language Representation Learning via Sentence Ordering Prediction

3.2 Improving In-Context Few-Shot Learning via Self-Supervised Training

3.3 Summary

4 LEARNING SEMANTIC KNOWLEDGE FROM WIKIPEDIA

4.1 Learning Entity Representations from Hyperlinks

4.2 Learning Discourse-Aware Sentence Representations from Document Structures

4.3 Learning Concept Hierarchies from Document Categories

4.4 Summary

5 DISENTANGLING LATENT REPRESENTATIONS FOR INTERPRETABILITY AND CONTROLLABILITY

5.1 Disentangling Semantics and Syntax in Sentence Representations

5.2 Controllable Paraphrase Generation with a Syntactic Exemplar

5.3 Summary

6 TAILORING TEXTUAL RESOURCES FOR EVALUATION TASKS

6.1 Long-Form Data-to-Text Generation

6.2 Long-Form Text Summarization

6.3 Story Generation with Constraints

6.4 Summary

7 CONCLUSION

APPENDIX A – APPENDIX TO CHAPTER 3

APPENDIX B – APPENDIX TO CHAPTER 6

BIBLIOGRAPHY

6.4 Summary

In this chapter, we showed that naturally-occurring textual resources can be tailored to build datasets for long-form data-to-text generation, long-form text summarization, and story generation with constraints. For each dataset, we conducted experiments to characterize the challenges it poses. We also proposed new metrics (both automatic and human-evaluation) and models for these tasks to promote research in these directions.

:::info
This paper is available on arxiv under CC 4.0 license.

:::
