Few-shot In-Context Preference Learning Using Large Language Models: Full Prompts and ICPL Details

Table of Links

A. Appendix

A.1 Full Prompts and A.2 ICPL Details

For more information and videos, see https://sites.google.com/view/few-shot-icpl/home.

Prompt 1: Initial System Prompts of Synthesizing Reward Functions

Prompt 2: Feedback Prompts

Prompt 3: Prompts of Tips for Writing Reward Functions

Prompt 4: Prompts of Describing Differences

The full pseudocode of ICPL is given in Algorithm 2.
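The pseudocode itself is not reproduced in this section. As a rough orientation only, the loop implied by the prompt roles listed above (synthesizing reward functions, then feeding preferences back) might be sketched as follows. This is not the authors' algorithm: the function `icpl_sketch`, its three callables, and the feedback-prompt wording are all hypothetical placeholders, and the real prompts, candidate counts, and training setup are assumptions left to the caller.

```python
from typing import Callable, List, Tuple


def icpl_sketch(
    propose_rewards: Callable[[str, int], List[str]],
    train_and_render: Callable[[str], str],
    pick_best_and_worst: Callable[[List[str]], Tuple[int, int]],
    task_description: str,
    num_candidates: int = 4,
    num_iterations: int = 3,
) -> str:
    """Hypothetical preference-feedback loop; returns the last preferred reward code.

    propose_rewards:     stands in for an LLM call returning reward-function source strings.
    train_and_render:    stands in for policy training plus video rendering.
    pick_best_and_worst: stands in for a human choosing (best_index, worst_index).
    """
    prompt = task_description
    best_reward = ""
    for _ in range(num_iterations):
        # 1. Ask the model for several candidate reward functions.
        candidates = propose_rewards(prompt, num_candidates)
        # 2. Train one policy per candidate and render its behavior.
        videos = [train_and_render(code) for code in candidates]
        # 3. Collect a preference over the rendered behaviors.
        best_idx, worst_idx = pick_best_and_worst(videos)
        best_reward = candidates[best_idx]
        # 4. Build the next prompt from the preferred and least preferred candidates.
        prompt = (
            f"{task_description}\n"
            f"Preferred reward function:\n{best_reward}\n"
            f"Least preferred reward function:\n{candidates[worst_idx]}\n"
        )
    return best_reward
```

Passing the three steps in as callables keeps the sketch independent of any particular LLM client, simulator, or labeling interface, so each can be swapped for a stub when experimenting.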

:::info
Authors:

(1) Chao Yu, Tsinghua University;

(2) Hong Lu, Tsinghua University;

(3) Jiaxuan Gao, Tsinghua University;

(4) Qixin Tan, Tsinghua University;

(5) Xinting Yang, Tsinghua University;

(6) Yu Wang, Tsinghua University (equal advising);

(7) Yi Wu, Tsinghua University and the Shanghai Qi Zhi Institute (equal advising);

(8) Eugene Vinitsky, New York University (equal advising) (zoeyuchao@gmail.com).

:::

:::info
This paper is available on arXiv under a CC 4.0 license.

:::