Total relevant papers: 3
Paper selection prompt and criteria at the bottom
Table of contents with paper titles:
DEBATE: Devil's Advocate-Based Assessment and Text Evaluation Authors: Alex Kim, Keonwoo Kim, Sangwon Yoon
Spectral Editing of Activations for Large Language Model Alignment Authors: Yifu Qiu, Zheng Zhao, Yftah Ziser, Anna Korhonen, Edoardo M. Ponti, Shay B. Cohen
Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers Authors: Tuo Zhang, Jinyue Yuan, Salman Avestimehr
DEBATE: Devil's Advocate-Based Assessment and Text Evaluation
ArXiv ID: 2405.09935
Authors: Alex Kim, Keonwoo Kim, Sangwon Yoon
Abstract: As natural language generation (NLG) models have become prevalent, systematically assessing the quality of machine-generated texts has become increasingly important. Recent studies introduce LLM-based evaluators that operate as reference-free metrics, demonstrating their capability to adeptly handle novel tasks. However, these models generally rely on a single-agent approach, which, we argue, introduces an inherent limit to their performance. This is because there exist biases in LLM agent's responses, including preferences for certain text structure or content. In this work, we propose DEBATE, an NLG evaluation framework based on multi-agent scoring system augmented with a concept of Devil's Advocate. Within the framework, one agent is instructed to criticize other agents' arguments, potentially resolving the bias in LLM agent's answers. DEBATE substantially outperforms the previous state-of-the-art methods in two meta-evaluation benchmarks in NLG evaluation, SummEval and TopicalChat. We also show that the extensiveness of debates among agents and the persona of an agent can influence the performance of evaluators.
Comment: Matches criterion 3 by proposing a new paradigm for evaluating NLG through a multi-agent scoring system, which is a novel approach to handling subjectivity in language model evaluation.
Relevance: 8 Novelty: 7
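The devil's-advocate idea in the DEBATE abstract can be pictured with a toy loop. This is only a loose sketch of the general pattern (a scorer revising its rating after a critic objects), not the paper's protocol: the agent roles, the stopping rule, and the names `debate_score`, `toy_scorer`, and `toy_critic` are all assumptions, and the toy functions stand in for LLM calls.

```python
from typing import Callable, List

# Hypothetical agent interfaces (assumptions, not the paper's API).
Scorer = Callable[[str, List[str]], float]  # (text, critiques so far) -> score
Critic = Callable[[str, float], str]        # (text, current score) -> objection, "" if none


def debate_score(text: str, scorer: Scorer, critic: Critic, max_rounds: int = 3) -> float:
    """Score a text, letting a critic challenge the score for up to max_rounds."""
    critiques: List[str] = []
    score = scorer(text, critiques)
    for _ in range(max_rounds):
        objection = critic(text, score)
        if not objection:  # the critic has nothing left to argue
            break
        critiques.append(objection)
        score = scorer(text, critiques)  # re-score in light of all critiques
    return score


def toy_scorer(text: str, critiques: List[str]) -> float:
    # Deliberately biased toward long texts; concedes one point per critique.
    base = min(5.0, len(text.split()) / 2)
    return max(1.0, base - len(critiques))


def toy_critic(text: str, score: float) -> str:
    # Objects while a repetitive text still holds a high score.
    words = text.lower().split()
    repeated = len(words) - len(set(words))
    if repeated > 0 and score > 3.0:
        return f"{repeated} repeated words; length is not quality."
    return ""


if __name__ == "__main__":
    print(debate_score("good " * 10, toy_scorer, toy_critic))
    print(debate_score("the cat sat on a warm mat today", toy_scorer, toy_critic))
```

In the toy run, the length-biased scorer is pulled down on the repetitive text but left alone on the varied one, which mirrors the abstract's claim that a critic can counteract a single agent's bias.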
Spectral Editing of Activations for Large Language Model Alignment
ArXiv ID: 2405.09719
Authors: Yifu Qiu, Zheng Zhao, Yftah Ziser, Anna Korhonen, Edoardo M. Ponti, Shay B. Cohen
Abstract: Large language models (LLMs) often exhibit undesirable behaviours, such as generating untruthful or biased content. Editing their internal representations has been shown to be effective in mitigating such behaviours on top of the existing alignment methods. We propose a novel inference-time editing method, namely spectral editing of activations (SEA), to project the input representations into directions with maximal covariance with the positive demonstrations (e.g., truthful) while minimising covariance with the negative demonstrations (e.g., hallucinated). We also extend our method to non-linear editing using feature functions. We run extensive experiments on benchmarks concerning truthfulness and bias with six open-source LLMs of different sizes and model families. The results demonstrate the superiority of SEA in effectiveness, generalisation to similar tasks, as well as inference and data efficiency. We also show that SEA editing only has a limited negative impact on other model capabilities.
Comment: This paper does not directly match any of the specified criteria but is related to improving the behavior of large language models through a novel method, which might interest someone looking for clever practical tricks in language models.
Relevance: 3 Novelty: 7
Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers
ArXiv ID: 2405.10276
Authors: Tuo Zhang, Jinyue Yuan, Salman Avestimehr
Abstract: Numerous recent works aim to enhance the efficacy of Large Language Models (LLMs) through strategic prompting. In particular, the Optimization by PROmpting (OPRO) approach provides state-of-the-art performance by leveraging LLMs as optimizers where the optimization task is to find instructions that maximize the task accuracy. In this paper, we revisit OPRO for automated prompting with relatively small-scale LLMs, such as LLaMa-2 family and Mistral 7B. Our investigation reveals that OPRO shows limited effectiveness in small-scale LLMs, with limited inference capabilities constraining optimization ability. We suggest future automatic prompting engineering to consider both model capabilities and computational costs. Additionally, for small-scale LLMs, we recommend direct instructions that clearly outline objectives and methodologies as robust prompt baselines, ensuring efficient and effective prompt engineering in ongoing research.
Comment: This paper does not closely match any of the specified criteria. It discusses the limitations of small-scale LLMs in the context of optimization by prompting, which is related to improving LLMs' performance but does not specifically address new methodological improvements to RLHF or instruction-following, test set contamination, evaluating open-ended text generation, real-world usage and safety properties, or scaling laws.
Relevance: 3 Novelty: 5
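For readers unfamiliar with the setup the OPRO abstract refers to, the loop of "an LLM as optimizer searching for instructions that maximize task accuracy" can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the OPRO or the paper's implementation: `propose` stands in for the optimizer LLM, `evaluate` for task accuracy on a scoring set, and the meta-prompt format, `KEYWORDS`, and both toy functions are hypothetical.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, float]]  # (instruction, score) pairs seen so far


def build_meta_prompt(history: History, top_k: int = 3) -> str:
    """List the best instructions so far (ascending score) as optimizer context."""
    best = sorted(history, key=lambda pair: pair[1])[-top_k:]
    shown = [f"text: {instr}\nscore: {score:.2f}" for instr, score in best]
    return "\n\n".join(shown) + "\n\nWrite a new instruction that scores higher."


def optimize_instruction(
    propose: Callable[[str], str],     # optimizer LLM stand-in: meta-prompt -> instruction
    evaluate: Callable[[str], float],  # task accuracy stand-in: instruction -> score
    seed: str,
    steps: int = 5,
) -> Tuple[str, float]:
    history: History = [(seed, evaluate(seed))]
    for _ in range(steps):
        candidate = propose(build_meta_prompt(history))
        history.append((candidate, evaluate(candidate)))
    return max(history, key=lambda pair: pair[1])


# Toy stand-ins so the sketch runs without any model (assumptions only).
KEYWORDS = ("step by step", "check the units", "state the final answer")


def toy_evaluate(instruction: str) -> float:
    return sum(k in instruction.lower() for k in KEYWORDS) / len(KEYWORDS)


def toy_propose(meta_prompt: str) -> str:
    # Extend the highest-scoring instruction shown (the last "text:" line).
    shown = [ln[len("text: "):] for ln in meta_prompt.splitlines() if ln.startswith("text: ")]
    best = shown[-1]
    for k in KEYWORDS:
        if k not in best.lower():
            return f"{best} {k.capitalize()}."
    return best


if __name__ == "__main__":
    print(optimize_instruction(toy_propose, toy_evaluate, "Solve the problem."))
```

The quality of the result depends entirely on how well `propose` uses the scored history, which is the component the abstract reports as the bottleneck when a small-scale LLM fills that role.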
In suggesting papers to your friend, remember that he likes learning about surprising empirical results in language models, as well as clever practical tricks. He does not want to read papers that are primarily about applications of methods to the medical or law domains.