
Vocabulary of natural language processing


Concept information

Preferred term

reinforcement learning from human feedback  

Synonym(s)

  • RLHF

Example

  • In this study we introduced the Token-Level Continuous Reward (TLCR), a novel reward model aimed at providing detailed token-based continuous rewards for Reinforcement Learning from Human Feedback (RLHF). (Yoon, Yoon, Eom, Han, Nam, Jo, On, Hasegawa-Johnson, Kim & Yoo, 2024)
  • Meanwhile, the performance of RLHF highly relies on the quality of its human preference annotations. (Lou, Zhang & Yin, 2024)
  • One common method to reduce harmful outputs is reinforcement learning with human feedback (RLHF). (Zhan, Fang, Bindu, Gupta, Hashimoto & Kang, 2024)
  • The first step of RLHF is to obtain an initial LM, which is usually trained with the flatten-and-concatenation-based modeling strategy: concatenate the instruction input and all other resources (if they exist) into one input sequence and train the LM to generate the ground-truth output (as we have introduced before). (Lou, Zhang & Yin, 2024)
  • The OpenAI GPT-series adopt RLHF to align the model's preference with human instructions, where feedback supervision plays a big role. (Lou, Zhang & Yin, 2024)
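The flatten-and-concatenation step quoted above can be sketched in plain Python; this is only an illustration of the idea under stated assumptions (function and variable names are hypothetical, not from the cited papers, and no ML framework is used).

```python
# Minimal sketch of the flatten-and-concatenation modeling strategy:
# the instruction input and any extra resources are joined into one
# input sequence, and the ground-truth output is appended as the
# target the initial LM is trained to generate.

def flatten_example(instruction, resources, output, sep="\n"):
    """Concatenate instruction and resources into the model input,
    then append the ground-truth output as the training target."""
    parts = [instruction] + list(resources)
    model_input = sep.join(parts)
    # In real RLHF pipelines the LM would be trained to generate
    # `output` conditioned on `model_input`; here we simply return
    # the full flattened training sequence.
    return model_input + sep + output

seq = flatten_example(
    "Summarize the passage.",
    ["Passage: RLHF aligns language models with human preferences."],
    "RLHF aligns LMs with human preferences.",
)
print(seq)
```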

URI

http://data.loterre.fr/ark:/67375/8LP-Z2DL85DC-R

Created 10/10/24, last modified 10/10/24