
Vocabulary of natural language processing


Concept information

Preferred term

reinforcement learning from human feedback  

Synonym(s)

  • RLHF

Example

  • In this study we introduced the Token-Level Continuous Reward (TLCR), a novel reward model aimed at providing detailed token-based continuous rewards for Reinforcement Learning from Human Feedback (RLHF). (Yoon, Yoon, Eom, Han, Nam, Jo, On, Hasegawa-Johnson, Kim & Yoo, 2024)
  • Meanwhile, the performance of RLHF highly relies on the quality of its human preference annotations. (Lou, Zhang & Yin, 2024)
  • One common method to reduce harmful outputs is reinforcement learning with human feedback (RLHF). (Zhan, Fang, Bindu, Gupta, Hashimoto & Kang, 2024)
  • The first step of RLHF is to obtain an initial LM, which is usually trained with the flatten-and-concatenation-based modeling strategy: concatenate the instruction input and all other resources (if they exist) into one input sequence and train the LM to generate the ground-truth output (as we have introduced before). (Lou, Zhang & Yin, 2024)
  • The OpenAI GPT-series adopt RLHF to align the model's preference with human instructions, where feedback supervision plays a big role. (Lou, Zhang & Yin, 2024)
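The flatten-and-concatenation step quoted above can be sketched in plain Python; this is only an illustration of the idea under stated assumptions (function and variable names are hypothetical, not from the cited papers, and no ML framework is used).

```python
# Minimal sketch of the flatten-and-concatenation modeling strategy:
# the instruction input and any extra resources are joined into one
# input sequence, and the ground-truth output is appended as the
# target the initial LM is trained to generate.

def flatten_example(instruction, resources, output, sep="\n"):
    """Concatenate instruction and resources into the model input,
    then append the ground-truth output as the training target."""
    parts = [instruction] + list(resources)
    model_input = sep.join(parts)
    # In real RLHF pipelines the LM would be trained to generate
    # `output` conditioned on `model_input`; here we simply return
    # the full flattened training sequence.
    return model_input + sep + output

seq = flatten_example(
    "Summarize the passage.",
    ["Passage: RLHF aligns language models with human preferences."],
    "RLHF aligns LMs with human preferences.",
)
print(seq)
```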

URI

http://data.loterre.fr/ark:/67375/8LP-Z2DL85DC-R

Created 10/10/24, last modified 10/10/24