Concept information
Preferred term
reinforcement learning from human feedback
Broader concept
Synonym(s)
- RLHF
Example
- In this study, we introduced Token-Level Continuous Reward (TLCR), a novel reward model aimed at providing detailed token-level continuous rewards for Reinforcement Learning from Human Feedback (RLHF). (Yoon, Yoon, Eom, Han, Nam, Jo, On, Hasegawa-Johnson, Kim & Yoo, 2024)
- Meanwhile, the performance of RLHF relies heavily on the quality of its human preference annotations. (Lou, Zhang & Yin, 2024)
- One common method to reduce harmful outputs is reinforcement learning from human feedback (RLHF). (Zhan, Fang, Bindu, Gupta, Hashimoto & Kang, 2024)
- The first step of RLHF is to obtain an initial LM, which is usually trained with the flatten-and-concatenation-based modeling strategy: concatenate the instruction input and all other resources (if they exist) into one input sequence and train the LM to generate the ground-truth output (as we have introduced before). (Lou, Zhang & Yin, 2024); a minimal sketch of this concatenation step follows the example list.
- The OpenAI GPT series adopts RLHF to align the model's preferences with human instructions, where feedback supervision plays a major role. (Lou, Zhang & Yin, 2024)
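The fourth example above describes the supervised step that precedes RLHF's reward modeling and policy optimization: flatten the instruction and any accompanying resources into a single sequence and train the LM to produce the ground-truth output. The sketch below illustrates that idea only; it assumes a hypothetical `tokenize` function and a language model `lm` returning per-token logits, and is not the cited authors' implementation.

```python
import torch
import torch.nn.functional as F

def build_example(instruction, resources, ground_truth, tokenize):
    """Flatten instruction + optional resources + ground-truth output into one
    token sequence, masking the loss so only output tokens are supervised."""
    prompt = instruction + "\n" + "\n".join(resources)   # flatten all inputs
    prompt_ids = tokenize(prompt)                        # hypothetical tokenizer
    output_ids = tokenize(ground_truth)
    input_ids = prompt_ids + output_ids
    labels = [-100] * len(prompt_ids) + output_ids       # -100 = ignored by loss
    return torch.tensor(input_ids), torch.tensor(labels)

def sft_loss(lm, input_ids, labels):
    """Next-token prediction loss over the ground-truth output span only."""
    logits = lm(input_ids.unsqueeze(0))                  # assumed shape (1, T, vocab)
    return F.cross_entropy(logits[0, :-1], labels[1:], ignore_index=-100)
```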
In other languages
URI
http://data.loterre.fr/ark:/67375/8LP-Z2DL85DC-R