@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix inist: <http://www.inist.fr/Ontology#> .

<http://data.loterre.fr/ark:/67375/8LP-ZXD5P7TL-7>
  skos:prefLabel "machine learning approach"@en, "approche d'apprentissage automatique"@fr ;
  a skos:Concept ;
  skos:narrower <http://data.loterre.fr/ark:/67375/8LP-B7HRCSC1-T> .

<http://data.loterre.fr/ark:/67375/8LP> a owl:Ontology, skos:ConceptScheme .

<http://data.loterre.fr/ark:/67375/8LP-B7HRCSC1-T>
  skos:narrower <http://data.loterre.fr/ark:/67375/8LP-NV6KJVPN-C>, <http://data.loterre.fr/ark:/67375/8LP-Z2DL85DC-R> ;
  skos:prefLabel "reinforcement learning"@en, "apprentissage par renforcement"@fr ;
  skos:example "L'apprentissage par renforcement nécessite donc le calcul de ces récompenses qui se fait généralement à partir d'annotations. (Zhao & Bernard, 2023)"@fr, "Before reinforcement learning (RL) supervised learning (SL) is applied to mimic dialogues provided by a rule-based system. (Wang, Zhang, Li, Zong & Li, 2019)"@en, "During the action sampling step in RL we reduce the search space of actions based on the constitution of the previous word contexts as well as our n-gram model. (Guo, Chang, Yu & Bai, 2018)"@en, "This model is developed using the previous model by adding mixed training of both machine learning and reinforcement learning. (Firdaus, Ekbal & Bhattacharyya, 2020)"@en, "L'apprentissage par renforcement ou reinforcement learning (RL) peut être utilisé pour modéliser le DM en particulier pour apprendre une politique qui maximise la probabilité d'atteindre l'objectif de la conversation. (Thibault Cordier, Fabrice Lefèvre, Tanguy Urvoy & Lina Maria Rojas-Barahona, 2022)"@fr, "Some recent studies have incorporated RL to align LMs with human preference and to prompt LM for problem-solving (see Table 1 for details). (Zhou, Du & Li, 2024)"@en ;
  a skos:Concept ;
  skos:inScheme <http://data.loterre.fr/ark:/67375/8LP> ;
  skos:altLabel "reinforcement machine learning"@en, "RL"@fr, "RL"@en ;
  skos:hiddenLabel "Reinforcement learning"@en, "Apprentissage par renforcement"@fr ;
  skos:definition "A subset of machine learning that allows an AI-driven system (sometimes referred to as an agent) to learn through trial and error using feedback from its actions. This feedback is either negative or positive, signalled as punishment or reward with, of course, the aim of maximising the reward function. (University of York website)"@en, "Procédé d'apprentissage automatique consistant, pour un système autonome, à apprendre les actions à réaliser, à partir d'expériences, de façon à optimiser une récompense quantitative au cours du temps. (CNIL)"@fr ;
  skos:broader <http://data.loterre.fr/ark:/67375/8LP-ZXD5P7TL-7> ;
  dc:modified "2024-05-29T06:35:11"^^xsd:dateTime ;
  inist:definitionalContext "L'apprentissage par renforcement (AR) (Sutton et Barto 1998) est un paradigme général d'apprentissage automatique qui a pour but de résoudre des problèmes de prise de décisions séquentielles. (Daubigney, Geist & Pietquin, 2012)"@fr ;
  skos:exactMatch <https://www.wikidata.org/wiki/Q830687> .

<http://data.loterre.fr/ark:/67375/8LP-Z2DL85DC-R>
  skos:prefLabel "reinforcement learning from human feedback"@en, "apprentissage par renforcement à partir de rétroaction humaine"@fr ;
  a skos:Concept ;
  skos:broader <http://data.loterre.fr/ark:/67375/8LP-B7HRCSC1-T> .

<http://data.loterre.fr/ark:/67375/8LP-NV6KJVPN-C>
  skos:prefLabel "rétroaction"@fr, "feedback"@en ;
  a skos:Concept ;
  skos:broader <http://data.loterre.fr/ark:/67375/8LP-B7HRCSC1-T> .