
Vocabulary of natural language processing


Concept information

Preferred term

self-attention layer  

Definition

  • A layer in transformer-based architectures that lets the model weigh every position in the input sequence when processing each token, so that each token's representation captures contextual relationships and dependencies across the sequence.
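
To make the definition concrete, here is a minimal sketch of one common formulation, single-head scaled dot-product self-attention, in plain Python. The definition above does not prescribe this particular variant, so treat it as an illustrative assumption. The names `self_attention`, `w_q`, `w_k` and `w_v` are hypothetical, and the identity projection matrices are toy values chosen so the weights are easy to inspect.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    tokens: list of token vectors.
    w_q, w_k, w_v: hypothetical projection matrices for queries, keys, values.
    Returns (outputs, weights); weights[i][j] is how much token i attends to token j.
    """
    queries = [matvec(w_q, t) for t in tokens]
    keys = [matvec(w_k, t) for t in tokens]
    values = [matvec(w_v, t) for t in tokens]
    scale = math.sqrt(len(keys[0]))  # scale by sqrt of the key dimension
    dim = len(values[0])

    outputs, weights = [], []
    for q in queries:
        # Compare this token's query with the key of every token in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        row = softmax(scores)
        weights.append(row)
        # The output is a weighted mix of all value vectors.
        outputs.append([sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)])
    return outputs, weights


# Toy input: three 2-dimensional token vectors and identity projections (assumed values).
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, identity, identity, identity)
```

Each row of `weights` sums to 1 and records how strongly one token attends to every token in the sequence, itself included. In a trained model the projection matrices are learned rather than fixed.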

Broader concept

Examples

  • Each encoder has its own self-attention layer and feed-forward layer to process each input separately. (Shin & Lee, 2018)
  • First encoder self-attention layers benefit most from additive window attention while decoder self-attention layers prefer multiplicative attention. (Nguyen, Nguyen, Joty & Li, 2020)
  • Specifically we distill the knowledge from the hidden state of each transformer block and the attention score of each self-attention layer. (Li, Gao, Lei & Xu, 2023)
  • We are motivated to improve the self-attention layer appended to the top of the transformer encoder to enrich the contextualized word representation with information from its neighbors and the relations from the dependency parse trees. (Galitsky, Ilvovsky & Goncharova, 2021)
  • We then apply a self-attention layer to model the guiding effect of ontology knowledge on the extraction of entities and relations from the sentence. (Xiong, Chen, Yunfei & Shengyang, 2023)

Translations

URI

http://data.loterre.fr/ark:/67375/8LP-DMDVS16W-4

Last modified on 13/05/2024