Vocabulary of natural language processing

Concept information

Preferred term

self-attention layer  

Definition

  • A layer in transformer-based models that lets the model weigh different parts of the input sequence when processing each token, so that it captures contextual relationships and dependencies within that sequence.
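
The definition above can be illustrated with a minimal sketch. It assumes the scaled dot-product formulation commonly used in transformers, with a single head and no masking; the entry itself does not prescribe a formulation. The names `self_attention`, `w_q`, `w_k`, and `w_v` are hypothetical, and the weights are random rather than learned.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative only).

    x:   (seq_len, d_model) token representations
    w_*: (d_model, d_k) projection matrices (assumed, not from the source)
    """
    q = x @ w_q  # queries: what each token is looking for
    k = x @ w_k  # keys: what each token offers
    v = x @ w_v  # values: the content that gets mixed
    # Similarity of every token with every other token, scaled by sqrt(d_k).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Each row becomes a distribution over the input positions.
    weights = softmax(scores, axis=-1)
    # Each output row is a weighted mix of all value vectors.
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
print(output.shape)   # (4, 8)
print(weights.shape)  # (4, 4)
```

Row i of `weights` shows how strongly token i attends to every token in the sequence, including itself, which is the sense in which the layer "focuses on different parts of the input".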

Broader concept

Example

  • Each encoder has its own self-attention layer and feed-forward layer to process each input separately. (Shin & Lee, 2018)
  • First encoder self-attention layers benefit most from additive window attention while decoder self-attention layers prefer multiplicative attention. (Nguyen, Nguyen, Joty & Li, 2020)
  • Specifically we distill the knowledge from the hidden state of each transformer block and the attention score of each self-attention layer. (Li, Gao, Lei & Xu, 2023)
  • We are motivated to improve the self-attention layer appended to the top of the transformer encoder to enrich the contextualized word representation with information from its neighbors and the relations from the dependency parse trees. (Galitsky, Ilvovsky & Goncharova, 2021)
  • We then apply a self-attention layer to model the guiding effect of ontology knowledge on the extraction of entities and relations from the sentence. (Xiong, Chen, Yunfei & Shengyang, 2023)

In other languages

URI

http://data.loterre.fr/ark:/67375/8LP-DMDVS16W-4

Last modified 13/5/24