Concept information
Preferred term
self-attention layer
Definition
- A layer in transformer-based models that lets the model focus on different parts of the input sequence when processing each token, so that it captures contextual relationships and dependencies within that sequence.
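As a rough illustration of the definition, the sketch below computes single-head self-attention in plain Python. It assumes the scaled dot-product formulation commonly used in transformers, which this entry does not itself specify; the helper names (`softmax`, `matmul`, `self_attention`) and the toy inputs are hypothetical and chosen only for the example.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    # Project every token into query, key and value vectors.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    # Score each token against every token, scaled by sqrt(d_k) (assumed formulation).
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    # Each row of weights says how much one token attends to all tokens.
    weights = [softmax(row) for row in scores]
    # The output for a token is the weighted sum of all value vectors.
    return matmul(weights, v), weights

# Toy input (assumption): 3 tokens with 2-dimensional embeddings,
# and identity projections so the numbers stay easy to follow.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1, and each row of `output` is a context-dependent mixture of all token representations, which is the sense in which the layer "focuses" on different parts of the sequence.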
Broader concept
Example
- Each encoder has its own self-attention layer and feed-forward layer to process each input separately. (Shin & Lee, 2018)
- First encoder self-attention layers benefit most from additive window attention while decoder self-attention layers prefer multiplicative attention. (Nguyen, Nguyen, Joty & Li, 2020)
- Specifically we distill the knowledge from the hidden state of each transformer block and the attention score of each self-attention layer. (Li, Gao, Lei & Xu, 2023)
- We are motivated to improve the self-attention layer appended to the top of the transformer encoder to enrich the contextualized word representation with information from its neighbors and the relations from the dependency parse trees. (Galitsky, Ilvovsky & Goncharova, 2021)
- We then apply a self-attention layer to model the guiding effect of ontology knowledge on the extraction of entities and relations from the sentence. (Xiong, Chen, Yunfei & Shengyang, 2023)
In other languages
-
French
URI
http://data.loterre.fr/ark:/67375/8LP-DMDVS16W-4