Vocabulary of natural language processing

Concept information

Preferred term

CANINE  

Definition

  • A neural encoder that operates directly on character sequences, without explicit tokenization or vocabulary, and a pre-training strategy that operates either directly on characters or optionally uses subwords as a soft inductive bias (Clark et al., 2021).
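To make "operates directly on character sequences" concrete, the following is a minimal sketch of vocabulary-free input encoding: each character is mapped to its Unicode code point, so no tokenizer or vocabulary file is needed. The function name `encode_chars`, the padding id `0`, and the fixed `max_length` are illustrative assumptions, not details taken from Clark et al. (2021).

```python
def encode_chars(text: str, max_length: int = 16, pad_id: int = 0) -> list[int]:
    """Map each character to its Unicode code point, then truncate or pad.

    pad_id and max_length are hypothetical choices made for this sketch.
    """
    ids = [ord(ch) for ch in text][:max_length]      # one id per character
    return ids + [pad_id] * (max_length - len(ids))  # right-pad to a fixed length


if __name__ == "__main__":
    print(encode_chars("naïve", max_length=8))
    # → [110, 97, 239, 118, 101, 0, 0, 0]
```

Because every character yields one id, the resulting sequences are longer than subword sequences for the same text, which is the trade-off the examples below refer to.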

Example

  • Interestingly, we find that CANINE performs poorly compared to mBERT in all three tasks, and the performance gap increases as fewer finetuning data are available. (Sun, Fernandes, Wang & Neubig, 2023)
  • Since the syllables are separated by blank spaces in the melody-lyrics dataset, the lyrics it generates are different from the correctly formatted language that CANINE is used to. (Zhang, Lasocki, Yu & Takasu, 2024)
  • We chose CANINE as it is a widely recognized open-source character-level language model. (Zhang, Lasocki, Yu & Takasu, 2024)
  • While producing longer sequences than mBERT, CANINE does not necessarily incur higher memory or latency costs, as it has fewer parameters than mBERT. (Sun, Fernandes, Wang & Neubig, 2023)

URI

http://data.loterre.fr/ark:/67375/8LP-SNDVCWPW-4

Last modified 5/3/24