Vocabulary of natural language processing

Concept information

Preferred term

stemmer  

Definition

  • An algorithm used in information retrieval, text mining, machine translation, summarization and classification to reduce words to a basic, standardized form (the stem). (Adapted from Jabbar et al., A comparative review of Urdu stemmers: Approaches and challenges, in Computer Science Review, 2019)

Broader concept

Synonym(s)

  • stemming algorithm

Example

  • Simple truncation may be a good option for other languages where stemmers are not readily available. (Bergsma, Lin & Goebel, 2008)
  • The Lithuanian stemmer eliminates all endings and also some suffixes (but do not touch prefixes). (Kapočiute-Dzikiene, Nøklestad, Johannessen & Krupavičius, 2013)
  • The observations obtained from the experiments allow us to conclude that simple automatic prefix and suffix information capture enough of the Lithuanian language inflection information; therefore it can replace features generated by part-of-speech taggers lemmatizers and stemmers. (Kapočiute-Dzikiene, Nøklestad, Johannessen & Krupavičius, 2013)
  • The stemmer is rule-based thus it can cope with proper names as well as with the other words. (Kapočiute-Dzikiene, Nøklestad, Johannessen & Krupavičius, 2013)
  • We also assess whether it is possible to perform named entity classification effectively without resorting to external grammatical tools such as part-of-speech taggers lemmatizers or stemmers because they have limited availability and are slow and unreliable when identifying proper names in Lithuanian texts. (Kapočiute-Dzikiene, Nøklestad, Johannessen & Krupavičius, 2013)
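
The two approaches mentioned in the examples above, simple truncation and rule-based suffix removal, can be contrasted in a minimal Python sketch. The suffix list, the minimum stem length and the function names are hypothetical choices made for illustration; they are not taken from the cited papers or from any published stemmer.

```python
# Toy illustration only: the suffix list is a hypothetical English sample,
# not the rule set of any published stemmer.
SUFFIXES = ("ations", "ation", "ings", "ing", "ers", "er", "ed", "es", "s")
MIN_STEM_LEN = 3  # assumed threshold to avoid stripping very short words


def truncate_stem(word: str, length: int = 5) -> str:
    """Simple truncation: keep only the first `length` characters."""
    return word.lower()[:length]


def strip_suffix_stem(word: str) -> str:
    """Rule-based stemming: remove the longest matching suffix, never a prefix."""
    word = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        stem = word[: len(word) - len(suffix)]
        if word.endswith(suffix) and len(stem) >= MIN_STEM_LEN:
            return stem
    return word  # no rule applied: the word is returned unchanged


for w in ("stemmers", "stemming", "stemmed"):
    print(w, truncate_stem(w), strip_suffix_stem(w))
```

With this toy rule set, all three sample words reduce to the same form under both functions. Truncation needs no language-specific rules, which is why it may serve as a fallback where no stemmer is available, while the rule-based variant depends entirely on the quality of its suffix list.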

In other languages

URI

http://data.loterre.fr/ark:/67375/8LP-KSHXVMPD-M

Last modified 5/3/24