Vocabulary of natural language processing

Concept information

NLP methods and tools > signal processing > language identifier

Preferred term

language identifier  

Definition

  • A piece of software for the automatic recognition of the language of a document. (Adapted from Rajesh et al., Recognizing the languages in WebPages-A framework for NLP, 2013)

Broader concept

  • signal processing

Example

  • However, even the best language identifiers do not give perfect results when dealing with a large number of languages, out-of-domain texts, or short texts. (Jauhiainen, Lindén & Jauhiainen, 2017)
  • State-of-the-art language identifiers obtain high rates in both recall and precision. (Jauhiainen, Lindén & Jauhiainen, 2017)
  • The language identifier "whatlang", created by Brown (2012), obtains 99.2% classification accuracy with smoothing for 65-character test strings when distinguishing between 1100 languages (Brown 2013; Brown 2014). (Jauhiainen, Lindén & Jauhiainen, 2017)
  • The language identifier itself can be utilized as a tool to pre-filter a new database in order to refine the probability table. (Vitale, 1991)
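
Illustration

The examples above mention smoothing and a probability table. As a rough illustration only, the following Python sketch shows one common way such a tool can work: character trigram counts per language, scored with add-alpha smoothing. It is not the method of "whatlang" or of any cited author. The class name NgramLanguageIdentifier, the helper char_ngrams, the alpha parameter, and the three toy training sentences are all assumptions made for this sketch.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased, space-padded string."""
    padded = " " * (n - 1) + text.lower() + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramLanguageIdentifier:
    """Hypothetical toy identifier: per-language trigram counts with add-alpha smoothing."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n
        self.alpha = alpha      # smoothing constant (assumed value, not from the source)
        self.counts = {}        # language -> Counter of n-gram frequencies
        self.totals = {}        # language -> total number of n-grams seen
        self.vocab = set()      # every n-gram seen in any language

    def train(self, language, text):
        grams = char_ngrams(text, self.n)
        self.counts.setdefault(language, Counter()).update(grams)
        self.totals[language] = sum(self.counts[language].values())
        self.vocab.update(grams)

    def score(self, language, text):
        """Smoothed log-probability of the text under one language's n-gram table."""
        counts = self.counts[language]
        # The +1 reserves probability mass for n-grams never seen in training.
        denom = self.totals[language] + self.alpha * (len(self.vocab) + 1)
        return sum(
            math.log((counts[g] + self.alpha) / denom)
            for g in char_ngrams(text, self.n)
        )

    def identify(self, text):
        """Return the trained language with the highest smoothed score."""
        return max(self.counts, key=lambda lang: self.score(lang, text))


# Toy training data (far too small for real use; for illustration only).
identifier = NgramLanguageIdentifier()
identifier.train("en", "the quick brown fox jumps over the lazy dog and the cat sleeps in the house")
identifier.train("fi", "nopea ruskea kettu hyppää laiskan koiran yli ja kissa nukkuu talossa")
identifier.train("fr", "le renard brun rapide saute par dessus le chien paresseux et le chat dort dans la maison")

print(identifier.identify("the house"))
print(identifier.identify("kissa talossa"))
print(identifier.identify("le chien"))
```

With so little training data the sketch only separates inputs that closely resemble its training sentences. Short or out-of-domain inputs are exactly where, as the first example notes, even the best identifiers struggle.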

URI

http://data.loterre.fr/ark:/67375/8LP-FPHMHWZ6-4

Last modified 4/26/24