Vocabulary of natural language processing

Concept information

Preferred term

text clustering  

Definition

  • A type of unsupervised learning that consists of grouping texts according to their level of similarity. (Loterre)
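
As a rough illustration of "grouping texts according to their level of similarity", the sketch below represents each text as a bag of word counts, compares texts with cosine similarity, and places two texts in the same cluster when they are linked by a chain of sufficiently similar pairs (single-link clustering at a fixed threshold). This is only one of many possible approaches and is not taken from the sources cited in this entry; the function names, the sample texts, and the threshold of 0.3 are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_texts(texts, threshold=0.3):
    """Single-link clustering at a similarity threshold (illustrative only).

    Two texts end up in the same cluster if they are connected through a
    chain of pairs whose cosine similarity is >= threshold.
    Returns a list of clusters, each a list of indices into `texts`.
    """
    vectors = [Counter(tokenize(t)) for t in texts]
    parent = list(range(len(texts)))  # union-find forest over text indices

    def find(i):
        # Follow parent links to the cluster root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if cosine_similarity(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)  # merge the two clusters

    clusters = {}
    for i in range(len(texts)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical sample texts: two about a cat, two about stock prices.
texts = [
    "the cat sat on the mat",
    "a cat lay on the mat",
    "stock prices fell sharply today",
    "stock markets and prices fell",
]
clusters = cluster_texts(texts)
print(clusters)  # [[0, 1], [2, 3]]
```

No labels are supplied at any point, which is what makes the procedure unsupervised: the groups emerge only from the pairwise similarities. Raw word counts capture surface overlap rather than meaning, so a practical system would typically substitute a richer text representation and a more robust clustering algorithm.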

Broader concept

Alternative labels

  • document clustering

Defining context(s)

  • Text document clustering is the grouping of text documents into semantically related groups or as Hayes puts it "they are grouped because they are likely to be wanted together" (Hayes 1963). (Sedding & Kazakov, 2004)

Example

  • Initially document clustering was developed to improve precision and recall of information retrieval systems. (Sedding & Kazakov, 2004)
  • Moreover the quality of text clustering is intricately related to user preference which is hard to describe using a textual prompt. (Zhang, Zou, Yi & Aw, 2024)
  • Text document clustering can greatly simplify browsing large collections of documents by reorganizing them into a smaller number of manageable clusters. (Sedding & Kazakov, 2004)
  • They conducted clinical document clustering in 17 different clinical domains and showed that note types in a broad clinical scope form the same cluster but note types in a narrow clinical extent form different clusters. (Sohn, Clark, Halgrim, Murphy, Jonnalagadda, Wagholikar, Wu, Chute & Liu, 2013)
  • Toda and Kataoka (2005) use document clustering based on Named Entities to tackle the problem of document retrieval for search results. (Tsekouras, Petasis & Kosmopoulos, 2019)

In other languages

URI

http://data.loterre.fr/ark:/67375/8LP-DK7HGDD5-3

Last modified: 21/5/24