Vocabulary of natural language processing

Concept information

Preferred term

text2vec  

Definition

  • A fast and memory-friendly tool for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), and similarity computation. The package provides a source-agnostic streaming API that lets researchers analyze document collections larger than available RAM. All core functions are parallelized to benefit from multicore machines. (Loterre)

Example

  • All test and training texts were lower-cased and transformed into document-term matrices using the text2vec package for R (Selivanov and Wang 2017). (Caines, Pastrana, Hutchings & Buttery, 2018)
  • Contriever uses the text2vec model (Xu 2023) from Hugging Face to calculate the similarity between memory sentences and questions. (Du, Wang, Zhao, Liang, Wang, Zhong, Wang & Wong, 2024)
  • Text2Vec measuring paper similarity and scientific paper understanding. (Wang, Liu & Wang, 2022)
  • We next used the Text2Vec task to illustrate how our dataset can be used to compare the performance of different pre-trained language models. (Wang, Liu & Wang, 2022)
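
As a rough illustration of the preprocessing described in the first example (lower-casing texts and turning them into a document-term matrix), here is a minimal Python sketch. It is not the text2vec API, which is an R package; the function name `build_dtm`, the whitespace tokenization, and the sample sentences are assumptions made purely for illustration.

```python
from collections import Counter


def build_dtm(texts):
    """Return (vocabulary, rows), where rows[i][j] counts term j in document i.

    Hypothetical helper for illustration only; not part of text2vec.
    """
    # Lower-case each text and split on whitespace (a simplifying assumption;
    # real tokenizers also handle punctuation, stemming, and so on).
    tokenized = [text.lower().split() for text in texts]

    # A sorted vocabulary gives a stable column order for the matrix.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})

    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


vocabulary, dtm = build_dtm(["The cat sat", "the dog sat down"])
print(vocabulary)
for row in dtm:
    print(row)
```

Each row of `dtm` corresponds to one document and each column to one vocabulary term. A dense list of lists is used here only for readability; a matrix for a large collection would normally be stored in a sparse format.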

URI

http://data.loterre.fr/ark:/67375/8LP-TQ8DL36S-3

Last modified 5/3/24