NLP Vocabulary: speech recognition

applied NLP > speech processing > speech recognition

speech recognition

Conversion, by a functional unit, of a speech signal to a representation of the content of the speech. (ISO/IEC 22989:2022 Information technology — Artificial intelligence — Artificial intelligence concepts and terminology)

Automatic Speech Recognition (ASR) on resource constrained environment is a complex task since most of the State-Of-The-Art models are combination of multilayered convolutional neural network (CNN) and Transformer models which itself requires huge resources such as GPU or TPU for training as well as inference. (Haswani & Mohankumar, 2022)
However currently available STT systems do not recognize word fragments. (Kim, Schwarm & Ostendorf, 2004)
Jaitly et al. (2016) propose an online neural transducer for speech recognition that is conditioned on prefixes. (Ma, Huang, Xiong, Zheng, Liu, Zheng, Zhang, He, Liu, Li, Wu & Wang, 2019)
Since the accuracy of speech recognition is not near to perfect it will cause the natural language misunderstanding. (Yeh & Chu, 2013)
Therefore we made use of the recently-developed cloud-based speech-to-text (STT) service for automatically generating transcriptions that would help annotators do their tasks. (Saito, Takamichi & Saruwatari, 2020)

http://data.loterre.fr/ark:/67375/8LP-K1CXLSF5-M

Last modified: 19/9/24

Loterre