Synchronizing text and audio. Is there an NLP/speech-to-text library to do this?

I would like to synchronize a spoken recording against a known text. Is there a speech-to-text / natural language processing library that would facilitate this? I imagine I'd want to detect word boundaries and compute candidate matches from a dictionary. Most of the questions I've found on SO concern written language.

Desired, but not required:

Open Source
Compatible with American English out-of-the-box
Cross-platform
Thoroughly documented

Edit: I realize this is a very broad, even naive, question, so thanks in advance for your guidance.

What I've found so far:

OpenEars (iOS Sphinx/Flite wrapper)

asked Nov 01 '10 18:11

Justin

1 Answer

Forced Alignment

It sounds like you want to do forced alignment between your audio and the known text.

Pretty much all research- and industry-grade speech recognition systems can do this, since forced alignment is an important part of training a recognition system on data that lacks phone-level alignments between the audio and the transcript.
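To make the idea concrete, here is a toy sketch of what forced alignment computes. It is not how Sphinx or any particular recognizer implements it. It assumes you already have, for every audio frame, a log-likelihood score for each unit of the known transcript (the `scores` matrix, the word list, and the 10 ms frame step below are all made up for illustration; a real system gets the scores from its acoustic model). Because the transcript order is fixed, the only question is where each unit starts and ends, and a small Viterbi-style dynamic program answers that.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def forced_align(scores: List[List[float]]) -> List[Tuple[int, int]]:
    """Return one (first_frame, last_frame) span per transcript unit.

    scores[t][s] is the (hypothetical) log-likelihood that frame t belongs
    to transcript unit s. Units must appear in order, each covering at
    least one frame.
    """
    n_frames, n_units = len(scores), len(scores[0])
    if n_frames < n_units:
        raise ValueError("need at least one frame per transcript unit")

    # best[t][s]: best total score of a path ending in unit s at frame t.
    best = [[NEG_INF] * n_units for _ in range(n_frames)]
    # advanced[t][s]: True if that best path entered unit s exactly at frame t.
    advanced = [[False] * n_units for _ in range(n_frames)]
    best[0][0] = scores[0][0]  # the path must start in the first unit

    for t in range(1, n_frames):
        for s in range(n_units):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else NEG_INF
            if advance > stay:
                best[t][s] = advance + scores[t][s]
                advanced[t][s] = True
            elif stay > NEG_INF:
                best[t][s] = stay + scores[t][s]

    # Walk back from the last frame of the last unit to recover boundaries.
    spans = []
    unit, end = n_units - 1, n_frames - 1
    for t in range(n_frames - 1, 0, -1):
        if advanced[t][unit]:
            spans.append((t, end))
            end, unit = t - 1, unit - 1
    spans.append((0, end))
    spans.reverse()
    return spans


# Made-up scores: 6 frames, 3 transcript words.
words = ["the", "quick", "fox"]
hi, lo = -0.1, -3.0
scores = [
    [hi, lo, lo],
    [hi, lo, lo],
    [lo, hi, lo],
    [lo, hi, lo],
    [lo, hi, lo],
    [lo, lo, hi],
]
spans = forced_align(scores)
frame_ms = 10  # assumed frame step, purely illustrative
for word, (start, end) in zip(words, spans):
    print(f"{word}: {start * frame_ms} ms to {(end + 1) * frame_ms} ms")
```

Real aligners work the same way in spirit but run over phone-level HMM states and usually allow optional silence between words, so treat this only as an illustration of the search, not as a usable aligner.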

Alignment CMUSphinx

The Sphinx4-1.0 beta 5 release of CMU's open source speech recognition system now includes a demo showing how to align a transcript with long speech recordings.

answered Sep 20 '22 05:09

dmcer

Tags: nlp, speech-recognition, pattern-recognition
