How to count the number of spoken syllables in an audio file?

I have many audio files containing clean audio of a single voice speaking Mandarin Chinese. I need an estimate of how many syllables are spoken in each file. Is there a tool for OS X, Windows, or Linux that can estimate this? For example:

sample01.wav 15
sample02.wav 8
sample03.wav 5
sample04.wav 1
sample05.wav 18

As there are many files, command-line or batch-capable software is preferred, e.g.:

$ application sample01.wav
15
  • A solution that uses speech-to-text and then counts the number of characters in the transcript would also be suitable, since each Chinese character generally corresponds to one spoken syllable.
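The counting half of that idea can be sketched in a few lines. This is only an illustration: it assumes the transcript has already been produced by some speech-to-text tool (none is specified here), and the function name `count_syllables_from_transcript` is made up for the example. It counts Han characters only, so punctuation, Latin letters, and digits are ignored.

```python
import re

# CJK Unified Ideographs (basic block plus Extension A). Punctuation,
# Latin letters, and digits fall outside these ranges and are not counted.
HAN_CHAR = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def count_syllables_from_transcript(transcript: str) -> int:
    """Approximate the syllable count as the number of Han characters."""
    return len(HAN_CHAR.findall(transcript))


print(count_syllables_from_transcript("你好，世界！"))  # prints 4
```

The estimate would be off wherever the transcript writes numbers as digits or where a character such as the erhua suffix 儿 does not form its own syllable.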
Village asked Dec 12 '22 10:12


2 Answers

Automatic segmentation of speech is an active research area, which means that no method works perfectly yet.

In 2009, de Jong and Wempe proposed a method for automatically detecting syllables in a speech signal using Praat. The method compares well with manual segmentation and has been used in many third-party scientific studies. Their scientific article (pdf) gives a detailed description of the method, along with a historical perspective on previously proposed methods. The Praat script itself and a couple of tutorials can be found on a dedicated website (www - speechrate).
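To give a feel for the general idea, here is a rough Python sketch that counts peaks in the intensity contour that are separated by dips. It is not the authors' Praat script: it is a simplified stand-in that omits the voicing (pitch) check the published method relies on, and the 25 dB silence threshold, the 2 dB minimum dip, the frame sizes, and all function names are assumptions chosen for illustration. It uses only the standard library and expects 16-bit PCM WAV files.

```python
import math
import struct
import sys
import wave


def read_mono_wav(path):
    """Read a 16-bit PCM WAV file; return (samples, rate) for the first channel."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [v / 32768.0 for v in values[::channels]], rate


def intensity_db(samples, rate, frame_s=0.02, hop_s=0.01):
    """Mean power in dB for overlapping frames (20 ms window, 10 ms hop)."""
    if not samples:
        return []
    frame = max(1, int(rate * frame_s))
    hop = max(1, int(rate * hop_s))
    contour = []
    for start in range(0, max(1, len(samples) - frame + 1), hop):
        chunk = samples[start:start + frame]
        power = sum(x * x for x in chunk) / len(chunk)
        contour.append(10.0 * math.log10(power + 1e-12))
    return contour


def count_syllable_nuclei(samples, rate, silence_db=25.0, min_dip_db=2.0):
    """Count intensity peaks that are separated by sufficiently deep dips."""
    contour = intensity_db(samples, rate)
    if len(contour) < 3:
        return 0
    floor = max(contour) - silence_db  # quieter peaks are treated as silence
    count = 0
    last_peak = None  # index of the most recent counted peak
    for i in range(1, len(contour) - 1):
        is_peak = contour[i - 1] < contour[i] >= contour[i + 1]
        if not is_peak or contour[i] < floor:
            continue
        if last_peak is not None:
            dip = min(contour[last_peak:i + 1])
            separated = (contour[last_peak] - dip >= min_dip_db
                         and contour[i] - dip >= min_dip_db)
            if not separated:
                # Same hump as the previous peak: keep the higher of the two.
                if contour[i] > contour[last_peak]:
                    last_peak = i
                continue
        count += 1
        last_peak = i
    return count


def main(paths):
    for path in paths:
        samples, rate = read_mono_wav(path)
        print(path, count_syllable_nuclei(samples, rate))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a name of your choosing, the script takes one or more WAV paths on the command line and prints one count per file. Expect the thresholds to need tuning against a few hand-counted recordings before the numbers can be trusted.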

You may also be interested in another segmentation algorithm, developed by Harma and implemented in Matlab (Harma Syllable Segmentation).

marsei answered Feb 01 '23 06:02


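You can use formants to estimate this. Each syllable normally contains a vowel nucleus, and vowels show up as distinct formant patterns in the spectrum, so counting those nuclei approximates the number of syllables. Here is more information on formants: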

https://en.wikipedia.org/wiki/Formants

Skylion answered Feb 01 '23 07:02