How to count the number of spoken syllables in an audio file?

Question

I have many audio files with clean audio containing only spoken Mandarin Chinese. I need to estimate how many syllables are spoken in each file. Is there a tool for OS X, Windows, or Linux that can estimate this?

sample01.wav 15
sample02.wav 8
sample03.wav 5
sample04.wav 1
sample05.wav 18

As there are many files, command-line or batch-capable software is preferred, e.g.:

$ application sample01.wav
15

A solution that uses speech-to-text and then counts the number of characters in the transcript would be suitable too.
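The counting half of that idea can be sketched as follows. This is only a minimal illustration: it assumes the transcript has already been produced by some speech-to-text tool (not shown here), and it relies on the fact that a written Chinese character generally corresponds to one spoken syllable. The function name `count_han_characters` is hypothetical.

```python
def count_han_characters(text):
    """Count Han characters in a transcript.

    Assumption: only the two main CJK Unified Ideographs blocks are checked
    (U+4E00..U+9FFF and Extension A, U+3400..U+4DBF). Punctuation, Latin
    letters, and digits are ignored.
    """
    return sum(
        1
        for ch in text
        if "\u4e00" <= ch <= "\u9fff" or "\u3400" <= ch <= "\u4dbf"
    )


if __name__ == "__main__":
    # Hypothetical transcript; a real one would come from a recognizer.
    print(count_han_characters("你好，世界！"))
```

Note that digits written as numerals (for example "3" instead of "三") are not counted, so the estimate is only as good as the transcript's normalization.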

marsei · Accepted Answer

The automatic segmentation of speech is an active research area, which means that no existing method works perfectly.

In 2009, de Jong and Wempe proposed a method to automatically detect syllables in a human speech signal using Praat. This method compares well with manual segmentation and has been used in many third-party scientific studies. You can find a detailed description of the method in their scientific article (pdf), along with a historical perspective on previously proposed methods. The Praat script itself and a couple of tutorials can be found on a dedicated website (www - speechrate).

You may also be interested in another segmentation algorithm, developed by Harma, that has been implemented in Matlab (Harma Syllable Segmentation).
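For a rough idea of what a batch-capable, command-line counter could look like, here is a minimal Python sketch that counts peaks in the intensity contour of a WAV file. It is not the de Jong and Wempe Praat script or the Harma algorithm: it has no voicing check, and the frame length, the 25 dB floor, and the 2 dB dip are assumptions chosen for illustration. All function names are hypothetical, and only 16-bit PCM WAV input is handled.

```python
import math
import struct
import sys
import wave


def read_mono_samples(path):
    """Read a 16-bit PCM WAV file; return (samples, sample_rate) mixed to mono."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    samples = [
        sum(values[i:i + channels]) / channels
        for i in range(0, len(values) - channels + 1, channels)
    ]
    return samples, rate


def intensity_db(samples, rate, frame_s=0.02):
    """Return the RMS level in dB of consecutive non-overlapping frames."""
    size = max(1, int(rate * frame_s))
    levels = []
    for start in range(0, len(samples) - size + 1, size):
        frame = samples[start:start + size]
        rms = math.sqrt(sum(x * x for x in frame) / size)
        levels.append(20 * math.log10(max(rms, 1.0)))  # clamp silence to 0 dB
    return levels


def count_intensity_peaks(levels, floor_below_max_db=25.0, min_dip_db=2.0):
    """Count peaks above a floor, each confirmed by a following dip.

    A peak is counted once the level falls min_dip_db below it, provided the
    peak lies within floor_below_max_db of the loudest frame. A peak that is
    still rising when the file ends is not counted.
    """
    if not levels:
        return 0
    threshold = max(levels) - floor_below_max_db
    count = 0
    looking_for_peak = True
    current_max = float("-inf")
    current_min = float("inf")
    for level in levels:
        if looking_for_peak:
            if level > current_max:
                current_max = level
            elif level < current_max - min_dip_db:
                if current_max >= threshold:
                    count += 1
                looking_for_peak = False
                current_min = level
        else:
            if level < current_min:
                current_min = level
            elif level > current_min + min_dip_db:
                looking_for_peak = True
                current_max = level
    return count


def estimate_syllables(path):
    """Estimate the syllable count of one WAV file from its intensity peaks."""
    samples, rate = read_mono_samples(path)
    return count_intensity_peaks(intensity_db(samples, rate))


if __name__ == "__main__":
    for wav_path in sys.argv[1:]:
        print(wav_path, estimate_syllables(wav_path))
```

Saved under a name of your choice, it can be run over many files at once and prints one `filename count` pair per line. Expect it to miscount on fast or noisy speech; the thresholds would need tuning against a few hand-counted files.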

Skylion · Answer

You can use formants to help determine this. Each syllable is normally built around a vowel, and vowels show clear formant structure, so counting those regions gives an estimate. Here is more information on formants:

https://en.wikipedia.org/wiki/Formants

Tags:

nlp

speech-recognition

Village
