
Find words and combinations of words that can be spoken the quickest

I'm a big fan of discovering sentences that can be rapped very quickly. For example, "gotta read a little bit of Wikipedia" or "don't wanna wind up in the gutter with a bottle of malt." (George Watsky)

I wanted to write a program in Python that would enable me to find words (or combinations of words) that sound very fast when spoken.

I initially thought that words with a high syllable-to-letter ratio would be the best, but when I wrote a Python program to find those words, I got only very simple words that didn't really sound fast (e.g. "iowa").

So I'm at a loss as to what actually makes words sound fast. Is it the morpheme-to-letter ratio? Is it the number of alternating vowel-consonant pairs?
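One way to test the alternating vowel-consonant idea is to score each word by how often adjacent letters switch between vowel and consonant. This is only a rough sketch: the name `alternation_ratio` is made up for illustration, treating only a, e, i, o, u as vowels is an assumption, and spelling is a crude stand-in for sound.

```python
VOWELS = set("aeiou")  # assumption: 'y' counts as a consonant

def alternation_ratio(word):
    """Fraction of adjacent letter pairs that switch between vowel and consonant."""
    letters = [c for c in word.lower() if c.isalpha()]
    if len(letters) < 2:
        return 0.0
    # Count pairs where exactly one of the two letters is a vowel.
    switches = sum(
        (a in VOWELS) != (b in VOWELS)
        for a, b in zip(letters, letters[1:])
    )
    return switches / (len(letters) - 1)

for w in ["wikipedia", "gutter", "iowa"]:
    print(w, round(alternation_ratio(w), 2))
```

A score near 1 means the letters almost strictly alternate; whether that actually tracks perceived speed is exactly the open question.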

How would you guys go about devising a Python program to solve this problem?

Parseltongue asked Feb 27 '12 03:02


2 Answers

This is just a stab in the dark, as I'm not a linguist (although I have written a voice synthesizer), but the metric that might be useful here is the number of phonemes that make up each word, since the phonemes themselves are going to have approximately the same duration regardless of use. There's an International Phonetic Alphabet chart for English dialects, as well as a nice phonology of English.

A good open-source phonetic dictionary is available from the cmudict project, which has about 130k words.

Here's a really quick stab at a lookup program:

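#!/usr/bin/env python3

# Maps each upper-case word to its phoneme string.
words = {}

# latin-1 can decode any byte, so odd characters in the file cannot raise.
with open("cmudict.0.7a", encoding="latin-1") as f:
    for line in f:
        if line.startswith(";;;"):  # skip the dictionary's comment lines
            continue
        word, _, phonemes = line.partition(" ")
        words[word] = phonemes.strip()

user_input = input("Words: ")

print()
for word in user_input.split():
    # Dictionary keys are upper-case; unknown words get the fallback message.
    phonemes = words.get(word.upper(), "unable to find phonemes for word")
    print("%25s  %s" % (word, phonemes))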

When run:

Words: I support hip hop from the underground up

                    I  AY1
              support  S AH0 P AO1 R T
                  hip  HH IH1 P
                  hop  HH AA1 P
                 from  F R AH1 M
                  the  DH AH0
          underground  AH1 N D ER0 G R AW2 N D
                   up  AH1 P
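To turn those pronunciations into numbers, something like the following could work. It is a sketch, not a tuned metric: `phoneme_stats` is a made-up helper name, the sample entries are copied from the output above, and it relies on the CMUdict convention that vowel phonemes end in a stress digit (0, 1 or 2).

```python
# Pronunciations copied from the lookup output above (CMUdict notation).
SAMPLE = {
    "SUPPORT": "S AH0 P AO1 R T",
    "HIP": "HH IH1 P",
    "HOP": "HH AA1 P",
    "UNDERGROUND": "AH1 N D ER0 G R AW2 N D",
}

def phoneme_stats(pron):
    """Return (phoneme count, syllable count) for a CMUdict pronunciation."""
    phones = pron.split()
    # Vowel phonemes carry a trailing stress digit, and each vowel
    # corresponds to one syllable.
    syllables = sum(p[-1].isdigit() for p in phones)
    return len(phones), syllables

for word, pron in SAMPLE.items():
    n_phones, n_syll = phoneme_stats(pron)
    print("%12s  phonemes=%d  syllables=%d  phonemes/letter=%.2f"
          % (word.lower(), n_phones, n_syll, n_phones / len(word)))
```

With the full dictionary loaded, the same function could be mapped over every entry and the results sorted by whichever ratio seems promising.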

If you want to get super fancy pants about this, there's always the Python Natural Language Toolkit which may have some useful tidbits for you.

Additionally, here's some real-world use. To be fair, I fixed 'stylin' to 'styling', but left 'tellin' in to reveal the deficiency with unknown words. You could probably try a lookup for words ending in in' by subbing a g in for the apostrophe and then dropping the NG phoneme from the result.

                  Yes  Y EH1 S
                  the  DH AH0
               rhythm  R IH1 DH AH0 M
                  the  DH AH0
                rebel  R EH1 B AH0 L
              Without  W IH0 TH AW1 T
                    a  AH0
                pause  P AO1 Z
                  I'm  AY1 M
             lowering  L OW1 ER0 IH0 NG
                   my  M AY1
                level  L EH1 V AH0 L
                  The  DH AH0
                 hard  HH AA1 R D
               rhymer  R AY1 M ER0
                where  W EH1 R
                  you  Y UW1
                never  N EH1 V ER0
                 been  B IH1 N
                  I'm  AY1 M
                   in  IH0 N
                  You  Y UW1
                 want  W AA1 N T
              styling  S T AY1 L IH0 NG
                  you  Y UW1
                 know  N OW1
                 it's  IH1 T S
                 time  T AY1 M
                again  AH0 G EH1 N
                    D  D IY1
                  the  DH AH0
                enemy  EH1 N AH0 M IY0
               tellin  unable to find phonemes for word
                  you  Y UW1
                   to  T UW1
                 hear  HH IY1 R
                   it  IH1 T
                 They  DH EY1
              praised  P R EY1 Z D
              etc...
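The in' fallback mentioned above might be sketched like this. The helper name `lookup` is hypothetical, and swapping the final NG for N (rather than simply deleting it) is my reading of "drop the NG phoneme", so treat that as an assumption. The sample entry is the 'styling' pronunciation from the output above.

```python
def lookup(word, words):
    """Look up `word` in a CMUdict-style dict, with a fallback for dropped g's."""
    key = word.upper()
    if key in words:
        return words[key]
    # stylin / stylin' -> STYLING, then swap the final NG phoneme for N.
    stem = key.rstrip("'")
    if stem.endswith("IN"):
        phones = words.get(stem + "G", "").split()
        if phones and phones[-1] == "NG":
            return " ".join(phones[:-1] + ["N"])
    return None  # still unknown

words = {"STYLING": "S T AY1 L IH0 NG"}
for w in ["styling", "stylin'", "stylin"]:
    print("%25s  %s" % (w, lookup(w, words)))
```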

If this is something you plan on putting some time into, I'd be interested in helping. I think putting 'World's first rapping IDE' on my resume would be hilarious. And if one exists already, world's first Python-based rapping IDE. :p

synthesizerpatel answered Oct 27 '22 10:10


I would say it's a good idea to start by taking the examples you gave, or other ones you like, and running some sort of analysis on them for each of your ideas: e.g. phoneme-to-letter ratio, or whatever else sounds reasonable and that you can calculate. The more samples the better. Hopefully this will give you a good idea of what properties the lines and words you already enjoy share, which should lead you in the right direction.
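A small harness for that kind of comparison could look like the sketch below. The two metrics are placeholders I picked because they need no dictionary, not suggestions that they are the right ones; `compare` and the metric names are made up. The sample lines are the two from the question.

```python
SAMPLE_LINES = [
    "gotta read a little bit of Wikipedia",
    "don't wanna wind up in the gutter with a bottle of malt",
]

def avg_word_length(line):
    """Mean number of characters per whitespace-separated word."""
    words = line.split()
    return sum(len(w) for w in words) / len(words)

def vowel_fraction(line):
    """Share of letters that are a, e, i, o or u."""
    letters = [c for c in line.lower() if c.isalpha()]
    return sum(c in "aeiou" for c in letters) / len(letters)

# Placeholder metrics; swap in any function that maps a line to a number.
METRICS = {"avg_word_length": avg_word_length, "vowel_fraction": vowel_fraction}

def compare(lines, metrics):
    """Return {metric name: [score for each line]}."""
    return {name: [fn(line) for line in lines] for name, fn in metrics.items()}

for name, scores in compare(SAMPLE_LINES, METRICS).items():
    print(name, [round(s, 2) for s in scores])
```

Running the same harness over lines that do not sound fast gives a baseline to compare against.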

Otherwise, my layman's guess is that short vowels (obviously) and hard consonants like 't', some 'p's, hard 'g's, etc., will be best: they make the lines sound staccato and rapid-fire.
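That guess can be checked against CMUdict-style pronunciations with a rough score like this one. Everything here is an assumption for illustration: the name `staccato_score`, reading "hard consonants" as the stop consonants P, B, T, D, K, G, and the particular set of phoneme codes counted as short vowels.

```python
STOPS = {"P", "B", "T", "D", "K", "G"}         # assumption: "hard" = stop consonants
SHORT_VOWELS = {"IH", "EH", "AE", "AH", "UH"}  # assumption: a rough short-vowel set

def staccato_score(pron):
    """Fraction of phonemes that are stops or short vowels."""
    # Strip the stress digit so "IH1" and "IH0" both match "IH".
    phones = [p.rstrip("012") for p in pron.split()]
    if not phones:
        return 0.0
    hits = sum(p in STOPS or p in SHORT_VOWELS for p in phones)
    return hits / len(phones)

# Pronunciations in CMUdict notation for "hip", "underground" and "lowering".
for pron in ["HH IH1 P", "AH1 N D ER0 G R AW2 N D", "L OW1 ER0 IH0 NG"]:
    print("%-28s %.2f" % (pron, staccato_score(pron)))
```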

(wanted to leave this as a comment cause it's not really an answer, but it's too long :)

Alexander Corwin answered Oct 27 '22 10:10