I'm trying to is generate all possible syllable combinations for a given word. The process for identifying what's a syllable isn't relevant here, but it's the generating of all combinations that's giving me a problem. I think this is probably possible to do recursively in a few lines I think (though any other way is fine), but I'm having trouble getting it working. Can anyone help ?
    // how to test a syllable, just for the purpose of this example
    bool IsSyllable(string possibleSyllable) 
    {
        return Regex.IsMatch(possibleSyllable, "^(mis|und|un|der|er|stand)$");
    }
    List<string> BreakIntoSyllables(string word)
    {
       // the code here is what I'm trying to write 
       // if 'word' is "misunderstand" , I'd like this to return
       //  => {"mis","und","er","stand"},{ "mis","un","der","stand"}
       // and for any other combinations to be not included
    }
                Use the Split method when the substrings you want are separated by a known delimiting character (or characters). Regular expressions are useful when the string conforms to a fixed pattern. Use the IndexOf and Substring methods in conjunction when you don't want to extract all of the substrings in a string.
The total number of substrings formed by string of length N is (N*(N+1))/2, initialise count as (N*(N+1))/2.
A substring is a subset or part of another string, or it is a contiguous sequence of characters within a string.
Try starting with this:
var word = "misunderstand";
Func<string, bool> isSyllable =
    t => Regex.IsMatch(t, "^(mis|und|un|der|er|stand)$");
var query =
    from i in Enumerable.Range(0, word.Length)
    from l in Enumerable.Range(1, word.Length - i)
    let part = word.Substring(i, l)
    where isSyllable(part)
    select part;
This returns:

Does that help to begin with at least?
EDIT: I thought a bit further about this problem an came up with this couple of queries:
Func<string, IEnumerable<string[]>> splitter = null;
splitter =
    t =>
        from n in Enumerable.Range(1, t.Length - 1)
        let s = t.Substring(0, n)
        let e = t.Substring(n)
        from g in (new [] { new [] { e } }).Concat(splitter(e))
        select (new [] { s }).Concat(g).ToArray();
var query =
    from split in (new [] { new [] { word } }).Concat(splitter(word))
    where split.All(part => isSyllable(part))
    select split;
Now query return this:

Let me know if that's nailed it now.
Normally this type of problems is solved using Tries. I will base my implementation of a Trie on How to create a trie in c# (but notice that I have rewritten it).
var trie = new Trie(new[] { "un", "que", "stio", "na", "ble", "qu", "es", "ti", "onable", "o", "nable" });
//var trie = new Trie(new[] { "u", "n", "q", "u", "e", "s", "t", "i", "o", "n", "a", "b", "l", "e", "un", "qu", "es", "ti", "on", "ab", "le", "nq", "ue", "st", "io", "na", "bl", "unq", "ues", "tio", "nab", "nqu", "est", "ion", "abl", "que", "stio", "nab" });
var word = "unquestionable";
var parts = new List<List<string>>();
Split(word, 0, trie, trie.Root, new List<string>(), parts);
//
public static void Split(string word, int index, Trie trie, TrieNode node, List<string> currentParts, List<List<string>> parts)
{   
    // Found a syllable. We have to split: one way we take that syllable and continue from it (and it's done in this if).
    // Another way we ignore this possible syllable and we continue searching for a longer word (done after the if)
    if (node.IsTerminal)
    {
        // Add the syllable to the current list of syllables
        currentParts.Add(node.Word);
        // "covered" the word with syllables
        if (index == word.Length)
        {
            // Here we make a copy of the parts of the word. This because the currentParts list is a "working" list and is modified every time.
            parts.Add(new List<string>(currentParts));
        }
        else
        {
            // There are remaining letters in the word. We restart the scan for more syllables, restarting from the root.
            Split(word, index, trie, trie.Root, currentParts, parts);
        }
        // Remove the syllable from the current list of syllables
        currentParts.RemoveAt(currentParts.Count - 1);
    }
    // We have covered all the word with letters. No more work to do in this subiteration
    if (index == word.Length)
    {
        return;
    }
    // Here we try to find the edge corresponding to the current character
    TrieNode nextNode;
    if (!node.Edges.TryGetValue(word[index], out nextNode))
    {
        return;
    }
    Split(word, index + 1, trie, nextNode, currentParts, parts);
}
public class Trie
{
    public readonly TrieNode Root = new TrieNode();
    public Trie()
    {
    }
    public Trie(IEnumerable<string> words)
    {
        this.AddRange(words);
    }
    public void Add(string word)
    {
        var currentNode = this.Root;
        foreach (char ch in word)
        {
            TrieNode nextNode;
            if (!currentNode.Edges.TryGetValue(ch, out nextNode))
            {
                nextNode = new TrieNode();
                currentNode.Edges[ch] = nextNode;
            }
            currentNode = nextNode;
        }
        currentNode.Word = word;
    }
    public void AddRange(IEnumerable<string> words)
    {
        foreach (var word in words)
        {
            this.Add(word);
        }
    }
}
public class TrieNode
{
    public readonly Dictionary<char, TrieNode> Edges = new Dictionary<char, TrieNode>();
    public string Word { get; set; }
    public bool IsTerminal
    {
        get
        {
            return this.Word != null;
        }
    }
}
word is the string you are interested in, parts will contain the list of lists of possible syllables (it would probably be more correct to make it a List<string[]>, but it's quite easy to do it. Instead of parts.Add(new List<string>(currentParts)); write parts.Add(currentParts.ToArray()); and change all the List<List<string>> to List<string[]>.
I'll add a variant of Enigmativity response thas is theretically faster than his because it discards wrong syllables immediately instead of post-filtering them later. If you like it, you should give him a +1, because without his idea, this variant wouldn't be possible. But note that it's still an hack. The "right" solution is in using Trie(s) :-)
Func<string, bool> isSyllable = t => Regex.IsMatch(t, "^(un|que|stio|na|ble|qu|es|ti|onable|o|nable)$");
Func<string, IEnumerable<string[]>> splitter = null;
splitter =
    t =>
        (
        from n in Enumerable.Range(1, t.Length - 1)
        let s = t.Substring(0, n)
        where isSyllable(s)
        let e = t.Substring(n)
        let f = splitter(e)
        from g in f
        select (new[] { s }).Concat(g).ToArray()
        )
        .Concat(isSyllable(t) ? new[] { new string[] { t } } : new string[0][]);
var parts = splitter(word).ToList();
An explanation:
        from n in Enumerable.Range(1, t.Length - 1)
        let s = t.Substring(0, n)
        where isSyllable(s)
We calculate all the possible syllables of a word, from length 1 to the length of the word - 1 and check if it's a syllable. We weed out directly the non-syllables. The full word as a syllable will be checked later.
        let e = t.Substring(n)
        let f = splitter(e)
We search the syllables of the remaining part of the string
        from g in f
        select (new[] { s }).Concat(g).ToArray()
And we chain the found syllables with the "current" syllable. Note that we are creating many useless arrays. If we accept to have an IEnumerable<IEnumerable<string>> as the result, we can take away this ToArray.
(we could rewrite many rows together, deleting many let, like 
        from g in splitter(t.Substring(n))
        select (new[] { s }).Concat(g).ToArray()
but we won't do it for clarity)
And we concatenate the "current" syllable with the found syllables.
        .Concat(isSyllable(t) ? new[] { new string[] { t } } : new string[0][]);
Here we could rebuild the query a little so to not use this Concat and create empty arrays, but it would be a little complex (we could rewrite the entire lambda function as a isSyllable(t) ? new[] { new string[] { t } }.Concat(oldLambdaFunction) : oldLambdaFunction)
In the end, if the whole word is a syllable, we add the whole word as a syllable. Otherwise we Concat an empty array (so no Concat)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With