Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find all the possible words using adjacent letters in a matrix

I have the following test matrix:

a l i
g t m
j e a

I intend to create an algorithm that helps me find every possible word from a given minimum length to a maximum length using adjacent letters only.

For example:

Minimum: 3 letters

Maximum: 6 letters

Based on the test matrix, I should have the following results:

  • ali
  • alm
  • alg
  • alt
  • ati
  • atm
  • atg
  • ...
  • atmea

etc.

I created a test code (C#) that has a custom class which represents the letters.

Each letter knows its neighbors and has a generation counter (for keeping track of them during traversal).

Here is its code:

public class Letter
{
    public int X { get; set; }
    public int Y { get; set; }

    public char Character { get; set; }

    public List<Letter> Neighbors { get; set; }

    public Letter PreviousLetter { get; set; }

    public int Generation { get; set; }

    public Letter(char character)
    {
        Neighbors = new List<Letter>();
        Character = character;
    }

    public void SetGeneration(int generation)
    {
        foreach (var item in Neighbors)
        {
            item.Generation = generation;
        }
    }
}

I figured out that if I want it to be dynamic, it has to be based on recursion.

Unfortunately, the following code creates the first 4 words, then stops. It is no wonder, as the recursion is stopped by the specified generation level.

The main problem is that the recursion returns only one level but it would be better to return to the starting point.

 private static void GenerateWords(Letter input, int maxLength, StringBuilder sb)
    {
        if (input.Generation >= maxLength)
        {               
            if (sb.Length == maxLength)
            {
                allWords.Add(sb.ToString());
                sb.Remove(sb.Length - 1, 1);
            }                
            return;
        }
        sb.Append(input.Character);
        if (input.Neighbors.Count > 0)
        {
            foreach (var child in input.Neighbors)
            {
                if (input.PreviousLetter == child)
                    continue;
                child.PreviousLetter = input;
                child.Generation = input.Generation + 1;
                GenerateWords(child, maxLength, sb);
            }
        }
    }

So, I feel a little stuck, any idea how I should proceed?

like image 907
Nestor Avatar asked Dec 29 '16 00:12

Nestor


2 Answers

From here, you can treat this as a graph traversal problem. You start at each given letter, finding each path of length min_size to max_size, with 3 and 6 as those values in your example. I suggest a recursive routine that builds the words as paths through the grid. This will look something like the following; replace types and pseudo-code with your preferences.

<array_of_string> build_word(size, current_node) {
    if (size == 1)  return current_node.letter as an array_of_string;
    result = <empty array_of_string>
    for each next_node in current_node.neighbours {
        solution_list = build_word(size-1, next_node);
        for each word in solution_list {
             // add current_node.letter to front of that word.
             // add this new word to the result array
        }
    }
    return the result array_of_string
}

Does that move you toward a solution?

like image 113
Prune Avatar answered Nov 18 '22 21:11

Prune


When solving these kind of problems, I tend to use immutable classes because everything is so much easier to reason about. The following implementation makes use of a ad hoc ImmutableStack because its pretty straightforward to implement one. In production code I'd probably want to look into System.Collections.Immutable to improve performance (visited would be an ImmutableHashSet<> to point out the obvious case).

So why do I need an immutable stack? To keep track of the current character path and visited "locations" inside the matrix. Because the selected tool for the job is immutable, sending it down recursive calls is a no brainer, we know it can't change so I don't have to worry about my invariants in every recursion level.

So lets implement an immutable stack.

We'll also implement a helper class Coordinates that encapsulates our "locations" inside the matrix, will give us value equality semantics and a convenient way to obtain valid neighbors of any given location. It will obviously come in handy.

public class ImmutableStack<T>: IEnumerable<T>
{
    private readonly T head;
    private readonly ImmutableStack<T> tail;

    public static readonly ImmutableStack<T> Empty = new ImmutableStack<T>(default(T), null);
    public int Count => this == Empty ? 0 : tail.Count + 1;

    private ImmutableStack(T head, ImmutableStack<T> tail)
    {
        this.head = head;
        this.tail = tail;
    }

    public T Peek()
    {
        if (this == Empty)
            throw new InvalidOperationException("Can not peek an empty stack.");

        return head;
    }

    public ImmutableStack<T> Pop()
    {
        if (this == Empty)
            throw new InvalidOperationException("Can not pop an empty stack.");

        return tail;
    }

    public ImmutableStack<T> Push(T value) => new ImmutableStack<T>(value, this);

    public IEnumerator<T> GetEnumerator()
    {
        var current = this;

        while (current != Empty)
        {
            yield return current.head;
            current = current.tail;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

struct Coordinates: IEquatable<Coordinates>
{
    public int Row { get; }
    public int Column { get; }

    public Coordinates(int row, int column)
    {
        Row = row;
        Column = column;
    }

    public bool Equals(Coordinates other) => Column == other.Column && Row == other.Row;
    public override bool Equals(object obj)
    {
        if (obj is Coordinates)
        {
            return Equals((Coordinates)obj);
        }

        return false;
    }

    public override int GetHashCode() => unchecked(27947 ^ Row ^ Column);

    public IEnumerable<Coordinates> GetNeighbors(int rows, int columns)
    {
        var increasedRow = Row + 1;
        var decreasedRow = Row - 1;
        var increasedColumn = Column + 1;
        var decreasedColumn = Column - 1;
        var canIncreaseRow = increasedRow < rows;
        var canIncreaseColumn = increasedColumn < columns;
        var canDecreaseRow = decreasedRow > -1;
        var canDecreaseColumn = decreasedColumn > -1;

        if (canDecreaseRow)
        {
            if (canDecreaseColumn)
            {
                yield return new Coordinates(decreasedRow, decreasedColumn);
            }

            yield return new Coordinates(decreasedRow, Column);

            if (canIncreaseColumn)
            {
                yield return new Coordinates(decreasedRow, increasedColumn);
            }
        }

        if (canIncreaseRow)
        {
            if (canDecreaseColumn)
            {
                yield return new Coordinates(increasedRow, decreasedColumn);
            }

            yield return new Coordinates(increasedRow, Column);

            if (canIncreaseColumn)
            {
                yield return new Coordinates(increasedRow, increasedColumn);
            }
        }

        if (canDecreaseColumn)
        {
            yield return new Coordinates(Row, decreasedColumn);
        }

        if (canIncreaseColumn)
        {
            yield return new Coordinates(Row, increasedColumn);
        }
    }
}

Ok, now we need a method that traverses the matrix visiting each position once returning words that have a specified minimum number of characters and don't exceed a specified maximum.

public static IEnumerable<string> GetWords(char[,] matrix,
                                           Coordinates startingPoint,
                                           int minimumLength,
                                           int maximumLength)

That looks about right. Now, when recursing we need to keep track of what characters we've visited, That's easy using our immutable stack, so our recursive method will look like:

static IEnumerable<string> getWords(char[,] matrix,
                                    ImmutableStack<char> path,
                                    ImmutableStack<Coordinates> visited,
                                    Coordinates coordinates,
                                    int minimumLength,
                                    int maximumLength)

Now the rest is just plumbing and connecting the wires:

public static IEnumerable<string> GetWords(char[,] matrix,
                                           Coordinates startingPoint,
                                           int minimumLength,
                                           int maximumLength)
    => getWords(matrix,
                ImmutableStack<char>.Empty,
                ImmutableStack<Coordinates>.Empty,
                startingPoint,
                minimumLength,
                maximumLength);


static IEnumerable<string> getWords(char[,] matrix,
                                    ImmutableStack<char> path,
                                    ImmutableStack<Coordinates> visited,
                                    Coordinates coordinates,
                                    int minimumLength,
                                    int maximumLength)
{
    var newPath = path.Push(matrix[coordinates.Row, coordinates.Column]);
    var newVisited = visited.Push(coordinates);

    if (newPath.Count > maximumLength)
    {
        yield break;
    }
    else if (newPath.Count >= minimumLength)
    {
        yield return new string(newPath.Reverse().ToArray());
    }

    foreach (Coordinates neighbor in coordinates.GetNeighbors(matrix.GetLength(0), matrix.GetLength(1)))
    {
        if (!visited.Contains(neighbor))
        {
            foreach (var word in getWords(matrix,
                                          newPath,
                                          newVisited,
                                          neighbor,
                                          minimumLength,
                                          maximumLength))
            {
                yield return word;
            }
        }
    }
}

And we're done. Is this the most elegant or fastest algorithm? Probably not, but I find it the most understandable and therefore maintainable. Hope it helps you out.

UPDATE Based upon comments below, I've run a few test cases one of which is:

var matrix = new[,] { {'a', 'l'},
                      {'g', 't'} };
var words = GetWords(matrix, new Coordinates(0,0), 2, 4);
Console.WriteLine(string.Join(Environment.NewLine, words.Select((w,i) => $"{i:00}: {w}")));

And the outcome is the expected:

00: ag
01: agl
02: aglt
03: agt
04: agtl
05: at
06: atl
07: atlg
08: atg
09: atgl
10: al
11: alg
12: algt
13: alt
14: altg
like image 25
InBetween Avatar answered Nov 18 '22 21:11

InBetween