Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Given a set of words, how can you identify "n" set of letters that will help you make the maximum number of complete words from the original list?

Example:

n = 9
words = {Bike, Tire, Fuel, Biker, Filter, Trike}
output = {B,T,I,K,E,F,U,L,R}

(Order of output is not important. What is important to note is that given a word like FOO, one could not use F,O as the alphabet, but always needed F,O,O. Similar alphabets are treated separately)

What would be the most efficient algorithm to solve this ?
I am thinking along the lines of using the frequency of each character but that does not seem to help much.

like image 686
Vinay Gaba Avatar asked Jul 20 '15 18:07

Vinay Gaba


People also ask

What are the words that you find from the group of letters in any order?

Any word or phrase that exactly reproduces the letters in another order is an anagram. Someone who creates anagrams may be called an "anagrammatist", and the goal of a serious or skilled anagrammatist is to produce anagrams that reflect or comment on their subject.

How many different 5-letter words sequences are there with no repeated letters formed from the 26 letter alphabet?

Using the 26 English letters, the number of 5-letter words that can be made if the letters are distinct is determined as follows: 26P5=26×25×24×23×22=7893600 different words.


2 Answers

EDIT: This was updated for the edited question. See the revision history for details.

Based on the comments, one has to assume (or at least consider the possibility) that this is in fact an NP-complete problem. So until someone prooves or disprooves the actual complexity of this problem, here is a brute-force solution that should at least compute the right output.

EDIT 2.0: As shapiro.yaacov pointed out in his answer, it is indeed NP-complete

It uses some utility class to compute all combinations of a certain number of letters from the initial set of all words. As there are n^k combinations of k letters (given the initial set of n letters), this is clearly not "efficient" in the sense of a polynomial-time solution - but it is not yet clear whether such a solution exists at all.

In order to verify the output in view of the point mentioned in the edited question (namely, that letters had to appear in the resulting list as often as they appeared in the word), I used an example input with words where letters are repeated:

"BIKE", "BIKER", "TRIKE", "BEER", DEER", "SEED", "FEED"

For this input, the program prints

0 letters: [], created words: []
1 letters: [B], created words: []
2 letters: [B, B], created words: []
3 letters: [B, B, B], created words: []
4 letters: [B, E, E, R], created words: [BEER]
5 letters: [B, D, E, E, R], created words: [BEER, DEER]
6 letters: [B, D, E, E, F, R], created words: [BEER, DEER, FEED]
7 letters: [B, D, E, E, F, R, S], created words: [BEER, DEER, SEED, FEED]
8 letters: [B, D, E, E, F, I, K, R], created words: [BIKE, BIKER, BEER, DEER, FEED]

Maybe it can be considered as being helpful, maybe as a starting point or building block for others.

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.TreeSet;

public class MaximizeWords
{
    public static void main(String[] args)
    {
        List<String> words = Arrays.asList(
            "BIKE",
            "BIKER",
            "TRIKE",

            "BEER",
            "DEER",
            "SEED",
            "FEED"
        );

        List<Character> allLetters = 
            new ArrayList<Character>(allLettersOf(words));
        for (int n=0; n<=8; n++)
        {
            CombinationIterable<Character> combinations =
                new CombinationIterable<Character>(n, allLetters);

            List<Solution> solutions = new ArrayList<Solution>();
            for (List<Character> combination : combinations)
            {
                Collections.sort(combination);
                Solution solution = new Solution(words, combination);
                solutions.add(solution);
            }
            Solution bestSolution = Collections.max(solutions, 
                new Comparator<Solution>()
            {
                @Override
                public int compare(Solution s0, Solution s1)
                {
                    return Integer.compare(
                        s0.createdWords.size(), s1.createdWords.size());
                }
            });
            System.out.println(bestSolution);
        }
    }

    static class Solution
    {
        List<Character> letters;
        List<String> createdWords;

        public Solution(List<String> words, List<Character> letters)
        {
            this.letters = letters;
            this.createdWords = computeCreatedWords(words, letters);
        }

        @Override
        public String toString()
        {
            return letters.size() + " letters: " + letters
                + ", created words: " + createdWords;
        }
    }

    private static List<String> computeCreatedWords(
        List<String> words, List<Character> letters)
    {
        List<String> createdWords = new ArrayList<String>();
        for (String word : words)
        {
            if (creates(letters, word))
            {
                createdWords.add(word);
            }
        }
        return createdWords;
    }

    private static boolean creates(List<Character> letters, String word)
    {
        List<Character> copy = new ArrayList<Character>(letters);
        for (int i=0; i<word.length(); i++)
        {
            Character c = Character.valueOf(word.charAt(i));
            if (!copy.remove(c))
            {
                return false;
            }
        }
        return true;
    }


    private static List<Character> lettersOf(String word)
    {
        List<Character> letters = new ArrayList<Character>();
        for (int i=0; i<word.length(); i++)
        {
            letters.add(Character.valueOf(word.charAt(i)));
        }
        return letters;
    }

    private static Set<Character> allLettersOf(Iterable<String> words)
    {
        Set<Character> letters = new TreeSet<Character>();
        for (String word : words)
        {
            letters.addAll(lettersOf(word));
        }
        return letters;
    }
}







//=============================================================================
// These classes are taken from https://github.com/javagl/Combinatorics


/**
 * A class providing an iterator over all combinations of a certain number
 * of elements of a given set. For a set S with n = |S|, there are are n^k 
 * combinations of k elements of the set. This is the number of possible
 * samples when doing sampling with replacement. Example:<br />
 * <pre>
 * S = { A,B,C }, n = |S| = 3
 * k = 2 
 * m = n^k = 9
 * 
 * Combinations:
 * [A, A]
 * [A, B]
 * [A, C]
 * [B, A]
 * [B, B]
 * [B, C]
 * [C, A]
 * [C, B]
 * [C, C]
 * </pre>
 *  
 * @param <T> The type of the elements
 */
final class CombinationIterable<T> implements Iterable<List<T>>
{
    /**
     * The input elements
     */
    private final List<T> input;

    /**
     * The sample size
     */
    private final int sampleSize;

    /**
     * The total number of elements that the iterator will provide
     */
    private final int numElements;

    /**
     * Creates an iterable over all multisets of 
     * 'sampleSize' elements of the given array.
     *  
     * @param sampleSize The sample size
     * @param input The input elements
     */
    public CombinationIterable(int sampleSize, List<T> input)
    {
        this.sampleSize = sampleSize;
        this.input = input;
        numElements = (int) Math.pow(input.size(), sampleSize);
    }

    @Override
    public Iterator<List<T>> iterator()
    {
        return new Iterator<List<T>>()
        {
            /**
             * The element counter
             */
            private int current = 0;

            /**
             * The indices of the elements that are currently chosen
             */
            private final int chosen[] = new int[sampleSize];

            @Override
            public boolean hasNext()
            {
                return current < numElements;
            }

            @Override
            public List<T> next()
            {
                if (!hasNext())
                {
                    throw new NoSuchElementException("No more elements");
                }

                List<T> result = new ArrayList<T>(sampleSize);
                for (int i = 0; i < sampleSize; i++)
                {
                    result.add(input.get(chosen[i]));
                }
                increase();
                current++;
                return result;
            }

            /**
             * Increases the k-ary representation of the selection of 
             * elements by one.
             */
            private void increase()
            {
                // The array of 'chosen' elements for a set of size n 
                // effectively is a number represented in k-ary form, 
                // and thus, this method does nothing else than count. 
                // For example, when choosing 2 elements of a set with 
                // n=10, the contents of 'chosen' would represent all
                // values 
                // 00, 01, 02,... 09,
                // 10, 11, 12,... 19,
                // ...
                // 90, 91, 92, ...99
                // with each digit indicating the index of the element
                // of the input array that should be placed at the
                // respective position of the output array.
                int index = chosen.length - 1;
                while (index >= 0)
                {
                    if (chosen[index] < input.size() - 1)
                    {
                        chosen[index]++;
                        return;
                    }
                    chosen[index] = 0;
                    index--;
                }
            }

            @Override
            public void remove()
            {
                throw new UnsupportedOperationException(
                    "May not remove elements from a combination");
            }
        };
    }
}

/**
 * Utility methods used in the combinatorics package
 */
class Utils
{
    /**
     * Utility method for computing the factorial n! of a number n.
     * The factorial of a number n is n*(n-1)*(n-2)*...*1, or more
     * formally:<br />
     * 0! = 1 <br />
     * 1! = 1 <br />
     * n! = n*(n-1)!<br />
     *
     * @param n The number of which the factorial should be computed
     * @return The factorial, i.e. n!
     */
    public static BigInteger factorial(int n)
    {
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= n; i++)
        {
            f = f.multiply(BigInteger.valueOf(i));
        }
        return f;
    }    
    /**
     * A magic utility method that happens to return the number of
     * bits that are set to '1' in the given number.
     *  
     * @param n The number whose bits should be counted
     * @return The number of bits that are '1' in n
     */
    public static int countBits(int n)
    {
        int m = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
        return ((m + (m >> 3)) & 030707070707) % 63;
    }

    /**
     * Add all elements from the given iterable into the given collection
     * 
     * @param <T> A type that is related to the elements 
     * @param iterable The iterable
     * @param collection The collection
     */
    public static <T> void addAll(
        Iterable<? extends T> iterable, Collection<? super T> collection)
    {
        for (T t : iterable)
        {
            collection.add(t);
        }
    }

    /**
     * Returns all elements from the given iterable as a list
     * 
     * @param <T> A type that is related to the elements 
     * @param iterable The iterable
     * @return The list
     */
    public static <T> List<T> asList(Iterable<? extends T> iterable)
    {
        List<T> list = new ArrayList<T>();
        addAll(iterable, list);
        return list;
    }

    /**
     * Returns all elements from the given iterable as a set
     * 
     * @param <T> A type that is related to the elements 
     * @param iterable The iterable
     * @return The set
     */
    public static <T> Set<T> asSet(Iterable<? extends T> iterable)
    {
        Set<T> set = new LinkedHashSet<T>();
        addAll(iterable, set);
        return set;
    }

    /**
     * Private constructor to prevent instantiation
     */
    private Utils()
    {

    }
}

(Note that, compared to the initial version, there is not much that has changed in the code - basically, instead of using a ChoiceIterable it now uses a CombinationIterable. But the number of combinations is much larger than the number of choices, so this is only feasible for much smaller inputs than the initial solution).

like image 63
Marco13 Avatar answered Oct 12 '22 03:10

Marco13


Finally had some time to look this up and:

This is a variation of the set cover problem - it is actually the maximum coverage problem. And as I suspected, it is NP-hard.

So, to conclude, @Marco13 gave an answer that is the best (asymptotically) you can do. It might be optimized, and other tricks, but basically, that's as good as it gets.

like image 44
shapiro yaacov Avatar answered Oct 12 '22 02:10

shapiro yaacov