Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm for largest word formed from perodic table elements

Tags:

algorithm

word

I want to write an algorithm for the following problem scenario

Taking periodic table elements' names, find largest word that can be formed? The symbols such as Na , Ne etc should be regarded as single elements.

This was asked in a reputed company's job interview. Can anybody please help me out solve this.

like image 316
Ritesh Avatar asked May 15 '14 09:05

Ritesh


People also ask

What is the longest word you can make out of the periodic table?

Can you name the longest word made using periodic table symbols for the chemical elements? The answer in English is nonrepresentationalisms. This word is 23 letters long.

What sentences can you make with the periodic table?

The periodic table is ordered according to periods and groups. Hydrogen is the first element of the periodic table. Most of the elements of the periodic table are metals. One of the halogens on the periodic table is the element chlorine.


Video Answer


3 Answers

I think a better way is to check every word in a dictionary and see if it can be made from the names of the elements. To check every permutation of the elements would be harder to program and would be less efficient.

While I agree it is easier to produce the combinations, there are just way too many of them, and as you said tend to infinity if you don't give a limit. The production of the word with the symbols would be slightly more difficult and finicky, but I don't think it would be too difficult.

E.g. when you get a word you could search through the elements looking for elements that could make up your word, then using that set of elements try and fill in the letters from start to finish. Obviously this gets harder with elements that aren't 2 letters and words that are odd in length.

You can actually use a sort of graph. Say you have 'silicon'. You can start with the letter 'S' or 'SI' which are both in the table. From there choose 'SI' because it is closer to your solution. If 'SI' does not lead to your solution you will have to come back and see if 'S' would work.

So it works as a kind of depth first search.

like image 181
Makoto Avatar answered Oct 08 '22 04:10

Makoto


How to express a given word as a chemical compound? Here's a dynamic programming solution. The important line is "Let progress[i] be the solution to the subproblem for word[:i]". The rest follows.

elements = "H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs Mt Ds Rg Cn Uut Fl Uup Lv Uus Uuo".split()

def decompose(word):
    """Express given word as chemical compound. If there are multiple solutions, return one of minimal weight."""
    progress = [False for x in range(len(word)+1)] # solution for word[:i]
    progress[0] = []

    for i in range(1, len(word)+1):
        possibles = list()
        for j in range(max(i-3,0), i):
            if progress[j] == False:
                continue
            alchemical = word[j:i].title()
            if alchemical in elements:
                possibles.append(progress[j] + [alchemical])

        if possibles:
            # choose minimal solution
            progress[i] = min(possibles, key=len)

    if progress[-1] == False:
        return False
    return "".join(progress[-1])

assert decompose("sine") == "SiNe" # rather than S-I-Ne
assert decompose("bismuth") == "BiSmUTh"
assert decompose("jam") == False

https://gist.github.com/hickford/750135db633832913703


Ran this on a whole dictionary, the longest compounds were:

  • nonrepresentationalism NoNRePReSeNTaTiONaLiSm
  • thermophosphorescence ThErMoPHOsPHoReSCeNCe
  • bronchoesophagoscopy BrONCHoEsOPHAgOsCoPY
  • hyperphosphorescence HYPErPHOsPHoReSCeNCe
  • hypsibrachycephalism HYPSiBrAcHYCePHAlISm
like image 23
Colonel Panic Avatar answered Oct 08 '22 03:10

Colonel Panic


Generating all words and checking if they exist is impractical. There are 118^L words of length L, a too fast growing function. 1,643,032 three symbols words !

The other way round, as suggested by Makoto, is much more efficient. Try to reconstruct every word from a dictionary. There are on the order of 250,000 English words.

Checking a word is straightforward: see if any element symbol matches the start of the word, and continue with the remaining characters.

You must try all possible matches as a match could hide another. (I mean word ABC could be matched by symbol A and then be stuck because BC is not available, but it could be that AB and C are.) So the matching function will be recursive. I don't expect an exponential explosion in this matching process.

For maximum efficiency, you will store the symbols in a trie structure http://en.wikipedia.org/wiki/Trie.

Final hint: as you are just asked to find the longest match, try the words by decreasing lengths and stop at the first match.

Here is a simple solution in Python, without any optimization. Matching is done right to left, to allow printing the sequence of symbols in post-order:

Words= ['K', 'NaH', 'NaNaNaNa', 'HaKuNa']
Symbols= ['Na','K','H']

def Match(Word):
    # Match the end of the word with every symbol
    for S in Symbols:
        # In case of a partial match, recurse on the prefix
        if S == Word or (S == Word[-len(S):] and Match(Word[:-len(S)])):
            print S,
            return True

    # No match, report failure
    return False

for W in Words:
    if Match(W):
        print


>>> 
K
Na H
Na Na Na Na
like image 1
Yves Daoust Avatar answered Oct 08 '22 04:10

Yves Daoust