Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

given a word, print its index, words can be increased accordingly

Tags:

algorithm

Imagine an alphabet of words.

Example:

 a ==> 1 
 b ==> 2 
 c ==> 3 

 z ==> 26 
 ab ==> 27 
 ac ==> 28 

 az ==> 51 
 bc ==> 52 
 and so on. 

Such that the sequence of characters need to be in ascending order only (ab is valid but ba is not). Given any word print its index if valid and 0 if not.

 Input  Output 

 ab      27 

 ba       0 

 aez     441 

In this question, I can do math easily but I am not getting any algorithm.

What i did in math is

one letter 26

two letter 325

.

.

.

so on

like image 855
user2416871 Avatar asked Jul 05 '13 18:07

user2416871


2 Answers

  1. First make all the letters numbers:
    • 'aez' would become 1,5,26
  2. Make these numbers variables called ...X3,X2,X1
    • 26 would be X1, 5 would be X2, 1 would be X3 (note, right to left)

Now for the magic formula:

enter image description here

Coded with examples and demonstration of speed even in worse case scenario:

def comb(n,k): #returns combinations
    p = 1 #product
    for i in range(k):
        p *= (n-i)/(i+1)
    return p

def solve(string):
    x = []
    for letter in string:
        x.append(ord(letter)-96)  #convert string to list of integers
    x = list(reversed(x))  #reverse the order of string
    #Next, the magic formula
    return x[0]+sum(comb(26,i)-comb(26-x[i-1]+1,i)*(1-i/(26-x[i-1]+1)) for i in range(2,len(x)+1))

solve('bhp')
764.0
>>> solve('afkp')
3996.0
>>> solve('abcdefghijklmnopqrstuvwxyz')
67108863.0
>>> solve('hpz')
2090.0
>>> solve('aez')
441.0
>>> if 1:
        s = ''
        for a in range(97,97+26):
                s += chr(a)
        t = time.time()
        for v in range(1000):
                temp = solve(s)
        print (time.time()-t)


0.1650087833404541

In order to understand my explanation to this formula, I need to go over a mathematical occurrence in pascal's triangle and the binomial theorem:

Here is pascal's triangle:

enter image description here

Going from top right to bottom left, first there is a sequence of 1s. Then a sequence of the counting numbers. The next sequence is the sum of the counting numbers. These are known as the triangular numbers. The next sequence is the sum of the triangular numbers, known as the tetrahedral numbers and this pattern goes on and on.

Now for the binomial theorem:

enter image description here

By combining the binomial theorem and pascals triangle, it can be seen that the nth triangular number is:

enter image description here

and nth tetrahedral number is:

enter image description here

the sum of the first n tetrahedral numbers is:

enter image description here

and on ...

Now for the explanation. For this explanation, I will only use 6 letters, a-f, and will replace these with the numbers 1-6. The procedure is the same with more letters

If the length is 1, then the possible sequences are:

1
2
3
4
5
6

In this the answer is simply the value

Now for a length of 2:

12  13  14  15  16
23  24  25  26
34  35  36
45  46
56

To solve this we split it into 3 parts:

  1. Find the total number of elements in the rows above
    • In this case, there are 5 elements in the first row, 4 in the second, 3 in the 3rd and so forth. What we have to do is find a way to sum the first n elements of the sequence (5,4,3,2,1). In order to do this, we subtract triangular numbers. (1+2+3+4+5)-(1+2+3) = (4+5). Similarly (1+2+3+4+5)-(1+2) = 3+4+5. Therefore this value is equal to:
      • enter image description here
  2. Now, we have accounted for the values above our target and are only concerned with the column it is in. To find this, we add x1-x2
  3. Lastly, we need to add the amount of length 1 sequences there are. This is equal to 6. Therefore, our formula is:
    • enter image description here

Next we will repeat for sequences of length 3:

123  124  125  126
134  135  136
145  146
156

234  235  236
245  246
256

345  346
356

456

Once again we split this problem into steps:

  1. Find how many elements are above each array. The arrays values are the backwards triangular numbers (10, 6, 3, 1). This time, instead of subtracting triangular numbers we subtract tetrahedral numbers:
    • enter image description here
  2. Notice how each individual array has the shape of a length 2 sequence. By subtracting x3 from x1, and x2, we reduce the sequence to degree 2. For example, we will subtract 2 from the second array

This

234  235  236
245  246
256

becomes

12  13  14
23  24
34  
  1. We can now use the length 2 formula, with 6-x3 instead of 6, because our sequences now have a different maximum value
    • enter image description here
  2. Lastly, we add the total number of length 1 and length 2 sequences. It turns out there is a pattern for how many sequences of a particular length there are. The answer is combinations. There are enter image description here sequences of length 1, enter image description here of length 2, and so on.

Combining these our total formula for length 3 becomes:

enter image description here

We can follow this pattern of reduction for higher length sequences

Now we will right out our formulas to look for patterns:

Length 1: y1

Length 2:

Length 3:

enter image description here

Note: I also used length 4 to make sure the patterns held

With a bit of math, grouping of terms, and the change from 6 to 26 our formula becomes:

In order to simplify this further, more math must be done.
This identity holds true for all a and b. For a quick fun exercise, prove it (not really difficult):

enter image description here

This identity allows as to further group and negate terms to reach our much oversimplified formula:

like image 168
enderx1x Avatar answered Oct 23 '22 07:10

enderx1x


This is a combination of two problems: parsing a number in a base that isn't 10 and determining if input is sorted.

Note that, since this is probably homework, you probably can't just use existing methods to do the hard work.

like image 38
Craig Gidney Avatar answered Oct 23 '22 07:10

Craig Gidney