I'm implementing a program that calculates an equation: F(n) = F(n-1) + 'a' + func1(func2(F(n-1))).
func1 takes every 'a' and makes it 'c' and every 'c' becomes 'a'.
func2 reverses the string (e.x. "xyz" becomes "zyx").
I want to calculate the Kth character of F(10**2017). The basic rules are F(0) = "" (empty string), and examples are F(1) = "a", F(2) = "aac", and so on.
How do I do this efficiently?
The basic part of my code is this:
def op1 (str1):
if str1 == 'a':
return 'c'
else:
return 'a'
def op2 (str2):
return str2[::-1]
sinitial = ''
while (counter < 10**2017):
Finitial = Finitial + 'a' + op1(op2(Finitial))
counter += 1
print Finitial
To solve basic exponents, multiply the base number repeatedly for the number of factors represented by the exponent. If you need to add or subtract exponents, the numbers must have the same base and exponent.
Let's start by fixing your original code and defining a function to compute F(n)
for small n
. We'll also print out the first few values of F
. All code below is for Python 3; if you're using Python 2, you'll need to make some minor changes, like replacing str.maketrans
with string.maketrans
and range
with xrange
.
swap_ac = str.maketrans({ord('a'): 'c', ord('c'): 'a'})
def F(n):
s = ''
for _ in range(n):
s = s + 'a' + s[::-1].translate(swap_ac)
return s
for n in range(7):
print("F({}) = {!r}".format(n, F(n)))
This gives the following output:
F(0) = ''
F(1) = 'a'
F(2) = 'aac'
F(3) = 'aacaacc'
F(4) = 'aacaaccaaaccacc'
F(5) = 'aacaaccaaaccaccaaacaacccaaccacc'
F(6) = 'aacaaccaaaccaccaaacaacccaaccaccaaacaaccaaaccacccaacaacccaaccacc'
A couple of observations at this point:
F(n)
is a string of length 2**n-1
. That means that F(n)
grows fast. Computing F(50)
would already require some serious hardware: even if we stored one character per bit, we'd need over 100 terabytes to store the full string. F(200)
has more characters than there are estimated atoms in the solar system. So the idea of computing F(10**2017)
directly is laughable: we need a different approach.
By construction, each F(n)
is a prefix of F(n+1)
. So what we really have is a well-defined infinite string, where each F(n)
merely gives us the first 2**n-1
characters of that infinite string, and we're looking to compute its k
th character. And for any practical purpose, F(10**2017)
might as well be that infinite string: for example, when we do our computation, we don't need to check that k < 2**(10**2017)-1
, since a k
exceeding this can't even be represented in normal binary notation in this universe.
Luckily, the structure of the string is simple enough that computing the k
th character directly is straightforward. The major clue comes when we look at the characters at even and odd positions:
>>> F(6)[::2]
'acacacacacacacacacacacacacacacac'
>>> F(6)[1::2]
'aacaaccaaaccaccaaacaacccaaccacc'
The characters at even positions simply alternate between a
and c
(and it's straightforward to prove that this is true, based on the construction). So if our k
is even, we can simply look at whether k/2
is odd or even to determine whether we'll get an a
or a c
.
What about the odd positions? Well F(6)[1::2]
should look somewhat familiar: it's just F(5)
:
>>> F(6)[1::2] == F(5)
True
Again, it's straightforward to prove (e.g., by induction) that this isn't simply a coincidence, and that F(n+1)[1::2] == F(n)
for all nonnegative n
.
We now have an effective way to compute the k
th character in our infinite string: if k
is even, we just look at the parity of k/2
. If k
is odd, then we know that the character at position k
is equal to that at position (k-1)/2
. So here's a first solution to computing that character:
def char_at_pos(k):
"""
Return the character at position k of the string F(n), for any
n satisfying 2**n-1 > k.
"""
while k % 2 == 1:
k //= 2
return 'ac'[k//2%2]
And a check that this does the right thing:
>>> ''.join(char_at_pos(i) for i in range(2**6-1))
'aacaaccaaaccaccaaacaacccaaccaccaaacaaccaaaccacccaacaacccaaccacc'
>>> ''.join(char_at_pos(i) for i in range(2**6-1)) == F(6)
True
But we can do better. We're effectively staring at the binary representation of k
, removing all trailing '1'
s and the next '0'
, then simply looking at the next bit to determine whether we've got an 'a'
or a 'c'
. Identifying the trailing 1s can be done by bit-operation trickery. This gives us the following semi-obfuscated loop-free solution, which I leave it to you to unwind:
def char_at_pos2(k):
"""
Return the character at position k of the string F(n), for any
n satisfying 2**n-1 > k.
"""
return 'ac'[k//(1+(k+1^k))%2]
Again, let's check:
>>> F(20) == ''.join(char_at_pos2(i) for i in range(2**20-1))
True
Final comments: this is a very well-known and well-studied sequence: it's called the dragon curve sequence, or the regular paper-folding sequence, and is sequence A014577 in the online encyclopaedia of integer sequences. Some Google searches will likely give you many other ways to compute its elements. See also this codegolf question.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With