Find the lowercase (un-shifted) form of symbols

Tags: python, string

x = "Foo 890 bar *()"

How can I lowercase the string and also 'unshift' the symbols, so that "*()" becomes "890"? Desired result:

foo 890 bar 890

Unwanted:

x.lower() => "foo 890 bar *()"
DCA- asked May 11 '18 10:05


2 Answers

The unshifting depends on the keyboard layout; it's not a universal mapping. You could hardcode one, for example for a US layout:

unshift = {
    '!': '1', '@': '2', '#': '3', '$': '4', '%': '5',
    '^': '6', '&': '7', '*': '8', '(': '9', ')': '0',
}

x = ''.join(unshift.get(c, c.lower()) for c in x)

An alternative, more compact way to write that map would be:

unshift = dict(zip('!@#$%^&*()', '1234567890'))
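Extending the same idea to the rest of the punctuation keys, here is a minimal sketch. It assumes a US QWERTY layout; the names SHIFTED, UNSHIFTED, and unshift_text are made up for this illustration.

```python
# Assumption: US QWERTY layout. Each character in SHIFTED is typed with Shift
# held on the key that produces the character at the same index in UNSHIFTED.
SHIFTED = '~!@#$%^&*()_+{}|:"<>?'
UNSHIFTED = "`1234567890-=[]\\;',./"

unshift = dict(zip(SHIFTED, UNSHIFTED))


def unshift_text(text):
    """Map shifted symbols to their unshifted keys and lowercase the rest."""
    return ''.join(unshift.get(c, c.lower()) for c in text)


print(unshift_text("Foo 890 bar *()"))  # foo 890 bar 890
```

For any other layout, only the two strings would need to change.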
John Kugelman answered Nov 10 '22 06:11


Python3

You could use str.translate if you want to avoid explicitly disassembling and assembling strings in your code.

(Assuming the same keyboard layout as in John Kugelman's answer).

You could subclass dict to handle the common case automatically, where a key's shifted and unshifted values are the upper- and lowercase versions of the same character, and use this subclass as the translation table.

class Unshift(dict):

    def __missing__(self, key):
        """Given an ordinal, return corresponding lower case character"""
        return chr(key).lower()

specials = {
    '!': '1', '@': '2', '#': '3', '$': '4', '%': '5',
    '^': '6', '&': '7', '*': '8', '(': '9', ')': '0',
}

unshift = Unshift((ord(k), v) for k, v in specials.items())

>>> x.translate(unshift)
'foo 890 bar 890'

However, this performs slightly more slowly than John's approach, I expect because of the cost of the lookup misses and the calls to chr. (In the timings below, d is the plain dict from John's answer.)

>>> import timeit
>>> timeit.timeit(setup='from __main__ import unshift, x', stmt='x.translate(unshift)')
7.9996025009895675

>>> timeit.timeit(setup='from __main__ import d, x', stmt='"".join(d.get(c, c.lower()) for c in x)')
7.469654283020645
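If the lookup misses are indeed the main cost, one possible variation is to have __missing__ store each value it computes, so a given character only misses once. This is a sketch of that idea rather than a measured improvement; CachingUnshift is a hypothetical name, and the mapping assumes the same US layout as above.

```python
class CachingUnshift(dict):
    """Translation table that remembers the fallback value for each ordinal."""

    def __missing__(self, key):
        # str.translate looks up ordinals (ints), so convert back to a character.
        value = chr(key).lower()
        self[key] = value  # cache it so the next lookup is a plain dict hit
        return value


specials = dict(zip('!@#$%^&*()', '1234567890'))
table = CachingUnshift((ord(k), v) for k, v in specials.items())

print("Foo 890 bar *()".translate(table))  # foo 890 bar 890
print(ord('F') in table)  # True: the miss for 'F' was cached
```

The table grows as new characters are seen, which is the trade-off for avoiding repeated misses.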

Performance greatly improves if you can create a mapping with all combinations, avoiding the cost of failed lookups.

>>> import string
>>> # Example dict 
>>> d2 = {ord(c): c.lower() for c in (string.ascii_letters + string.digits)}
>>> d2.update((ord(k), v) for k, v in specials.items())

>>> timeit.timeit(setup='from __main__ import d2, x', stmt='x.translate(d2)')
0.8882806290057488
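A table of this kind can also be built with the built-in str.maketrans, which accepts two equal-length strings and returns a table usable with str.translate. The sketch below assumes the same US digit row and only covers ASCII letters; characters that are not in the table are left unchanged by str.translate.

```python
import string

# Assumption: US layout for the digit row; uppercase ASCII letters map to lowercase.
table = str.maketrans(
    '!@#$%^&*()' + string.ascii_uppercase,
    '1234567890' + string.ascii_lowercase,
)

x = "Foo 890 bar *()"
print(x.translate(table))  # foo 890 bar 890
```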

Python2

In Python 2, str.translate takes a 256-character translation table as its argument; the table is normally constructed using string.maketrans.

>>> import string
>>> d2 = {c: c.lower() for c in (string.ascii_letters + string.digits)}
>>> d2.update(specials)
>>> items = sorted(d2.items())
>>> src, dest = ''.join(k for k, _ in items), ''.join(v for _, v in items)
>>> tt = string.maketrans(src, dest)

>>> x.translate(tt)
'foo 890 bar 890'

str.translate in Python2 is faster than in Python3:

>>> timeit.timeit(setup='from __main__ import tt, x', stmt='x.translate(tt)')
0.2270500659942627

However, the string.maketrans and str.translate combination in Python2 doesn't seem to handle unicode very well, so it may not be suitable if you are dealing with international keyboards.
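
In Python 3, by contrast, str.translate works on Unicode code points, so a table can include non-ASCII symbols. As an illustration only, the sketch below assumes the digit row of a German QWERTZ layout, where Shift+3 produces '§'; the name unshift_de is hypothetical.

```python
# Assumption: digit row of a German QWERTZ keyboard.
shifted = '!"§$%&/()='
unshifted = '1234567890'

table = str.maketrans(shifted, unshifted)


def unshift_de(text):
    """Lowercase the text, then map shifted digit-row symbols to digits."""
    return text.lower().translate(table)


print(unshift_de('Foo §$% bar'))  # foo 345 bar
```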

snakecharmerb answered Nov 10 '22 08:11