
translate with empty string

Tags: python, syntax

This is an example from another answer that extracts only the lowercase letters. (Python 3)

import string
delete_table = string.maketrans(
    string.ascii_lowercase, ' ' * len(string.ascii_lowercase)
)
table = string.maketrans('', '')

"Agh#$%#%2341- -!zdrkfd".translate(table, delete_table)

In this case, ' ' * len(string.ascii_lowercase) maps each lowercase letter to a blank space. So my expectation is that all the lowercase letters will be replaced with ' ', a blank space, but this is the output:

'ghzdrkfd'

So here are my questions:

  1. Why is the output different from my expectation?
  2. When I look at the documentation, translate only takes in one argument. Why is it passed two arguments?
Forethinker asked Nov 02 '22 18:11


1 Answer

You have linked to the Python 3.x documentation, but translate() is called here with two arguments, so this code is almost certainly from Python 2.x, where that is valid. See the Python 2.x documentation for str.translate().

As you can see there, the second argument is optional and it specifies characters that should be deleted from the input string (on Python 3.x you would do this by mapping the characters to None).
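As a rough Python 3 sketch of the "map to None" approach mentioned above (the variable names text, to_delete and table are illustrative, not from the question): the three-argument form of str.maketrans() builds a table in which every character of the third argument maps to None, so translate() drops it.

```python
import string

text = "Agh#$%#%2341- -!zdrkfd"

# Characters to delete: everything in the input that is not a lowercase letter.
# (Illustrative way to build the set; any string of unwanted characters works.)
to_delete = "".join(set(text) - set(string.ascii_lowercase))

# Three-argument str.maketrans: each character in the third argument maps to None.
table = str.maketrans("", "", to_delete)

result = text.translate(table)
print(result)  # ghzdrkfd

# The same table can be written by hand as a dict of code points to None.
manual_table = {ord(c): None for c in to_delete}
print(text.translate(manual_table))  # ghzdrkfd
```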

So for "Agh#$%#%2341- -!zdrkfd".translate(table, delete_table), first all characters present in delete_table are removed, and then a translation is performed using table.

Since delete_table is built by string.maketrans() with every lowercase letter mapped to a space, it is a 256-character string containing every byte value except the lowercase letters (their positions hold spaces instead):

>>> delete_table = string.maketrans(string.ascii_lowercase, ' '*len(string.ascii_lowercase))
>>> delete_table
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`                          {|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>> ''.join(c for c in map(chr, range(0, 256)) if c not in delete_table)
'abcdefghijklmnopqrstuvwxyz'

So every character other than a lowercase letter is removed from the string, and the subsequent translation with table changes nothing, because string.maketrans('', '') produces the identity table.
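The two-argument behaviour can still be reproduced on Python 3 with bytes, whose translate() accepts a second "delete" argument. The following is only a sketch under that assumption; identity, kept and replaced are names chosen here for illustration. It also shows what the question expected, which is what happens when the lowercase-to-space table is passed as the translation table rather than as the set of characters to delete.

```python
import string

lowercase = string.ascii_lowercase.encode("ascii")

# Same construction as in the question, using bytes.maketrans on Python 3.
delete_table = bytes.maketrans(lowercase, b" " * len(lowercase))
identity = bytes.maketrans(b"", b"")  # plays the role of string.maketrans('', '')

data = b"Agh#$%#%2341- -!zdrkfd"

# Step 1: every byte that occurs anywhere in delete_table is removed.
# Step 2: the surviving bytes go through the identity table, unchanged.
kept = data.translate(identity, delete_table)
print(kept)  # b'ghzdrkfd'

# What the question expected: use delete_table as the translation table,
# so each lowercase letter becomes a space and nothing is deleted.
replaced = data.translate(delete_table)
print(repr(replaced))
```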

Andrew Clark answered Nov 15 '22 05:11