Python tabstop-aware len() and padding functions

Question

Python's len() and padding functions like string.ljust() are not tabstop-aware, i.e. they treat '\t' like any other single-width character and don't advance the length to the next multiple of the tab stop. Example:

len('Bear\tnecessities\t')

is 17 instead of 24 (i.e. 4+(8-4)+11+(8-3))
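To make the arithmetic in that example concrete, here is a minimal sketch that walks the string one character at a time and jumps to the next tab stop at each tab. The name tab_aware_len and the tabwidth parameter are illustrative choices, not anything from the standard library.

```python
TABWIDTH = 8  # tab stop spacing assumed throughout the question

def tab_aware_len(s, tabwidth=TABWIDTH):
    """Return the display width of s, treating each tab as a jump to the next tab stop."""
    col = 0
    for ch in s:
        if ch == '\t':
            # Advance to the next multiple of tabwidth (always at least one column).
            col += tabwidth - (col % tabwidth)
        else:
            col += 1
    return col

print(len('Bear\tnecessities\t'))            # 17
print(tab_aware_len('Bear\tnecessities\t'))  # 24
```

The loop reproduces the sum 4+(8-4)+11+(8-3): four columns for 'Bear', four more to reach column 8, eleven for 'necessities', and five more to reach column 24.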

And say I also want a function pad_with_tabs(s, maxlen) such that

pad_with_tabs('Bear', 15) = 'Bear\t\t'

I'm looking for simple implementations of these: compactness and readability first, efficiency second. This is a basic but irritating question. @gnibbler - can you show a purely Pythonic solution, even if it's say 20x less efficient?

Sure, you could convert back and forth using str.expandtabs(TABWIDTH), but that's clunky. Importing math to get TABWIDTH * int( math.ceil(len(s)*1.0/TABWIDTH) ) also seems like massive overkill.
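For what it's worth, the rounding in that math.ceil expression can be done with integer arithmetic alone, since floor division of a negated number rounds toward negative infinity. A small sketch, where next_tabstop is a made-up helper name:

```python
TABWIDTH = 8

def next_tabstop(n, tabwidth=TABWIDTH):
    """Round n up to the nearest multiple of tabwidth without importing math."""
    # -(-n // tabwidth) is the integer ceiling of n / tabwidth.
    return -(-n // tabwidth) * tabwidth

print(next_tabstop(4))   # 8
print(next_tabstop(8))   # 8
print(next_tabstop(19))  # 24
```

Note that this rounds, so a value already on a tab stop stays where it is, whereas an actual tab typed at column 8 would move on to column 16.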

I couldn't manage anything more elegant than the following:

TABWIDTH = 8

def pad_with_tabs(s, maxlen):
    s_len = len(s)  # assumes s itself contains no tabs
    while s_len < maxlen:
        s += '\t'
        # each tab advances to the next multiple of TABWIDTH
        s_len += TABWIDTH - (s_len % TABWIDTH)
    return s

Since Python strings are immutable, and unless we want to monkey-patch our function into the string module to add it as a method, we must also assign the result of the function back to the variable:

s = pad_with_tabs(s, ...)

In particular, I couldn't get a clean approach using a list comprehension or string.join(...):

''.join([s, '\t' * ntabs])

without special-casing the cases where len(s) is less than an integer multiple of TABWIDTH, or where len(s) >= maxlen already.

Can anyone show better len() and pad_with_tabs() functions?

John La Rooy · Accepted Answer

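TABWIDTH = 8

def my_len(s):
    # expandtabs() replaces each tab with spaces up to the next tab stop
    return len(s.expandtabs(TABWIDTH))

def pad_with_tabs(s, maxlen):
    # // keeps this an integer division on Python 3 as well as Python 2
    return s + "\t" * ((maxlen - len(s) - 1) // TABWIDTH + 1)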
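One thing worth checking: the one-liner counts tabs from the gap maxlen - len(s), which agrees with the question's loop when len(s) is a multiple of TABWIDTH but can differ otherwise. With ('Bear', 9), for instance, it yields one tab where the loop yields two. The sketch below is an alternative that counts tab stops instead. It is not the answer's method, the function names are made up for illustration, and like the code above it assumes s contains no tabs of its own.

```python
TABWIDTH = 8

def pad_with_tabs_loop(s, maxlen, tabwidth=TABWIDTH):
    """The loop from the question, kept here as the reference behaviour."""
    width = len(s)
    while width < maxlen:
        s += '\t'
        width += tabwidth - (width % tabwidth)
    return s

def pad_with_tabs_stops(s, maxlen, tabwidth=TABWIDTH):
    """Loop-free variant: count the tab stops between len(s) and maxlen."""
    if len(s) >= maxlen:
        return s
    target_stops = -(-maxlen // tabwidth)  # integer ceiling of maxlen / tabwidth
    passed_stops = len(s) // tabwidth      # tab stops already behind the text
    return s + '\t' * (target_stops - passed_stops)

print(repr(pad_with_tabs_stops('Bear', 15)))  # 'Bear\t\t'
print(repr(pad_with_tabs_stops('Bear', 9)))   # 'Bear\t\t'
```

The early return handles the len(s) >= maxlen case that the question wanted to avoid special-casing, so this trades one guard for the loop.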

Why did I use expandtabs()?
Well, it's fast:

$ python -m timeit '"Bear\tnecessities\t".expandtabs()'
1000000 loops, best of 3: 0.602 usec per loop
$ python -m timeit 'for c in "Bear\tnecessities\t":pass'
100000 loops, best of 3: 2.32 usec per loop
$ python -m timeit '[c for c in "Bear\tnecessities\t"]'
100000 loops, best of 3: 4.17 usec per loop
$ python -m timeit 'map(None,"Bear\tnecessities\t")'
100000 loops, best of 3: 2.25 usec per loop

Anything that iterates over your string is going to be slower, because just the iteration is ~4 times slower than expandtabs even when you do nothing in the loop.
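If you want to repeat the comparison from inside Python rather than the shell, something along these lines should work. It is only a sketch: the labels and the run_benchmarks helper are made up here, the absolute numbers will differ by machine and Python version, and the map(None, ...) case is left out because it relies on Python 2 behaviour.

```python
import timeit

# The test string is created once in the setup so only the operation is timed.
SETUP = "s = 'Bear\\tnecessities\\t'"

STATEMENTS = {
    'expandtabs': "s.expandtabs()",
    'for loop': "for c in s: pass",
    'list comprehension': "[c for c in s]",
}

def run_benchmarks(number=100000):
    """Return a dict mapping each label to the total seconds for `number` runs."""
    return {
        label: timeit.timeit(stmt, setup=SETUP, number=number)
        for label, stmt in STATEMENTS.items()
    }

if __name__ == '__main__':
    runs = 100000
    for label, seconds in run_benchmarks(runs).items():
        print('%-20s %.3f usec per loop' % (label, seconds / runs * 1e6))
```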

$ python -m timeit '"Bear\tnecessities\t".split("\t")'
1000000 loops, best of 3: 0.868 usec per loop

Even just splitting on tabs takes longer. You'd still need to iterate over the split and pad each item to the tabstop.
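For comparison, the split-and-pad approach mentioned above might look roughly like this. It is a sketch of the slower alternative, not a recommendation. The name expand_by_split is made up, and it ignores newlines, which str.expandtabs() treats as resetting the column.

```python
TABWIDTH = 8

def expand_by_split(s, tabwidth=TABWIDTH):
    """Expand tabs by padding every tab-terminated field to the next tab stop."""
    fields = s.split('\t')
    out = []
    for field in fields[:-1]:
        # This field was followed by a tab: pad it to the next tab stop,
        # which always adds at least one space.
        width = (len(field) // tabwidth + 1) * tabwidth
        out.append(field.ljust(width))
    out.append(fields[-1])  # the last field has no trailing tab
    return ''.join(out)

print(len(expand_by_split('Bear\tnecessities\t')))  # 24
```

Because every padded field ends exactly on a tab stop, each field can be measured on its own without tracking the running column.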

Tags: python, function, padding, string-length, tabstop

smci
