I have a list of lists that represents a grid of data (think rows in a spreadsheet). Each row can have an arbitrary number of columns, and the data in each cell is a string of arbitrary length.
I want to normalize this so that every row has the same number of columns and every cell in a given column has the same width, padding with spaces as necessary. For example, given the following input:
(
("row a", "a1","a2","a3"),
("another row", "b1"),
("c", "x", "y", "a long string")
)
I want the data to look like this:
(
("row a      ", "a1", "a2", "a3           "),
("another row", "b1", "  ", "             "),
("c          ", "x ", "y ", "a long string")
)
What's the Pythonic solution for Python 2.6 or greater? Just to be clear: I'm not looking to pretty-print the list per se; I'm looking for a solution that returns a new list of lists (or tuple of tuples) with the values padded out.
Starting with your input data:
>>> d = (
...     ("row a", "a1", "a2", "a3"),
...     ("another row", "b1"),
...     ("c", "x", "y", "a long string")
... )
Make one pass to determine the maximum size of each column:
>>> col_size = {}
>>> for row in d:
...     for i, col in enumerate(row):
...         col_size[i] = max(col_size.get(i, 0), len(col))
...
>>> ncols = len(col_size)
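As a quick sanity check on this first pass, the same loop can be run as a standalone script to see which widths it records for the sample data. This is only an illustrative sketch, written for Python 3 (the loop itself is unchanged); the variable names follow the session above.

```python
# Sample data from the question.
d = (
    ("row a", "a1", "a2", "a3"),
    ("another row", "b1"),
    ("c", "x", "y", "a long string"),
)

# First pass: record the widest cell seen at each column index.
col_size = {}
for row in d:
    for i, col in enumerate(row):
        col_size[i] = max(col_size.get(i, 0), len(col))

# The number of distinct column indexes is the width of the widest row.
ncols = len(col_size)

print(col_size)  # {0: 11, 1: 2, 2: 2, 3: 13}
print(ncols)     # 4
```

Column 0 is as wide as "another row" (11 characters) and column 3 as wide as "a long string" (13 characters), which is what the second pass pads to.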
Then make a second pass to pad each column to the required width:
>>> result = []
>>> for row in d:
...     row = list(row) + [''] * (ncols - len(row))
...     for i, col in enumerate(row):
...         row[i] = col.ljust(col_size[i])
...     result.append(row)
...
That gives the desired result:
>>> from pprint import pprint
>>> pprint(result)
[['row a      ', 'a1', 'a2', 'a3           '],
 ['another row', 'b1', '  ', '             '],
 ['c          ', 'x ', 'y ', 'a long string']]
For convenience, the steps can be combined into a single function:
def align(array):
    # First pass: find the widest cell in each column.
    col_size = {}
    for row in array:
        for i, col in enumerate(row):
            col_size[i] = max(col_size.get(i, 0), len(col))
    ncols = len(col_size)

    # Second pass: extend short rows, then pad every cell to its column width.
    result = []
    for row in array:
        row = list(row) + [''] * (ncols - len(row))
        for i, col in enumerate(row):
            row[i] = col.ljust(col_size[i])
        result.append(row)
    return result
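Since the question also allows a tuple of tuples, here is a compact variant of the same two-pass idea. Treat it as a sketch rather than a drop-in replacement: the name `align_tuples` and the `fill` parameter are made up for illustration and are not part of the answer above, and it assumes Python 3.4 or later (for `default=` on `max`).

```python
def align_tuples(rows, fill=" "):
    """Return a tuple of tuples with every cell padded to its column width.

    Hypothetical helper: the same two-pass approach as align(), written
    with comprehensions. `fill` must be a single character.
    """
    rows = [tuple(row) for row in rows]
    ncols = max((len(row) for row in rows), default=0)

    # Pass 1: widest cell in each column (short rows contribute nothing).
    widths = [
        max((len(row[i]) for row in rows if i < len(row)), default=0)
        for i in range(ncols)
    ]

    # Pass 2: extend short rows with empty strings, then left-justify.
    return tuple(
        tuple(
            cell.ljust(width, fill)
            for cell, width in zip(row + ("",) * (ncols - len(row)), widths)
        )
        for row in rows
    )


data = (
    ("row a", "a1", "a2", "a3"),
    ("another row", "b1"),
    ("c", "x", "y", "a long string"),
)
aligned = align_tuples(data)
print(aligned[1])
```

Using a list of widths instead of a dict works here because column indexes are contiguous from zero.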
Here's what I came up with:
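import itertools

def pad_rows(strs):
    # Transpose rows into columns, filling short rows with "" (Python 2 only).
    for col in itertools.izip_longest(*strs, fillvalue=""):
        longest = max(map(len, col))
        yield map(lambda x: x.ljust(longest), col)

def pad_strings(strs):
    # Transpose the padded columns back into rows.
    return itertools.izip(*pad_rows(strs))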
And calling it like this, where x holds the input data from the question:
print tuple(pad_strings(x))
yields this result:
(('row a      ', 'a1', 'a2', 'a3           '),
 ('another row', 'b1', '  ', '             '),
 ('c          ', 'x ', 'y ', 'a long string'))
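The code above is Python 2 only (`izip_longest`, `izip`, and the `print` statement). For anyone on Python 3, a rough equivalent is sketched below. It assumes `itertools.zip_longest` and the built-in `zip` as the replacements, keeps the function names from the answer, and uses `data` as an example variable name.

```python
from itertools import zip_longest


def pad_rows(rows):
    """Yield each column as a tuple of cells padded to that column's width."""
    # zip_longest transposes rows into columns, filling short rows with "".
    for col in zip_longest(*rows, fillvalue=""):
        longest = max(map(len, col))
        yield tuple(cell.ljust(longest) for cell in col)


def pad_strings(rows):
    """Transpose the padded columns back into rows (a tuple of tuples)."""
    return tuple(zip(*pad_rows(rows)))


data = (
    ("row a", "a1", "a2", "a3"),
    ("another row", "b1"),
    ("c", "x", "y", "a long string"),
)
padded = pad_strings(data)
for row in padded:
    print(row)
```

Returning a tuple from `pad_strings` instead of a lazy iterator is a deliberate change in this sketch, so the result can be indexed and printed more than once.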