This has been stumping me for a while, and I feel like there has to be a solution, since printing a DataFrame always aligns the column headers with their respective values.
Example:
import pandas as pd

df = pd.DataFrame({'First column name': [1234, 2345, 3456], 'Second column name': [5432, 4321, 6543], 'Third column name': [1236, 3457, 3568]})
df_string = df.to_string(justify='left', col_space=30)
Now, when you print df_string, you get the desired formatting:
But when I take the string and view it elsewhere (in this case, I'm passing the string to a PyQt widget that displays text), this is the output:
(this is how the string appears on my console):
Any help is greatly appreciated.
This lines up column headers nicely:
print(df.to_string())
But this prints the indices too. If you don't want to print the indices, you can use:
print(df.to_string(index=False))
Problem is, the column headers no longer line up correctly.
So I wrote this hack:
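import re

# Matches the header line and captures its first word (group 1).
blanks = r'^ *([a-zA-Z_0-9-]*) .*$'
blanks_comp = re.compile(blanks)

def find_index_in_line(line):
    # Return the offset of the first non-space character that follows
    # the first run of spaces, i.e. where the first data column starts.
    index = 0
    spaces = False
    for ch in line:
        if ch == ' ':
            spaces = True
        elif spaces:
            break
        index += 1
    return index

def pretty_to_string(df):
    lines = df.to_string().split('\n')
    header = lines[0]
    m = blanks_comp.match(header)
    indices = []
    if m:
        # Offset at which the first column header starts.
        st_index = m.start(1)
        indices.append(st_index)
    non_header_lines = lines[1:]
    for line in non_header_lines:
        index = find_index_in_line(line)
        indices.append(index)
    # Smallest offset across all lines; everything before it is the
    # index column and its padding.
    mn = min(indices)
    newlines = []
    for l in lines:
        newlines.append(l[mn:])
    return '\n'.join(newlines)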
Which you invoke like this:
print(pretty_to_string(df))
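The code works by calling df.to_string() (where the columns are lined up nicely) and working out how many leading characters are taken up by the index column: the smallest offset at which the first column starts on any line.
It then strips that many characters from the start of each line, which removes the indices.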
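The same idea can be sketched more compactly. This is not the code above, just a minimal variant under a stated assumption: the DataFrame has a plain, unnamed, single-level index whose labels render as str(label). The helper name to_string_without_index is hypothetical. It cuts the index labels off each line and then lets textwrap.dedent remove the leading padding shared by all lines.

```python
import textwrap

import pandas as pd


def to_string_without_index(df):
    """Render df with aligned columns but without the index column.

    Hypothetical helper; assumes a plain, unnamed, single-level index
    whose labels are rendered by pandas the same way as str(label).
    """
    lines = df.to_string().split("\n")
    # Width of the index column: the longest index label.
    index_width = max(len(str(label)) for label in df.index)
    # Cut the index labels (or the blank header cell) off every line.
    stripped = [line[index_width:] for line in lines]
    # Remove the leading whitespace that all lines still have in common.
    return textwrap.dedent("\n".join(stripped))


df = pd.DataFrame({
    "First column name": [1234, 2345, 3456],
    "Second column name": [5432, 4321, 6543],
    "Third column name": [1236, 3457, 3568],
})
print(to_string_without_index(df))
```

Because the lines are only sliced and dedented, the spacing between columns is whatever df.to_string() produced. A named index or a MultiIndex adds extra header content, so this sketch would need adjusting for those cases.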