# necessary imports
from tabulate import tabulate
import pandas as pd
I have a dataframe:
df = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                   'B': ['B0', 'B1', 'B2', 'B3'],
                   'C': ['C0', 'C1', 'C2', 'C3'],
                   'D': ['D0', 'D1', 'D2', 'D3']},
                  index=[0, 1, 2, 3])
Using this, I pretty print it:
prettyprint=tabulate(df, headers='keys', tablefmt='psql')
print(prettyprint)
Result:
+----+-----+-----+-----+-----+
|    | A   | B   | C   | D   |
|----+-----+-----+-----+-----|
|  0 | A0  | B0  | C0  | D0  |
|  1 | A1  | B1  | C1  | D1  |
|  2 | A2  | B2  | C2  | D2  |
|  3 | A3  | B3  | C3  | D3  |
+----+-----+-----+-----+-----+
Saving it to a text file:
with open("PrettyPrintOutput.txt", "w") as text_file:
    text_file.write(prettyprint)
How can I read PrettyPrintOutput.txt back into a dataframe without doing a lot of text processing manually?
One solution is to pass a few well-chosen keyword arguments to pd.read_csv / pd.read_clipboard:
df = pd.read_csv(r'PrettyPrintOutput.txt', sep='|', comment='+', skiprows=[2], index_col=1)
df = df[[col for col in df.columns if 'Unnamed' not in col]]
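I declare '+' to be the comment character, so the border lines, which begin with '+', don't get imported. This does not take care of the separator row under the header (the third line of the file): it begins with '|', so only the part after its first '+' is dropped, and the row has to be excluded explicitly with skiprows=[2].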
The second line is needed because splitting on '|' leaves you with extra empty columns: every row begins and ends with '|', and pandas labels the resulting empty header fields 'Unnamed: …'. If you know the column names in advance, use the keyword usecols to be explicit.
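To see why the filtering line is needed, here is a minimal sketch that parses a literal copy of the table from an in-memory string instead of the file. The TABLE constant and the names raw and kept are only for illustration. It assumes the file contains exactly the psql-style output shown in the question.

```python
import io

import pandas as pd

# Literal copy of the psql-style table, so the sketch does not need the file.
TABLE = """\
+----+-----+-----+-----+-----+
|    | A   | B   | C   | D   |
|----+-----+-----+-----+-----|
|  0 | A0  | B0  | C0  | D0  |
|  1 | A1  | B1  | C1  | D1  |
|  2 | A2  | B2  | C2  | D2  |
|  3 | A3  | B3  | C3  | D3  |
+----+-----+-----+-----+-----+
"""

# Same keyword arguments as in the answer, applied before any filtering.
raw = pd.read_csv(io.StringIO(TABLE), sep="|", comment="+", skiprows=[2], index_col=1)

# Every row starts and ends with '|', so the first and last fields are empty
# and pandas gives those columns placeholder labels containing "Unnamed".
print(list(raw.columns))

# Keeping only the labels without "Unnamed" leaves the real data columns.
kept = [col for col in raw.columns if "Unnamed" not in col]
print(kept)
```

Printing the labels also shows that they still carry the padding spaces from the table, which matters if you want to pass exact names to usecols.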
Output:
A B C D
0 A0 B0 C0 D0
1 A1 B1 C1 D1
2 A2 B2 C2 D2
3 A3 B3 C3 D3
It also works with pd.read_clipboard, because the functions accept the same keyword arguments.
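One thing to watch for: the column labels and string values come back with the padding spaces that tabulate added around them. Below is a rough sketch of a wrapper that trims them. read_psql_table is a hypothetical helper name, not part of pandas or tabulate, and the temporary file only stands in for PrettyPrintOutput.txt. It assumes no cell contains '+' or '|', since those characters would be treated as a comment start or a separator.

```python
import os
import tempfile

import pandas as pd
from pandas.api.types import is_numeric_dtype


def read_psql_table(path):
    """Hypothetical helper: read a tabulate 'psql' table and trim the padding."""
    df = pd.read_csv(path, sep="|", comment="+", skiprows=[2], index_col=1)
    # Drop the empty columns created by the leading and trailing '|'.
    df = df[[col for col in df.columns if "Unnamed" not in col]]
    # Header cells are padded with spaces, e.g. ' A   ' -> 'A'.
    df.columns = df.columns.str.strip()
    # The index header cell is blank, so clear the whitespace-only index name.
    df.index.name = None
    # Trim padding in text columns; leave numeric columns untouched.
    return df.apply(lambda col: col if is_numeric_dtype(col) else col.str.strip())


# Stand-in for the saved file: the same table written to a temporary directory.
TABLE = "\n".join([
    "+----+-----+-----+-----+-----+",
    "|    | A   | B   | C   | D   |",
    "|----+-----+-----+-----+-----|",
    "|  0 | A0  | B0  | C0  | D0  |",
    "|  1 | A1  | B1  | C1  | D1  |",
    "|  2 | A2  | B2  | C2  | D2  |",
    "|  3 | A3  | B3  | C3  | D3  |",
    "+----+-----+-----+-----+-----+",
])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "PrettyPrintOutput.txt")
    with open(path, "w") as text_file:
        text_file.write(TABLE)
    restored = read_psql_table(path)

print(restored)
```

With the padding removed, restored["A"] and restored.loc[0, "A"] work with the plain names and values from the original dataframe.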