Pandas offers some summary statistics with the describe() function called on a DataFrame. The output of the function is another DataFrame, so it's easily exported to HTML with a call to to_html().
It also offers information about the DataFrame with the info() function, but that's printed out, returning None. Is there a way to get the same information as a DataFrame, or in any other form that can be exported to HTML?
Here is a sample info() output for reference:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 7 columns):
0 5 non-null float64
1 5 non-null float64
2 5 non-null float64
3 5 non-null float64
4 5 non-null float64
5 5 non-null float64
6 5 non-null float64
dtypes: float64(7)
memory usage: 360.0 bytes
One solution is to save the output of info() to a writable buffer (using the buf argument) and then convert it to HTML.
Below is an example using a txt file as the buffer, but this could easily be done in memory using StringIO.
import html
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(100, 3), columns=['A', 'B', 'C'])

# Save the info() output to a text file; info() itself returns None
with open('test_pandas.txt', 'w') as buf:
    frame.info(buf=buf)

# Example conversion to HTML: escape the text (the first line contains
# "<class ...>") and wrap it in a single <pre> block to keep the alignment
with open('test_pandas.txt', 'r') as contents, open('test_pandas.html', 'w') as e:
    e.write('<pre>' + html.escape(contents.read()) + '</pre>\n')
The variation using StringIO can be found in @jezrael's answer, so there is probably no point in updating this answer.
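For readers who want it in one place, here is a minimal sketch of the in-memory approach. It is not copied from the other answer. The helper names info_to_html and info_as_frame are made up for illustration. The second helper is only an assumption about what "the same information as a DataFrame" might mean: it rebuilds the per-column non-null counts and dtypes, not the full info() text.

```python
import html
from io import StringIO

import numpy as np
import pandas as pd


def info_to_html(df: pd.DataFrame) -> str:
    """Capture df.info() in memory and wrap it in one <pre> element.

    Hypothetical helper, not part of pandas.
    """
    buffer = StringIO()
    df.info(buf=buffer)  # info() writes to buf and still returns None
    # Escape the text, since the first line contains "<class '...'>"
    return "<pre>" + html.escape(buffer.getvalue()) + "</pre>"


def info_as_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Rebuild the per-column part of info() as a DataFrame.

    Hypothetical helper: covers non-null counts and dtypes only.
    """
    return pd.DataFrame({
        "non_null": df.count(),          # non-null values per column
        "dtype": df.dtypes.astype(str),  # dtype names as plain strings
    })


frame = pd.DataFrame(np.random.randn(100, 3), columns=["A", "B", "C"])
info_html = info_to_html(frame)                # preformatted info() text
summary_html = info_as_frame(frame).to_html()  # regular HTML table
```

The first helper keeps the exact info() layout, while the second gives a real table that can be styled like the describe() output.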