
How to format IPython html display of Pandas dataframe?

How can I format IPython html display of pandas dataframes so that

  1. numbers are right justified
  2. numbers have commas as thousands separator
  3. large floats have no decimal places

I understand that numpy has the facility of set_printoptions where I can do:

    int_frmt = lambda x: '{:,}'.format(x)
    np.set_printoptions(formatter={'int_kind': int_frmt})

and similarly for other data types.
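For reference, here is a minimal sketch of how that NumPy formatter behaves on a plain array. The sample values are made up, and `np.printoptions` (NumPy 1.15 or later) is used only to keep the setting local instead of changing it globally:

```python
import numpy as np

# Format integers with a comma as the thousands separator.
int_frmt = lambda x: '{:,}'.format(x)

arr = np.array([1000, 2000000])  # made-up sample values

# np.printoptions is the context-manager form of np.set_printoptions,
# so the formatter only applies inside the `with` block.
with np.printoptions(formatter={'int_kind': int_frmt}):
    formatted = str(arr)

print(formatted)
```

This only affects how NumPy prints arrays; it has no effect on how pandas renders a dataframe.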

But IPython does not pick up these formatting options when displaying dataframes in html. I still need to have

pd.set_option('display.notebook_repr_html', True) 

but with points 1, 2 and 3 above applied.

Edit: Below is my solution for 2 and 3 (not sure this is the best way), but I still need to figure out how to right-justify the numeric columns.

    import numpy as np
    from IPython.display import HTML

    # Thousands separators for ints; floats above 1,000 drop their decimals
    int_frmt = lambda x: '{:,}'.format(x)
    float_frmt = lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x)

    # Map each dtype to its formatter, then build a per-column formatter dict
    frmt_map = {np.dtype('int64'): int_frmt, np.dtype('float64'): float_frmt}
    frmt = {col: frmt_map[df.dtypes[col]] for col in df.columns
            if df.dtypes[col] in frmt_map}

    HTML(df.to_html(formatters=frmt))
behzad.nouri asked Sep 18 '13 15:09




2 Answers

HTML accepts an arbitrary string of HTML, so nothing stops you from passing in a style tag with custom CSS for the .dataframe class (which the to_html method adds to the table).

So the simplest solution would be to just add a style and concatenate it with the output of the df.to_html:

    style = '<style>.dataframe td { text-align: right; }</style>'
    HTML(style + df.to_html(formatters=frmt))

However, I would suggest defining a custom class for the DataFrame, since styling .dataframe directly changes every table in your notebook (the style is "global").

    style = '<style>.right_aligned_df td { text-align: right; }</style>'
    HTML(style + df.to_html(formatters=frmt, classes='right_aligned_df'))
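To make the custom-class approach easy to try outside the original notebook, here is a self-contained sketch. The sample dataframe and its column names (`count`, `price`) are made up for illustration; the formatters are the ones from the question. It builds the HTML string only, so it runs without IPython:

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame({'count': [1234567, 42], 'price': [1234.5678, 12.3456]})

# Formatters from the question: commas for ints, no decimals for large floats.
int_frmt = lambda x: '{:,}'.format(x)
float_frmt = lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x)
frmt = {'count': int_frmt, 'price': float_frmt}

# CSS scoped to the custom class, so other tables are left alone.
style = '<style>.right_aligned_df td { text-align: right; }</style>'

# `classes` adds the custom class next to the default "dataframe" class.
table = df.to_html(formatters=frmt, classes='right_aligned_df')
html = style + table

# In a notebook, display it with: IPython.display.HTML(html)
```

Because the CSS selector targets only `.right_aligned_df`, tables rendered without that class should keep their default alignment.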

You can also define the style in one of the previous cells, and then just set the classes parameter of the to_html method:

    # Some cell at the beginning of the notebook
    In [2]: HTML('''<style>
                        .right_aligned_df td { text-align: right; }
                        .left_aligned_df td { text-align: left; }
                        .pink_df { background-color: pink; }
                    </style>''')

    ...

    # Much later in your notebook
    In [66]: HTML(df.to_html(classes='pink_df'))
Viktor Kerkez answered Sep 20 '22 23:09



This question was asked a long time ago. Back then, pandas didn't yet include the Styler class (accessed through df.style). It was added in version 0.17.1.

Here's how you would use this to achieve your desired goal and some more:

  • center the header
  • right-align any numeric columns
  • left-align the other columns
  • add a formatter for the numeric columns, as you wanted
  • give every column the same width

Here's some example data:

    In [1]: df = pd.DataFrame(np.random.rand(10,3)*2000, columns=['A','B','C'])
            df['D'] = np.random.randint(0,10000,size=10)
            df['TextCol'] = np.random.choice(['a','b','c'], 10)
            df.dtypes

    Out[1]: A          float64
            B          float64
            C          float64
            D            int64
            TextCol     object
            dtype: object

Let's format this using df.style:

    # Construct a mask of which columns are numeric
    numeric_col_mask = df.dtypes.apply(lambda d: issubclass(np.dtype(d).type, np.number))

    # Dict used to center the table headers
    d = dict(selector="th",
             props=[('text-align', 'center')])

    # Style
    df.style.set_properties(subset=df.columns[numeric_col_mask], # right-align the numeric columns and set their width
                            **{'width':'10em', 'text-align':'right'})\
            .set_properties(subset=df.columns[~numeric_col_mask], # left-align the non-numeric columns and set their width
                            **{'width':'10em', 'text-align':'left'})\
            .format(lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x), # format the numeric values
                    subset=pd.IndexSlice[:,df.columns[numeric_col_mask]])\
            .set_table_styles([d]) # center the header

[Image: result using df.style]
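As a side note, the split into numeric and non-numeric columns can also be done with `select_dtypes`, which some may find easier to read than the dtype mask. This is an alternative sketch, not what the snippet above does, and the small dataframe here is made up for illustration:

```python
import pandas as pd

# Made-up sample data with two numeric columns and one text column.
df = pd.DataFrame({
    'A': [1234.5, 0.25],
    'D': [10, 20000],
    'TextCol': ['a', 'b'],
})

# Names of the numeric columns, selected by dtype.
numeric_cols = df.select_dtypes(include='number').columns

# Everything else, in the original column order.
other_cols = df.columns[~df.columns.isin(numeric_cols)]

print(list(numeric_cols), list(other_cols))
```

Either index can then be passed as the `subset` argument of the styling calls.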


Note that instead of calling .format on the subset columns, you could set the global default pd.options.display.float_format:

pd.options.display.float_format = lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x) 
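If you would rather not change the option for the whole session, a scoped variant is possible with `pd.option_context`. The sketch below uses a made-up one-column dataframe and only exercises `to_html` and `to_string`; whether `df.style` output honors this option is not shown here:

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame({'price': [1234.5678, 12.3456]})

float_frmt = lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x)

# The option is applied only inside the `with` block and restored afterwards.
with pd.option_context('display.float_format', float_frmt):
    html = df.to_html()
    text = df.to_string()

print(text)
```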
Julien Marrec answered Sep 20 '22 23:09
