 

Python 3.6+ logger to log a pandas DataFrame - how to indent the entire DataFrame?

I need to use the Python logging module to log a pandas DataFrame, with the entire DataFrame (all rows) indented equally.

Below is the simple desired output:

Test Dataframe Output Below:

       col1  col2
    0     1     3
    1     2     4

However, I am getting the following output where the indentation is only applied to the first row of the dataframe:

Test Dataframe Output Below:

       col1  col2
0     1     3
1     2     4

Sample code I am running is:

import pandas as pd
import logging

# sample dataframe
test_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# logging set up
logging.basicConfig(level=logging.INFO)
logging.getLogger().handlers.clear()
c_handler = logging.StreamHandler()
c_handler.setFormatter(logging.Formatter('%(message)s'))
logging.getLogger().addHandler(c_handler)

# log the pandas dataframe to console
logging.info(f'\tTest Dataframe Output Below:')
logging.info(f'\n\t\t{test_df}')
logging.info(f'{test_df}')

Any help will be greatly appreciated!

user32147 asked Apr 19 '19 02:04




1 Answer

logging.info('\t' + test_df.to_string().replace('\n', '\n\t'))

Ke Zhang answered Oct 24 '22 21:10
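
For context on why the original attempt indents only the first row: the f-string inserts the DataFrame's multi-line text after a single '\n\t\t', so only the header line gets the tabs and every later line starts right after its own embedded newline. The answer above fixes this by adding the tab after each newline. A possible alternative, shown here as a minimal sketch rather than a definitive solution, is the standard library's textwrap.indent, which prefixes every non-blank line. The helper name indent_frame and the logger name df_demo are hypothetical, chosen only for illustration.

```python
import logging
import textwrap

import pandas as pd


def indent_frame(df, prefix="\t"):
    """Return the DataFrame rendered as text with `prefix` on every line.

    `indent_frame` is a hypothetical helper name, not part of pandas.
    """
    return textwrap.indent(df.to_string(), prefix)


# Sample dataframe from the question
test_df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# A dedicated logger ("df_demo" is an assumed name) that prints only the message
logger = logging.getLogger("df_demo")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)
logger.propagate = False  # avoid duplicate output via the root logger

logger.info("Test Dataframe Output Below:")
# The leading newline puts a blank line before the table, as in the desired output
logger.info("\n%s", indent_frame(test_df, "\t\t"))
```

Passing the indented text as a logging argument ("%s") instead of building an f-string defers the string formatting until the record is actually emitted. The call to to_string() still runs either way, so for very large frames it may be worth checking logger.isEnabledFor(logging.INFO) first.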