
Pandas to_html() truncates string contents

I have a Python Pandas DataFrame object containing textual data. My problem is that when I use the to_html() function, it truncates the strings in the output.

For example:

    import pandas
    df = pandas.DataFrame({'text': ['Lorem ipsum dolor sit amet, consectetur adipiscing elit.']})
    print(df.to_html())

The output is truncated at adipis...

    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>text</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td> Lorem ipsum dolor sit amet, consectetur adipis...</td>
        </tr>
      </tbody>
    </table>

There is a related question on SO, but it uses placeholders and search/replace functionality to postprocess the HTML, which I would like to avoid:

  • Writing full contents of Pandas dataframe to HTML table

Is there a simpler solution to this problem? I could not find anything relevant in the documentation.

Timo asked Oct 09 '14 11:10




2 Answers

What you are seeing is pandas truncating the output for display purposes only.

The default value of display.max_colwidth is 50, which is why the string is cut off at that length.

You can set this value to whatever you like, or you can set it to -1, which effectively turns the limit off:

pd.set_option('display.max_colwidth', -1) 

I would advise against this, though; it is better to set it to a width that can be displayed comfortably in your console or IPython.

A list of the options can be found here: http://pandas.pydata.org/pandas-docs/stable/options.html
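A minimal sketch of the global approach described above, assuming pandas 1.0 or later, where None is accepted in place of -1 (passing -1 was deprecated in 1.0). The names df, default_width and full_html are only illustrative, and behaviour may differ between pandas versions:

```python
import pandas as pd

df = pd.DataFrame({'text': ['Lorem ipsum dolor sit amet, consectetur adipiscing elit.']})

# Inspect the current limit (50 by default, as noted in the answer above).
default_width = pd.get_option('display.max_colwidth')

# Lift the limit globally. None is an assumption for pandas >= 1.0;
# the original answer, written for older releases, used -1.
pd.set_option('display.max_colwidth', None)
full_html = df.to_html()

# Put the option back to its library default so console output stays readable.
pd.reset_option('display.max_colwidth')
```

With the limit lifted, the generated table cell should hold the whole sentence rather than a string ending in "...".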

EdChum answered Oct 04 '22 11:10


It seems that pd.set_option('display.max_colwidth', -1) is indeed the only option. To avoid permanently changing how dataframes are presented in the console, you can save the previous setting in a variable and restore it right after use, as follows:

    import pandas as pd

    # some_data is the DataFrame to export
    old_width = pd.get_option('display.max_colwidth')
    pd.set_option('display.max_colwidth', -1)
    with open('some_file.html', 'w') as f:
        f.write(some_data.to_html())
    pd.set_option('display.max_colwidth', old_width)
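The same save-and-restore idea can probably be written with pandas' option_context context manager, which restores the previous value on exit even if writing fails. This is only a sketch, assuming pandas 1.0 or later (None instead of -1); the sample frame, the temporary directory and out_path are hypothetical stand-ins:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for some_data from the answer above.
some_data = pd.DataFrame({'text': ['Lorem ipsum dolor sit amet, consectetur adipiscing elit.']})

# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.mkdtemp(), 'some_file.html')

width_before = pd.get_option('display.max_colwidth')

# The option is changed only inside the with block and restored afterwards.
with pd.option_context('display.max_colwidth', None):
    with open(out_path, 'w') as fh:
        fh.write(some_data.to_html())

width_after = pd.get_option('display.max_colwidth')
```

Compared with calling set_option twice, there is no window in which an exception could leave the global setting changed.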
Boris Gorelik answered Oct 04 '22 09:10