I have a Python Pandas DataFrame object containing textual data. My problem is that when I use the to_html() function, it truncates the strings in the output.
For example:
import pandas
df = pandas.DataFrame({'text': ['Lorem ipsum dolor sit amet, consectetur adipiscing elit.']})
print(df.to_html())
The output is truncated at adipis...
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>text</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td> Lorem ipsum dolor sit amet, consectetur adipis...</td>
    </tr>
  </tbody>
</table>
There is a related question on SO, but it relies on placeholders and search/replace to postprocess the HTML, which I would like to avoid.
Is there a simpler solution to this problem? I could not find anything related in the documentation.
What you are seeing is pandas truncating the output for display purposes only.
The default max_colwidth value is 50, which is the limit you are hitting.
You can set this value to whatever you like, or remove the limit entirely. Older pandas versions used -1 for that; since pandas 1.0, -1 is deprecated in favor of None:
pd.set_option('display.max_colwidth', None)
I would advise against turning it off globally, though. It is better to set it to a width that displays comfortably in your console or IPython.
A list of the options can be found here: http://pandas.pydata.org/pandas-docs/stable/options.html
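As a rough sketch of the advice above, the limit can be raised to a finite width instead of being switched off. The value 100 and the names long_text and html are only illustrative choices, and the exact truncation behavior of to_html() may differ between pandas versions:

```python
import pandas as pd

long_text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.'
df = pd.DataFrame({'text': [long_text]})

# Raise the limit to a finite width that still fits a console.
# 100 is an arbitrary choice for this example, not a recommended value.
pd.set_option('display.max_colwidth', 100)
html = df.to_html()

# Return the option to the pandas default afterwards.
pd.reset_option('display.max_colwidth')
```

The sample string is shorter than the new limit, so it should appear in full in html, and reset_option() puts the default back once the export is done.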
It seems that pd.set_option('display.max_colwidth', -1) (or None on pandas 1.0 and later, where -1 is deprecated) is indeed the only option. To avoid permanently changing how dataframes are presented in the console, you can save the previous setting in a variable and restore it immediately after use, as follows:
old_width = pd.get_option('display.max_colwidth')
# None disables the limit on pandas >= 1.0; older versions used -1.
pd.set_option('display.max_colwidth', None)
# Write the HTML and close the file properly.
with open('some_file.html', 'w') as f:
    f.write(some_data.to_html())
# Restore the previous display setting.
pd.set_option('display.max_colwidth', old_width)
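The same save-and-restore pattern can also be written with pandas' option_context context manager, which restores the option automatically. This is a sketch rather than the answer's original method; it assumes pandas 1.0 or later (for None), and the some_data frame here is a stand-in built from the question's sample text:

```python
import pandas as pd

# Stand-in data for illustration; replace with your own DataFrame.
some_data = pd.DataFrame(
    {'text': ['Lorem ipsum dolor sit amet, consectetur adipiscing elit.']}
)

width_before = pd.get_option('display.max_colwidth')

# The option is changed only inside the with-block and is restored on exit,
# even if to_html() raises an exception.
with pd.option_context('display.max_colwidth', None):
    html = some_data.to_html()

# Outside the block the previous setting is back in effect.
width_after = pd.get_option('display.max_colwidth')
```

Because the restore happens on exit from the block, there is no window in which a failed export leaves the global setting changed.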