Pretty printing newlines inside a string in a Pandas DataFrame

Tags: python, string, python-3.x, pandas, printing

I have a Pandas DataFrame in which one of the columns contains string elements, and those string elements contain new lines that I would like to print literally. But they just appear as \n in the output.

That is, I want to print this:

  pos     bidder
0   1
1   2
2   3  <- alice
       <- bob
3   4

but this is what I get:

  pos            bidder
0   1
1   2
2   3  <- alice\n<- bob
3   4

How can I accomplish what I want? Can I use a DataFrame, or will I have to revert to manually printing padded columns one row at a time?

Here's what I have so far:

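import pandas as pd

n = 4
output = pd.DataFrame({
    'pos': range(1, n+1),
    'bidder': [''] * n
})
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    row = pos - 1  # positions are 1-based, row labels are 0-based
    if pos in used_pos:
        # Position already taken: append this bidder on a new line.
        arrow = output.loc[row, 'bidder']
        output.loc[row, 'bidder'] = arrow + "\n<- %s" % bidder
    else:
        output.loc[row, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)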

asked Dec 16 '15 21:12

shadowtalker

4 Answers

If you're trying to do this in ipython notebook, you can do:

from IPython.display import display, HTML

def pretty_print(df):
    return display( HTML( df.to_html().replace("\\n","<br>") ) )
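
The double backslash in the replace call matters: as far as I can tell, to_html() does not emit a real line break for a newline inside a cell, but the two characters backslash and "n". A minimal sketch that checks this outside a notebook, assuming a small made-up DataFrame (df, raw_html and fixed_html are illustrative names, not part of the answer):

```python
import pandas as pd

# Hypothetical sample data with a newline embedded in one cell.
df = pd.DataFrame({"pos": [3], "bidder": ["<- alice\n<- bob"]})

raw_html = df.to_html()

# The cell's newline shows up in the markup as a backslash followed by "n",
# so that two-character sequence is what gets swapped for a <br> tag.
fixed_html = raw_html.replace("\\n", "<br>")

print("\\n" in raw_html)    # escaped newline present before the replace
print("<br>" in fixed_html)  # line-break tag present after the replace
```

Passing fixed_html to HTML(...) is then what pretty_print does in one step.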

answered Oct 17 '22 06:10

unsorted

Using pandas `.set_properties()` and CSS `white-space` property

[For use in IPython notebooks]

Another way is to use the pandas.io.formats.style.Styler.set_properties() method (available as df.style.set_properties()) together with the CSS property "white-space": "pre-wrap":

from IPython.display import display

# Assuming the variable df contains the relevant DataFrame
display(df.style.set_properties(**{
    'white-space': 'pre-wrap',
}))

To keep the text left-aligned, you might want to add 'text-align': 'left' as below:

from IPython.display import display

# Assuming the variable df contains the relevant DataFrame
display(df.style.set_properties(**{
    'text-align': 'left',
    'white-space': 'pre-wrap',
}))

answered Oct 17 '22 08:10

yongjieyongjie

Somewhat in line with unsorted's answer:

import pandas as pd

# Save the original `to_html` function to call it later
pd.DataFrame.base_to_html = pd.DataFrame.to_html
# Call it here in a controlled way
pd.DataFrame.to_html = (
    lambda df, *args, **kwargs: 
        (df.base_to_html(*args, **kwargs)
           .replace(r"\n", "<br/>"))
)

This way, you don't need to call any explicit function in Jupyter notebooks, as to_html is called internally. If you want the original function, call base_to_html (or whatever you named it).

I'm using jupyter 1.0.0, notebook 5.7.6.
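
If patching to_html for the whole session feels too invasive, the same idea can be limited to a block of code. This is only a sketch under my own assumptions: html_line_breaks is a hypothetical helper name, not a pandas or Jupyter API, and it restores the original method when the block ends.

```python
from contextlib import contextmanager

import pandas as pd


@contextmanager
def html_line_breaks():
    """Temporarily make DataFrame.to_html emit <br/> for embedded newlines.

    Hypothetical helper; it applies the same replace as the patch above.
    """
    original = pd.DataFrame.to_html

    def patched(df, *args, **kwargs):
        result = original(df, *args, **kwargs)
        # to_html returns None when it writes to a buffer, so guard for that.
        if isinstance(result, str):
            return result.replace(r"\n", "<br/>")
        return result

    pd.DataFrame.to_html = patched
    try:
        yield
    finally:
        # Always put the original method back, even after an error.
        pd.DataFrame.to_html = original


df = pd.DataFrame({"bidder": ["<- alice\n<- bob"]})
with html_line_breaks():
    patched_html = df.to_html()   # contains <br/>
restored_html = df.to_html()      # original behaviour again
```

Inside the with block anything that calls to_html gets the line breaks; outside it pandas behaves as usual.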

answered Oct 17 '22 07:10

Roger d'Amiens

From the pandas.DataFrame documentation:

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

So you can't have a row without an index, and a newline "\n" inside a cell won't be printed as a line break when the DataFrame is printed.

You could overwrite 'pos' with an empty value, and output the next 'bidder' on the next row. But then index and 'pos' would be offset every time you do that. Like:

  pos    bidder
0   1          
1   2          
2   3  <- alice
3        <- bob
4   5

So if a bidder called 'frank' had 4 as value, it would overwrite 'bob'. This would cause more problems as you add bidders. It is probably possible to stick with a DataFrame and write code to work around this issue, but it is probably worth looking into other solutions.

Here is the code to produce the output structure above.

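import pandas as pd

n = 5
output = pd.DataFrame({'pos': range(1, n + 1),
                      'bidder': [''] * n},
                      columns=['pos', 'bidder'])
# Use object dtype so a 'pos' cell can be blanked out with ''.
output['pos'] = output['pos'].astype(object)
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    if pos in used_pos:
        # Position already taken: put this bidder on the following row
        # and blank that row's 'pos' (.ix was removed in pandas 1.0).
        output.loc[pos, 'bidder'] = "<- %s" % bidder
        output.loc[pos, 'pos'] = ''
    else:
        # Positions are 1-based; row labels are 0-based.
        output.loc[pos - 1, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)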
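
As for writing code to work around the issue: one possible sketch, for console output only, is to leave the data alone and build a separate display copy in which every embedded newline starts a continuation row with a blank index label. expand_multiline_rows is a hypothetical helper of mine, not something pandas provides, and it turns every value into a string.

```python
import pandas as pd


def expand_multiline_rows(df):
    """Return a display copy of df where each embedded newline starts a new row.

    Hypothetical helper for printing only; all values become strings.
    """
    rows, labels = [], []
    for label, row in df.iterrows():
        # Split every cell into its lines; cells without "\n" give one line.
        cells = [str(value).split("\n") for value in row]
        height = max(len(lines) for lines in cells)
        for i in range(height):
            # Pad shorter cells with empty strings on continuation rows.
            rows.append([lines[i] if i < len(lines) else "" for lines in cells])
            # Only the first line of a row keeps its index label.
            labels.append(str(label) if i == 0 else "")
    return pd.DataFrame(rows, index=labels, columns=df.columns)


df = pd.DataFrame({"pos": [1, 2, 3, 4],
                   "bidder": ["", "", "<- alice\n<- bob", ""]})
expanded = expand_multiline_rows(df)
print(expanded.to_string())
```

The original DataFrame keeps its index and 'pos' values intact, so the offset problem described above does not arise; only the printed copy gets the extra row.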

Edit:

Another option is to restructure the data and output. You could have pos as columns, and create a new row for each key/person in the data. In the code example below it prints the DataFrame with NaN values replaced with an empty string.

import pandas as pd

data = {'johnny\nnewline': 2, 'alice': 3, 'bob': 3,
        'frank': 4, 'lisa': 1, 'tom': 8}
n = range(1, max(data.values()) + 1)

# Create DataFrame with columns = pos
output = pd.DataFrame(columns=n, index=[])

# Populate DataFrame with rows
for index, (bidder, pos) in enumerate(data.items()):
    output.loc[index, pos] = bidder

# Print the DataFrame and remove NaN to make it easier to read.
print(output.fillna(''))

# Fetch and print every element in column 2
for index in output.index:
    print(output.loc[index, 2])

It depends what you want to do with the data though. Good luck :)

answered Oct 17 '22 06:10

oystein-hr
