pandas cut: how to convert categorical labels to strings (otherwise cannot export to Excel)?

Tags:

I use pandas.cut() to discretise a continuous variable into a range, and then group by the result.

After a lot of swearing because I couldn't figure out what was wrong, I have learnt that, if I don't supply custom labels to the cut() function, but rely on the default, then the output cannot be exported to excel. If I try this:

import pandas as pd
import numpy as np    

writer = pd.ExcelWriter('test.xlsx')
wk = writer.book.add_worksheet('Test')

df= df= pd.DataFrame(np.random.randint(1,10,(10000,5)), columns=['a','b','c','d','e'])
df['range'] = pd.cut( df['a'],[-np.inf,3,8,np.inf] )
grouped=df.groupby('range').sum()
grouped.to_excel(writer, 'Export')
writer.close()

I get:

raise TypeError("Unsupported type %s in write()" % type(token))
TypeError: Unsupported type <class 'pandas._libs.interval.Interval'> in write()
which it took me a while to decypher.

If instead I do assign labels:

df['range'] = pd.cut( df['a'],[-np.inf,3,8,np.inf], labels =['<3','3-8','>8'] )

then it all runs fine. Any suggestions on how to handle this without assigning custom labels? In the initial phase of my work I tend not to assign labels, because I still don't know how many bins I want - it's a trial and error approach, and assigning labels at each attempt would be time-consuming.

I am not sure if this can count as a bug, but at the very least it seems like a poorly documented annoyance!

279

asked Oct 16 '17 16:10

Pythonista anonymous

1 Answers

Use astype(str):

writer = pd.ExcelWriter('test.xlsx')
wk = writer.book.add_worksheet('Test')

df= df= pd.DataFrame(np.random.randint(1,10,(10000,5)), columns=['a','b','c','d','e'])
df['range'] = pd.cut( df['a'],[-np.inf,3,8,np.inf] ).astype(str)
grouped=df.groupby('range').sum()
grouped.to_excel(writer, 'Export')
writer.close()

Output in excel:

range   a   b   c   d   e
(-inf, 3.0] 6798    17277   16979   17266   16949
(3.0, 8.0]  33150   28051   27551   27692   27719
(8.0, inf]  9513    5153    5318    5106    5412

enter image description here

121

answered Oct 05 '22 07:10

Scott Boston

Related questions
                            
                                What part of speech does "s" stand for in WordNet synsets
                            
                                selenium.common.exceptions.WebDriverException: Message: 'Can not connect to GhostDriver'
                            
                                multiprocessing.Process.is_alive() returns True although process has finished, why?
                            
                                argparse argument dependency
                            
                                Multiprocessing of shared list
                            
                                How to Zoom with Axes3D in Matplotlib
                            
                                Why does python print version info to stderr?
                            
                                How to aggregate matching pairs into "connected components" in Python
                            
                                weights option for seaborn distplot?
                            
                                how to add href link in email content when sending email through smtplib
                            
                                PyCharm asks for python interpreter every time project is loaded
                            
                                Python Pandas inferring column datatypes
                            
                                TensorFlow in production for real time predictions in high traffic app - how to use?
                            
                                Numpy flatten RGB image array
                            
                                Matplotlib histogram from numpy histogram output [duplicate]
                            
                                How to check if you are in a Jupyter notebook
                            
                                Implementation of multiprocessing.Queue and queue.Queue
                            
                                `.loc` and `.iloc` with MultiIndex'd DataFrame
                            
                                How to run multiple graphs in a Session - Tensorflow API
                            
                                How to run python program in IOS Swift app

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

pandas cut: how to convert categorical labels to strings (otherwise cannot export to Excel)?

Tags:

python

pandas

dataframe

export-to-excel

Pythonista anonymous

People also ask

1 Answers

Scott Boston

Recent Activity

Donate For Us