Truncating column width in pandas

Tags:

python

pandas

I'm reading large CSV files into pandas, some of them with string columns that are thousands of characters long. Is there a quick way to limit the width of a column, i.e. keep only the first 100 characters?


627

asked Apr 01 '14 17:04

Luke

1 Answer

If you can read the whole thing into memory, you can use the str accessor, which applies string operations to the whole column at once:

>>> import pandas as pd
>>> df = pd.read_csv("toolong.csv")
>>> df
   a                       b  c
0  1  1256378916212378918293  2

[1 rows x 3 columns]
>>> df["b"] = df["b"].str[:10]
>>> df
   a           b  c
0  1  1256378916  2

[1 rows x 3 columns]
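For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same slicing. The CSV text, the `max_width` name, and the use of `dtype={"b": str}` are my own assumptions for illustration; forcing the column to `str` simply guarantees that the `.str` accessor is available even when the values look numeric.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file such as toolong.csv.
csv_text = "a,b,c\n1,abcdefghijklmnopqrstuvwxyz,2\n3,short,4\n"

# Read column b as text so that .str works regardless of what the values look like.
df = pd.read_csv(io.StringIO(csv_text), dtype={"b": str})

max_width = 10  # the question asks for 100; 10 keeps the demo readable

# Slice every string in the column; shorter strings are left unchanged.
df["b"] = df["b"].str[:max_width]

print(df["b"].tolist())
```

Values that are already shorter than `max_width` pass through untouched, so there is no need to check lengths first.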

Also note that you can get a Series with lengths using

>>> df["b"].str.len()
0    10
Name: b, dtype: int64
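Everything so far assumes the file fits in memory. If it does not, one possible variation is to read it in pieces with the `chunksize` parameter of `read_csv` and truncate each piece before keeping it. This is only a sketch under assumed names (`csv_text`, `max_width`, `pieces`) and a tiny made-up input; a real chunk size would be much larger.

```python
import io

import pandas as pd

# Hypothetical input: five rows, each with a 50-character string in column b.
csv_text = "a,b,c\n" + "".join(f"{i},{'x' * 50},{i}\n" for i in range(5))

max_width = 10
pieces = []

# With chunksize set, read_csv yields DataFrames of at most that many rows.
for chunk in pd.read_csv(io.StringIO(csv_text), dtype={"b": str}, chunksize=2):
    # Truncate before storing, so the long strings are only held one chunk at a time.
    chunk["b"] = chunk["b"].str[:max_width]
    pieces.append(chunk)

df = pd.concat(pieces, ignore_index=True)
print(df.shape)
```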

I was originally wondering if

>>> pd.read_csv("toolong.csv", converters={"b": lambda x: x[:5]})
   a      b  c
0  1  12563  2

[1 rows x 3 columns]

would be better, but I don't actually know whether the converters are called row by row or after the fact on the whole column.
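One way to check what a converter receives is to record its arguments. The sketch below uses a hypothetical `truncate` helper and made-up CSV text; in the pandas versions I am familiar with, the function is handed one cell at a time as a plain string rather than the whole column, but treat that as an observation to verify on your own version rather than a guarantee.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file.
csv_text = "a,b,c\n1,abcdefghijklmnopqrstuvwxyz,2\n3,short,4\n"

seen = []  # every argument the converter is called with


def truncate(value, width=10):
    """Record the raw value, then keep only its first `width` characters."""
    seen.append(value)
    return value[:width]


df = pd.read_csv(io.StringIO(csv_text), converters={"b": truncate})

print(df["b"].tolist())
print(seen)
```

If the converter is indeed called once per cell, it runs as an ordinary Python function call for each value, which may be slower on large files than slicing the column with `.str` after reading. Timing both on a sample of the real data is the safest way to choose.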


78

answered Nov 14 '22 21:11

DSM
