Is there a way to convert values like '34%' directly to int or float when using <code>read_csv</code> in pandas? I want '34%' to be read directly as 0.34. Here is what I have tried: <ol> <li> Passing a dtype to <code>read_csv</code> did not work: <code>read_csv(..., dtype={'col':np.float})</code> </li> <li> After loading the CSV as <code>df</code>, this also failed with the error "invalid literal for float(): 34%": <code>df['col'] = df['col'].astype(float)</code> </li> <li> I ended up using this, which works but is long-winded: <code>df['col'] = df['col'].apply(lambda x: np.nan if x in ['-'] else x[:-1]).astype(float)/100</code> </li> </ol>

Convert percent string to float in pandas read_csv

2 Answers

You were very close with your df attempt. Try changing:

    df['col'] = df['col'].astype(float)

to:

    df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
    #                     ^ use str funcs to elim '%'     ^ divide by 100
    # could also be:     .str[:-1].astype(...

Pandas supports Python's string processing functions on string columns. Just precede the string function you want with .str and see if it does what you need. (This includes string slicing, too, of course.)

Above we use .str.rstrip() to remove the trailing percent sign, then divide the whole column by 100.0 to convert from a percentage to the actual value. For example, 45% is equivalent to 0.45.
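As a minimal sketch of that conversion, assume a small hypothetical column that also contains the '-' placeholder mentioned in the question. Here pd.to_numeric with errors='coerce' is swapped in for astype, so entries that cannot be parsed become NaN instead of raising:

```python
import pandas as pd

# Hypothetical sample column; '-' marks a missing value, as in the question.
df = pd.DataFrame({"col": ["34%", "50%", "-", "12%"]})

# Strip the trailing '%' from every entry.
stripped = df["col"].str.rstrip("%")

# Coerce anything non-numeric (here '-') to NaN, then scale percent to fraction.
df["col"] = pd.to_numeric(stripped, errors="coerce") / 100.0

print(df["col"].tolist())
```

If the column never contains placeholders, the plain .astype('float') version above should behave the same way.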

Although .str.rstrip('%') could also just be .str[:-1], I prefer to explicitly remove the '%' rather than blindly removing the last character, in case some value has no trailing percent sign.
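To illustrate the difference, here is a small sketch with hypothetical values where the last entry has no trailing '%'. Slicing silently drops a digit, while rstrip leaves the value intact:

```python
import pandas as pd

# Hypothetical values: the last entry is missing its '%'.
s = pd.Series(["34%", "50%", "12"])

by_rstrip = s.str.rstrip("%")  # removes '%' only where it is present
by_slice = s.str[:-1]          # always drops the final character

print(by_rstrip.tolist())  # ['34', '50', '12']
print(by_slice.tolist())   # ['34', '50', '1']
```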

answered Sep 28 '22 09:09

Gary02127

You can define a custom function to convert your percents to floats at read_csv() time:

    import io
    import pandas as pd

    # dummy data
    temp1 = """index col
    113 34%
    122 50%
    123 32%
    301 12%"""

    # Custom function taken from https://stackoverflow.com/questions/12432663/what-is-a-clean-way-to-convert-a-string-percent-to-a-float
    def p2f(x):
        return float(x.strip('%'))/100

    # Pass to `converters` param as a dict...
    df = pd.read_csv(io.StringIO(temp1), sep=r'\s+', index_col=[0], converters={'col':p2f})
    df

           col
    index
    113    0.34
    122    0.50
    123    0.32
    301    0.12

    # Check that dtypes really are floats
    df.dtypes

    col    float64
    dtype: object

My percent to float code is courtesy of ashwini's answer: What is a clean way to convert a string percent to a float?
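The question also mentions '-' entries, which p2f as written would reject with a ValueError. One possible variation is sketched below; the name percent_to_float and the sample CSV text are hypothetical, and it assumes that '-' or an empty cell should be read as NaN:

```python
import io
import pandas as pd

def percent_to_float(text):
    """Hypothetical converter: '34%' -> 0.34, '-' or empty -> NaN."""
    text = text.strip()
    if text in ("", "-"):
        return float("nan")
    return float(text.rstrip("%")) / 100

# Made-up sample data with one '-' placeholder.
csv_text = "index,col\n113,34%\n122,-\n123,32%\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    index_col=0,
    converters={"col": percent_to_float},  # called once per cell with the raw string
)

print(df)
print(df.dtypes)
```

Because the converter returns a float for every cell, the resulting column should come out as float64 without a separate astype step.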

answered Sep 28 '22 09:09

EdChum


Tags:

python

pandas

KieranPC
