Pandas - How to replace string with zero values in a DataFrame series?

Tags: python, pandas, dataframe

I'm importing some csv data into a Pandas DataFrame (in Python). One series is meant to be all numerical values. However, it also contains some spurious "$-" elements represented as strings. These have been left over from previous formatting. If I just import the series, Pandas reports it as a series of 'object'.

What's the best way to replace these "$-" strings with zeros? Or more generally, how can I replace all the strings in a series (which is predominantly numerical), with a numerical value, and convert the series to a floating point type?

Steve

735

asked Oct 30 '15 16:10

Steve Maughan

1 Answer

You can use the convert_objects method of the DataFrame, with convert_numeric=True, to change the strings to NaNs. Note that convert_objects was deprecated in pandas 0.17 and removed in 1.0.

From the docs:

convert_numeric: If True, attempt to coerce to numbers (including strings), with unconvertible values becoming NaN.

In [17]: df
Out[17]: 
    a   b  c
0  1.  2.  4
1  sd  2.  4
2  1.  fg  5

In [18]: df2 = df.convert_objects(convert_numeric=True)

In [19]: df2
Out[19]: 
    a   b  c
0   1   2  4
1 NaN   2  4
2   1 NaN  5
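
For pandas versions where convert_objects is unavailable, a minimal sketch of the same coercion uses pd.to_numeric with errors="coerce". The frame below is rebuilt by hand from the example above, so its exact construction is an assumption:

```python
import pandas as pd

# Rebuild the example frame by hand (assumed construction);
# the stray strings force columns a and b to object dtype.
df = pd.DataFrame({
    "a": ["1.", "sd", "1."],
    "b": ["2.", "2.", "fg"],
    "c": [4, 4, 5],
})

# Coerce every column to numbers; values that cannot be parsed become NaN.
df2 = df.apply(pd.to_numeric, errors="coerce")
print(df2.dtypes)
```

To convert a single column only, pd.to_numeric can be called on that Series directly, e.g. pd.to_numeric(df["a"], errors="coerce").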

Finally, if you want to convert those NaNs to 0's, you can use df.fillna (df.replace('NaN', 0) looks for the string 'NaN' and may leave real NaN values untouched)

In [20]: df2.fillna(0)
Out[20]: 
   a  b  c
0  1  2  4
1  0  2  4
2  1  0  5
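
For the "$-" case in the question, one option is to handle the placeholder at import time. The sketch below assumes a single hypothetical column named amount and uses an in-memory string in place of the real csv file; na_values tells read_csv to treat "$-" as missing, so the column is parsed as floating point from the start:

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the asker's file.
csv_text = "amount\n1.5\n$-\n3\n"

# Treat "$-" as a missing value while parsing, so the column is read as float.
df = pd.read_csv(io.StringIO(csv_text), na_values=["$-"])

# Replace the resulting NaNs with zero.
df["amount"] = df["amount"].fillna(0)
print(df["amount"].tolist())
```

This avoids a separate conversion step, but it only catches the placeholder strings listed in na_values; any other stray text would still leave the column as object dtype.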

115

answered Nov 15 '22 02:11

tmdavison
