How to swap the 0 and 1 values for each other in a pandas data frame?

I am working with a pandas DataFrame that has a column of all 0's and 1's, and I am trying to swap the values (i.e., all of the 0's become 1's and all of the 1's become 0's). Is there an easy way to do this?

476

asked Jul 14 '17 04:07

jharkins

2 Answers

Use replace:

df = df.replace({0:1, 1:0})

Or, faster, use numpy.logical_xor:

df = np.logical_xor(df,1).astype(int)

Or, faster still, apply it to the underlying NumPy array and rebuild the DataFrame:

df = pd.DataFrame(np.logical_xor(df.values,1).astype(int),columns=df.columns, index=df.index)

Sample:

np.random.seed(12)
df = pd.DataFrame(np.random.choice([0,1], size=[10,3]))
print (df)
   0  1  2
0  1  1  0
1  1  1  0
2  1  1  0
3  0  0  1
4  0  1  1
5  1  0  1
6  0  0  0
7  1  0  0
8  1  0  1
9  1  0  0

df = df.replace({0:1, 1:0})
print (df)
   0  1  2
0  0  0  1
1  0  0  1
2  0  0  1
3  1  1  0
4  1  0  0
5  0  1  0
6  1  1  1
7  0  1  1
8  0  1  0
9  0  1  1

Another solution:

df = (~df.astype(bool)).astype(int)
print (df)
   0  1  2
0  0  0  1
1  0  0  1
2  0  0  1
3  1  1  0
4  1  0  0
5  0  1  0
6  1  1  1
7  0  1  1
8  0  1  0
9  0  1  1
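As a quick sanity check, the approaches above can be compared against each other. The following is a minimal sketch, assuming a frame that contains only 0 and 1; the dictionary name `candidates` is made up for illustration, and `df ^ 1` and `1 - df` are included because they appear in the timings and in the other answer:

```python
import numpy as np
import pandas as pd

np.random.seed(12)
df = pd.DataFrame(np.random.choice([0, 1], size=[10, 3]))

# Each entry is one way of swapping 0 and 1; none of them modifies df.
candidates = {
    "replace": df.replace({0: 1, 1: 0}),
    "logical_xor": pd.DataFrame(
        np.logical_xor(df.to_numpy(), 1).astype(int),
        columns=df.columns, index=df.index),
    "negate_bool": (~df.astype(bool)).astype(int),
    "xor": df ^ 1,
    "subtract": 1 - df,
}

# Reference result computed directly on the NumPy array.
expected = 1 - df.to_numpy()
all_agree = all(
    np.array_equal(out.to_numpy(), expected) for out in candidates.values()
)
print(all_agree)
```

Note that the arithmetic and bitwise variants only behave as a swap when every value is exactly 0 or 1; `replace` leaves any other value untouched.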

Timings:

np.random.seed(12)
df = pd.DataFrame(np.random.choice([0,1], size=[10000,10000]))
print (df)

In [69]: %timeit (np.logical_xor(df,1).astype(int))
1 loop, best of 3: 1.42 s per loop

In [70]: %timeit (df ^ 1)
1 loop, best of 3: 2.53 s per loop

In [71]: %timeit ((~df.astype(bool)).astype(int))
1 loop, best of 3: 1.81 s per loop

In [72]: %timeit (df.replace({0:1, 1:0}))
1 loop, best of 3: 5.08 s per loop

In [73]: %timeit pd.DataFrame(np.logical_xor(df.values,1).astype(int), columns=df.columns, index=df.index)
1 loop, best of 3: 350 ms per loop
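Outside IPython, the same comparison can be reproduced with the standard timeit module. This is only a sketch: the helper name `time_swaps` is hypothetical, the frame is smaller than the one above so it runs quickly, and absolute numbers will vary with the machine and the pandas/NumPy versions:

```python
import timeit

import numpy as np
import pandas as pd


def time_swaps(df, repeat=3):
    """Return the best-of-`repeat` wall time in seconds for each approach."""
    approaches = {
        "logical_xor on DataFrame": lambda: np.logical_xor(df, 1).astype(int),
        "df ^ 1": lambda: df ^ 1,
        "negate bool": lambda: (~df.astype(bool)).astype(int),
        "replace": lambda: df.replace({0: 1, 1: 0}),
        "logical_xor on array": lambda: pd.DataFrame(
            np.logical_xor(df.to_numpy(), 1).astype(int),
            columns=df.columns, index=df.index),
    }
    # Run each approach once per repetition and keep the fastest run.
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeat))
        for name, fn in approaches.items()
    }


np.random.seed(12)
df = pd.DataFrame(np.random.choice([0, 1], size=[1000, 1000]))
timings = time_swaps(df)
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds * 1000:.1f} ms")
```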

Edit: Using numexpr should be faster still:

import numexpr as ne
arr = df.values
df = pd.DataFrame(ne.evaluate('1 - arr'),columns=df.columns, index=df.index)
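Since numexpr is an optional third-party package, the snippet above can be wrapped so that it falls back to plain NumPy when numexpr is not installed. This is a sketch under that assumption; the function name `swap_0_1_fast` is made up for illustration:

```python
import numpy as np
import pandas as pd


def swap_0_1_fast(df):
    """Return a new frame with 0 and 1 swapped, using numexpr if available."""
    arr = df.to_numpy()
    try:
        import numexpr as ne  # optional dependency, may not be installed
        # Pass the array explicitly instead of relying on frame lookup.
        flipped = ne.evaluate("1 - arr", local_dict={"arr": arr})
    except ImportError:
        flipped = 1 - arr  # plain NumPy fallback
    return pd.DataFrame(flipped, columns=df.columns, index=df.index)


df = pd.DataFrame([[0, 1], [1, 0]], columns=["x", "y"])
result = swap_0_1_fast(df)
print(result)
```

Whether numexpr actually wins depends on the array size; on small frames the overhead of compiling the expression can outweigh the gain.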

142

answered Oct 26 '22 07:10

jezrael

One easy way would be -

df[:] = 1-df.values

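For performance, we can work on the underlying array data and modify it in place. Note that this changes df only when df.values returns a writable view of the frame's data (typically a single-dtype frame with copy-on-write disabled). A modified version looks like so -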

a = df.values
a[:] = 1-a

Sample run -

In [43]: df
Out[43]: 
   0  1  2
0  0  0  1
1  0  0  1
2  0  0  1
3  1  1  0
4  1  0  0

In [44]: df[:] = 1-df.values

In [45]: df
Out[45]: 
   0  1  2
0  1  1  0
1  1  1  0
2  1  1  0
3  0  0  1
4  0  1  1
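Because it depends on the pandas version and settings whether `df.values` hands back a writable view, a variant that assigns back through the DataFrame itself is a safer default. A minimal sketch, with the hypothetical helper name `swap_0_1_inplace` -

```python
import numpy as np
import pandas as pd


def swap_0_1_inplace(df):
    """Flip 0 <-> 1 in place by assigning through the DataFrame itself."""
    # 1 - array builds a new array; df[:] = ... writes it back into df.
    df[:] = 1 - df.to_numpy()


df = pd.DataFrame([[0, 0, 1], [1, 0, 0]], columns=["a", "b", "c"])
swap_0_1_inplace(df)
print(df)
```

This trades a little speed for not relying on the array being a view of the frame's data.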

Using @jezrael's timing setup, here is the best solution from that setup compared against the one proposed in this post -

In [46]: np.random.seed(12)
    ...: df = pd.DataFrame(np.random.choice([0,1], size=[10000,10000]))
    ...: 

# Proposed in this post
In [47]: def swap_0_1(df):
    ...:     a = df.values
    ...:     a[:] = 1-a
    ...:     

In [48]: %timeit pd.DataFrame(np.logical_xor(df.values,1).astype(int), columns=df.columns, index=df.index)
10 loops, best of 3: 218 ms per loop

In [49]: %timeit swap_0_1(df)
10 loops, best of 3: 198 ms per loop

Or, even better, negate the boolean version of the input array data -

In [60]: def swap_0_1_bool(df):
    ...:     a = df.values
    ...:     a[:] = ~a.astype(bool)
    ...:     

In [63]: %timeit swap_0_1_bool(df)
10 loops, best of 3: 179 ms per loop
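The cast to bool matters here: on an integer array `~` is a bitwise NOT, while on a boolean array it is a logical NOT. A small illustration on a made-up four-element array -

```python
import numpy as np

a = np.array([0, 1, 1, 0])

bitwise = ~a                # bitwise NOT on integers: 0 -> -1, 1 -> -2
logical = ~a.astype(bool)   # logical NOT on booleans: False <-> True
flipped = logical.astype(a.dtype)  # back to 0/1 integers

print(bitwise.tolist())  # [-1, -2, -2, -1]
print(flipped.tolist())  # [1, 0, 0, 1]
```

Assigning the boolean result into the integer array with `a[:] = ...` performs the same conversion back to 0 and 1.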

answered Oct 26 '22 08:10

Divakar

Tags: python, pandas, dataframe, numpy
