I have the following pandas DataFrame.
import pandas as pd
df = pd.read_csv('filename.csv')
print(df)
    sample  column_A
0  sample1       6/6
1  sample2       0/4
2  sample3       2/6
3  sample4     12/14
4  sample5     15/21
5  sample6     12/12
.. ....
The values in column_A are not fractions, and these data must be manipulated such that I can convert each value into 0s and 1s (not convert the integers into their binary counterparts).
The "numerator" above gives the total number of 1s, while the "denominator" gives the total number of 0s and 1s together.
So, the table should actually be in the following format:
    sample               column_A
0  sample1                 111111
1  sample2                   0000
2  sample3                 110000
3  sample4         11111111111100
4  sample5  111111111111111000000
5  sample6           111111111111
.. ....
I've never parsed an integer to output strings of 0s and 1s like this. How does one do this? Is there a "pandas method" to use with lambda expressions? Pythonic string parsing or regex?
First, suppose you write a function:
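def to_binary(s):
    # Split 'n/d' into the number of 1s (n) and the total length (d)
    n_d = s.split('/')
    n, d = int(n_d[0]), int(n_d[1])
    # n ones followed by (d - n) zeros
    return '1' * n + '0' * (d - n)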
So that,
>>> to_binary('4/5')
'11110'
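One caveat: in Python, multiplying a string by a negative number gives an empty string, so an input such as '5/3' would quietly return '11111' instead of failing. If you would rather have bad values raise, here is a minimal sketch of a stricter variant. The name to_binary_strict and the choice to raise ValueError are my assumptions, not part of the function above.

```python
def to_binary_strict(s):
    """Expand 'n/d' into n ones followed by (d - n) zeros, rejecting bad input."""
    # Unpacking raises ValueError unless the string contains exactly one '/'
    numerator, denominator = s.split('/')
    n, d = int(numerator), int(denominator)
    # Guard against values like '5/3', where (d - n) would be negative
    if not 0 <= n <= d:
        raise ValueError(f"expected 0 <= n <= d, got {s!r}")
    return '1' * n + '0' * (d - n)
```

Whether to raise or to let such rows through is a judgment call that depends on how clean your CSV is.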
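Now you just need to use pandas.Series.apply and assign the result back to the column, since apply returns a new Series rather than modifying df in place:
df['column_A'] = df.column_A.apply(to_binary)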
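To see the whole thing end to end, here is a small self-contained sketch. It assumes the six sample rows from the question are typed in directly rather than read from filename.csv, and it uses a slightly more compact form of the same function.

```python
import pandas as pd


def to_binary(s):
    # 'n/d' -> n ones followed by (d - n) zeros
    n, d = map(int, s.split('/'))
    return '1' * n + '0' * (d - n)


# Assumption: sample rows built inline instead of pd.read_csv('filename.csv')
df = pd.DataFrame({
    'sample': ['sample1', 'sample2', 'sample3', 'sample4', 'sample5', 'sample6'],
    'column_A': ['6/6', '0/4', '2/6', '12/14', '15/21', '12/12'],
})

# apply returns a new Series, so assign it back to overwrite the column
df['column_A'] = df['column_A'].apply(to_binary)
print(df)
```

A lambda would work here too, but a named function is easier to test on its own before applying it to the column.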