I have the following pandas DataFrame.
import pandas as pd
df = pd.read_csv('filename.csv')
print(df)
    sample column_A
0  sample1      6/6
1  sample2      0/4
2  sample3      2/6
3  sample4    12/14
4  sample5    15/21
5  sample6    12/12
.. ....
The values in column_A are not fractions. Each one has to be expanded into a string of 0s and 1s (this is not about converting the integers into their binary representation).
The "numerator" above gives the total number of 1s, while the "denominator" gives the total number of 0s and 1s together.
So, the table should actually be in the following format:
    sample               column_A
0  sample1                 111111
1  sample2                   0000
2  sample3                 110000
3  sample4         11111111111100
4  sample5  111111111111111000000
5  sample6           111111111111
.. ....
I've never parsed an integer to output strings of 0s and 1s like this. How does one do this? Is there a "pandas method" to use with lambda expressions? Pythonic string parsing or regex?
First, suppose you write a function:
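def to_binary(s):
    # Split 'n/d' into its two parts: n is the number of 1s, d the total length.
    n_d = s.split('/')
    n, d = int(n_d[0]), int(n_d[1])
    # n ones followed by (d - n) zeros.
    return '1' * n + '0' * (d - n)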
So that,
>>> to_binary('4/5')
'11110'
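Now you just need to use pandas.Series.apply. It returns a new Series, so assign the result back if you want to overwrite the column:
df['column_A'] = df.column_A.apply(to_binary)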
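For reference, here is a minimal end-to-end sketch of the approach above. It builds the DataFrame inline from the six rows shown in the question instead of reading 'filename.csv' (an assumption made only so the snippet runs on its own), and it assumes every value in column_A is a well-formed 'n/d' string with n <= d.

```python
import pandas as pd


def to_binary(s):
    # Split 'n/d' into the number of 1s (n) and the total length (d).
    n_d = s.split('/')
    n, d = int(n_d[0]), int(n_d[1])
    # n ones followed by (d - n) zeros.
    return '1' * n + '0' * (d - n)


# Stand-in for pd.read_csv('filename.csv'): the rows from the question.
df = pd.DataFrame({
    'sample': ['sample1', 'sample2', 'sample3',
               'sample4', 'sample5', 'sample6'],
    'column_A': ['6/6', '0/4', '2/6', '12/14', '15/21', '12/12'],
})

# Apply the function to each value and overwrite the column.
df['column_A'] = df['column_A'].apply(to_binary)
print(df)
```

If the real file can contain missing or malformed entries, to_binary would raise on them, so some filtering or error handling before the apply call would probably be needed.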