I have a pandas dataframe which has two columns key and value, and the value always consists of a 8 digit number something like
>df1
key value
10 10000100
20 10000000
30 10100000
40 11110000
Now I need to take the value column and split it on the digits present, such that my result is a new data frame
>df_res
key 0 1 2 3 4 5 6 7
10 1 0 0 0 0 1 0 0
20 1 0 0 0 0 0 0 0
30 1 0 1 0 0 0 0 0
40 1 1 1 1 0 0 0 0
I cannot change the input data format, the most conventional thing I thought was to convert the value to a string and loop through each digit char and put it in a list, however am looking for something more elegant and faster, kindly help.
EDIT: The input is not in string, it is integer.
This should work:
df.value.astype(str).apply(list).apply(pd.Series).astype(int)
One approach could be -
arr = df.value.values.astype('S8')
df = pd.DataFrame(np.fromstring(arr, dtype=np.uint8).reshape(-1,8)-48)
Sample run -
In [58]: df
Out[58]:
key value
0 10 10000100
1 20 10000000
2 30 10100000
3 40 11110000
In [59]: arr = df.value.values.astype('S8')
In [60]: pd.DataFrame(np.fromstring(arr, dtype=np.uint8).reshape(-1,8)-48)
Out[60]:
0 1 2 3 4 5 6 7
0 1 0 0 0 0 1 0 0
1 1 0 0 0 0 0 0 0
2 1 0 1 0 0 0 0 0
3 1 1 1 1 0 0 0 0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With