I have a table pandas DF which looks like
Slave | start_addr0 | end_addr0 | start_addr1 | end_addr1 | start_addr2 | end_addr2 | |
---|---|---|---|---|---|---|---|
0 | 0 | 10000000 | 1FFFFFFF | NaN | NaN | NaN | NaN |
1 | 1 | 20000000 | 2007FFFF | 40000000 | 40005FFF | NaN | NaN |
2 | 1 | 20000000 | 2007FFFF | 20100000 | 201FFFFF | NaN | NaN |
3 | 2 | 20200000 | 202FFFFF | 20080000 | 20085FFF | 40006000 | 400FFFFF |
4 | 3 | 0 | 0FFFFFFF | NaN | NaN | NaN | NaN |
5 | 4 | 20300000 | 203FFFFF | NaN | NaN | NaN | NaN |
6 | 5 | 20400000 | 204FFFFF | NaN | NaN | NaN | NaN |
For each slave number I need to convert it to a list of ranges (tuples). For example,
Slave1_list = ( (20000000, 2007FFFF), (40000000, 40005FFF), (20100000, 201FFFFF))
The number of slaves (rows) and address-pairs (columns) can vary.
Thanks
EDIT:
Run the following code to load sample data into dataframe:
import pandas as pd
import io
f = io.StringIO('''Slave|start_addr0|end_addr0|start_addr1|end_addr1|start_addr2|end_addr2
0|10000000|1FFFFFFF|NaN|NaN|NaN|NaN
1|20000000|2007FFFF|40000000|40005FFF|NaN|NaN
1|20000000|2007FFFF|20100000|201FFFFF|NaN|NaN
2|20200000|202FFFFF|20080000|20085FFF|40006000|400FFFFF
3|0|0FFFFFFF|NaN|NaN|NaN|NaN
4|20300000|203FFFFF|NaN|NaN|NaN|NaN
5|20400000|204FFFFF|NaN|NaN|NaN|NaN
''')
df = pd.read_csv(f, sep='|', engine='python', index_col=None)
You can try:
One option via wide_to_long
:
df = df.reset_index()
result = pd.wide_to_long(df, stubnames=['start_addr', 'end_addr'], i=['index', 'Slave'], j='add_num', sep='').dropna(
).reset_index([0, -1], drop=True).apply(tuple, 1).groupby(level=0).agg(list)
An option via groupby
:
k = df.set_index('Slave').stack().reset_index()
result = k.groupby(k.index//2).agg({'Slave': 'first', 0 : tuple}).groupby('Slave').agg({0 : set})
Explanation:
df.set_index('Slave').stack().reset_index()
will remove the NaN
values and stack the dataframe.
k.groupby(k.index//2)
will group alternate rows and perform the required aggregations(tuples are formed in this step)
.groupby('Slave').agg({0 : set})
-> Last groupby is to capture the unique tuples for each slave.
OUTPUT:
0
Slave
0 {(10000000, 1FFFFFFF)}
1 {(40000000.0, 40005FFF), (20100000.0, 201FFFFF), (20000000, 2007FFFF)}
2 {(20080000.0, 20085FFF), (40006000.0, 400FFFFF), (20200000, 202FFFFF)}
3 {(0, 0FFFFFFF)}
4 {(20300000, 203FFFFF)}
5 {(20400000, 204FFFFF)}
NOTE: I'm assuming for every start_addr
there exists an end_addr
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With