I want to get interval margins of a column with pandas intervals and write them in columns 'left', 'right'. Iterrows does not work (documentation says it would not be use for writing data) and, anyway it would not be the better solution.
import pandas as pd
i1 = pd.Interval(left=85, right=94)
i2 = pd.Interval(left=95, right=104)
i3 = pd.Interval(left=105, right=114)
i4 = pd.Interval(left=115, right=124)
i5 = pd.Interval(left=125, right=134)
i6 = pd.Interval(left=135, right=144)
i7 = pd.Interval(left=145, right=154)
i8 = pd.Interval(left=155, right=164)
i9 = pd.Interval(left=165, right=174)
data = pd.DataFrame(
{
"intervals":[i1,i2,i3,i4,i5,i6,i7,i8,i9],
"left" :[0,0,0,0,0,0,0,0,0],
"right" :[0,0,0,0,0,0,0,0,0]
},
index=[0,1,2,3,4,5,6,7,8]
)
#this is not working (has no effect):
for index, row in data.iterrows():
print(row.intervals.left, row.intervals.right)
row.left = row.intervals.left
row.right = row.intervals.right
How can we do something like:
data['left']=data['intervals'].left
data['right']=data['intervals'].right
Thanks!
A pandas.arrays.IntervalArray, is the preferred way for storing interval data in Series -like structures. For @coldspeed's first example, IntervalArray is basically a drop in replacement:
Pandas provide 3 methods to handle white spaces(including New line) in any text data. As it can be seen in the name, str.lstrip() is used to remove spaces from the left side of string, str.rstrip() to remove spaces from right side of the string and str.strip() removes spaces from both sides.
Since you’re only interested to extract the five digits from the left, you may then apply the syntax of str [:5] to the ‘Identifier’ column: Once you run the Python code, you’ll get only the digits from the left: In this scenario, the goal is to get the five digits from the right:
As it can be seen in the name, str.lstrip () is used to remove spaces from the left side of string, str.rstrip () to remove spaces from right side of the string and str.strip () removes spaces from both sides.
Create an pandas.IntervalIndex
from your intervals. You can then access the .left
and .right
attributes.
import pandas as pd
idx = pd.IntervalIndex([i1, i2, i3, i4, i5, i6, i7, i8, i9])
pd.DataFrame({'intervals': idx, 'left': idx.left, 'right': idx.right})
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
2 (105, 114] 105 114
3 (115, 124] 115 124
4 (125, 134] 125 134
5 (135, 144] 135 144
6 (145, 154] 145 154
7 (155, 164] 155 164
8 (165, 174] 165 174
Another option is using map
and operator.attrgetter
(look ma, no lambda
...):
from operator import attrgetter
df['left'] = df['intervals'].map(attrgetter('left'))
df['right'] = df['intervals'].map(attrgetter('right'))
df
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
2 (105, 114] 105 114
3 (115, 124] 115 124
4 (125, 134] 125 134
5 (135, 144] 135 144
6 (145, 154] 145 154
7 (155, 164] 155 164
8 (165, 174] 165 174
A pandas.arrays.IntervalArray
, is the preferred way for storing interval data in Series
-like structures.
For @coldspeed's first example, IntervalArray
is basically a drop in replacement:
In [2]: pd.__version__
Out[2]: '1.1.3'
In [3]: ia = pd.arrays.IntervalArray([i1, i2, i3, i4, i5, i6, i7, i8, i9])
In [4]: df = pd.DataFrame({'intervals': ia, 'left': ia.left, 'right': ia.right})
In [5]: df
Out[5]:
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
2 (105, 114] 105 114
3 (115, 124] 115 124
4 (125, 134] 125 134
5 (135, 144] 135 144
6 (145, 154] 145 154
7 (155, 164] 155 164
8 (165, 174] 165 174
If you already have interval data in a Series
or DataFrame
, @coldspeed's second example becomes a bit more simple by accessing the array
attribute:
In [6]: df = pd.DataFrame({'intervals': ia})
In [7]: df['left'] = df['intervals'].array.left
In [8]: df['right'] = df['intervals'].array.right
In [9]: df
Out[9]:
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
2 (105, 114] 105 114
3 (115, 124] 115 124
4 (125, 134] 125 134
5 (135, 144] 135 144
6 (145, 154] 145 154
7 (155, 164] 155 164
8 (165, 174] 165 174
A simple way is to use apply() method:
data['left'] = data['intervals'].apply(lambda x: x.left)
data['right'] = data['intervals'].apply(lambda x: x.right)
data
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
...
8 (165, 174] 165 174
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With