
Drop row in Pandas Series and clean up index

I have a Pandas Series and, based on a random number, I want to pick a row (5 in the code example below) and drop it. Once the row is dropped, I want to create a new index for the remaining rows (0 to 8). This is the code I tried:

print 'Original series: ', sample_mean_series
print 'Length of original series', len(sample_mean_series)
sample_mean_series = sample_mean_series.drop([5],axis=0)
print 'Series with item 5 dropped: ', sample_mean_series
print 'Length of modified series:', len(sample_mean_series)
print sample_mean_series.reindex(range(len(sample_mean_series)))

And this is the output:

Original series:  
0    0.000074
1   -0.000067
2    0.000076
3   -0.000017
4   -0.000038
5   -0.000051
6    0.000125
7   -0.000108
8   -0.000009
9   -0.000052
Length of original series 10
Series with item 5 dropped:  
0    0.000074
1   -0.000067
2    0.000076
3   -0.000017
4   -0.000038
6    0.000125
7   -0.000108
8   -0.000009
9   -0.000052
Length of modified series: 9
0    0.000074
1   -0.000067
2    0.000076
3   -0.000017
4   -0.000038
5         NaN
6    0.000125
7   -0.000108
8   -0.000009

My problem is that the last row (index 9, value -0.000052) is dropped and a NaN row appears at index 5. I want to get rid of the "5 NaN" row and keep -0.000052, with the index running from 0 to 8. This is what I want it to look like:

0    0.000074
1   -0.000067
2    0.000076
3   -0.000017
4   -0.000038
5    0.000125
6   -0.000108
7   -0.000009
8   -0.000052
Jonas asked Jan 23 '13 19:01



2 Answers

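Somewhat confusingly, reindex does not mean "create a new index". It conforms the Series to the labels you pass, matching on the existing labels: label 5 no longer exists, so it comes back as NaN, and label 9 is not in range(9), so it is left out. To create a new index, just assign to the index attribute. So at your last step, do sample_mean_series.index = range(len(sample_mean_series)) instead of calling reindex.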
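
To make the difference concrete, here is a minimal sketch in Python 3 with a reasonably recent pandas. The ten values and the names s, reindexed and relabelled are made up for illustration; they are not the asker's data.

```python
import pandas as pd

# Illustrative data only: ten values with the default labels 0..9.
s = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0])
s = s.drop([5])  # remaining labels: 0..4 and 6..9

# reindex looks values up by their existing labels: label 5 is gone,
# so it becomes NaN, and label 9 is not requested, so it is left out.
reindexed = s.reindex(range(len(s)))

# Assigning to .index relabels the rows by position and keeps every value.
relabelled = s.copy()
relabelled.index = range(len(relabelled))

print(reindexed)
print(relabelled)
```

The copy is only there so both results can be compared side by side; in the original code the assignment can be done directly on sample_mean_series.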

BrenBarn answered Nov 20 '22 18:11



Here's a one-liner. Starting from this series:

In [1]: s
Out[1]:
0   -0.942184
1    0.397485
2   -0.656745
3    1.415797
4    1.123858
5   -1.890870
6    0.401715
7   -0.193306
8   -1.018140
9    0.262998

I use the Series.drop method to drop row 5 and then use reset_index to re-number the indices to be consecutive. Without using reset_index, the indices would jump from 4 to 6 with no 5.

By default, reset_index moves the original index into a column and returns a DataFrame holding it alongside the series values. Passing drop=True discards the old index instead, so the result is still a Series.

In [2]: s2 = s.drop([5]).reset_index(drop=True)

In [3]: s2
Out[3]:
0   -0.942184
1    0.397485
2   -0.656745
3    1.415797
4    1.123858
5    0.401715
6   -0.193306
7   -1.018140
8    0.262998
Name: 0
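
For comparison, a small sketch of what reset_index returns with and without drop=True. The four values and the variable names are illustrative only, and the "index" column name is what pandas uses when the Series index is unnamed.

```python
import pandas as pd

# Illustrative values only, with the default labels 0..3.
s = pd.Series([1.5, 2.5, 3.5, 4.5])
dropped = s.drop([1])  # remaining labels: 0, 2, 3

# Default: the old labels are kept as an "index" column of a DataFrame.
with_old_index = dropped.reset_index()

# drop=True: the old labels are discarded and the result stays a Series.
renumbered = dropped.reset_index(drop=True)

print(type(with_old_index).__name__, list(with_old_index["index"]))
print(type(renumbered).__name__, list(renumbered.index))
```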
Zelazny7 answered Nov 20 '22 19:11
