
How do you add a value to a float index of a dataframe for every other row?

I'm recording data at 2000 Hz, which means every 0.5 milliseconds I have another data point. But my recording software only records with 1 millisecond precision, so that means I have duplicate values in my dataframe index which uses type float.

So in order to fix the duplicates I want to add 0.0005 (0.5 milliseconds) to every other row of the index. I tried this, but so far it doesn't work:

c = df.iloc[:,0] # select the first column of the dataframe
c = c.iloc[::-1]  # reverse order so that time is increasing not decreasing
pd.set_option('float_format', '{:f}'.format) # change the print output to show the decimals (instead of 15.55567E9)
i = c.index # get the index of c - the length is 20000
rp = np.matlib.repmat([0, 0.0005], 1, 10000) # create an array repeating 0, 0.0005 so that we can add 0.0005 to every other row
df.set_index(c, i+rp).astype(float).applymap('{:,.4f}'.format) # set the index of c to i+rp - attempt to format to 4 decimals
print(c) # see if it worked

Expected output: (trimmed to save space - not showing all 20,000 rows)

1555677243.401000   4.569000
1555677243.401500   4.569000
1555677243.402000   4.571000
1555677243.402500   4.574000
1555677243.403000   4.574000
1555677243.403500   4.576000
1555677243.404000   4.577000
1555677243.404500   4.577000
1555677243.405000   4.577000
1555677243.405500   4.581000
1555677243.406000   4.581000
1555677243.406500   4.582000
1555677243.407000   4.581000
1555677243.407500   4.582000
1555677243.408000   4.580000
1555677243.408500   4.580000
1555677243.409000   4.582000
1555677243.409500   4.585000
1555677243.410000   4.585000
1555677243.410500   4.585000

Actual output: (notice duplicates in the index)

1555677243.401000   4.569000
1555677243.401000   4.569000
1555677243.402000   4.571000
1555677243.402000   4.574000
1555677243.403000   4.574000
1555677243.403000   4.576000
1555677243.404000   4.577000
1555677243.404000   4.577000
1555677243.405000   4.577000
1555677243.405000   4.581000
1555677243.406000   4.581000
1555677243.406000   4.582000
1555677243.407000   4.581000
1555677243.407000   4.582000
1555677243.408000   4.580000
1555677243.408000   4.580000
1555677243.409000   4.582000
1555677243.409000   4.585000
1555677243.410000   4.585000
1555677243.410000   4.585000

aguazul asked Apr 24 '19 23:04


3 Answers

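import pandas as pd

df = pd.DataFrame({'A': [1,2,3,4,5,6,7,8,9],
                   'B': [1.0,2,3,4,5,6,7,8,9]})  # float column, so adding 0.005 needs no dtype change

# add 0.005 to column B (position 1) on every other row, starting from the second row
df.iloc[1::2, 1] = df.iloc[1::2, :].eval('B + 0.005')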

    A     B
0   1   1.000
1   2   2.005
2   3   3.000
3   4   4.005
4   5   5.000
5   6   6.005
6   7   7.000
7   8   8.005
8   9   9.000

Just make sure that you're picking the correct column with the initial iloc. [1::2] selects every other row starting from position 1 (so 1, 3, etc.). You need to select all the columns in the second iloc because eval only works with DataFrames, not Series. Then you can set that column as the index, as you did in your code.
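
As a rough illustration of that last step, here is a minimal sketch that shifts every other row of a float time column by 0.5 ms and then promotes the column to the index. The column names "time" and "value" and the sample numbers are assumptions made up for this example, not taken from the question's data.

```python
import pandas as pd

# Hypothetical sample data: a float time column with each 1 ms stamp repeated.
df = pd.DataFrame({
    "time": [0.401, 0.401, 0.402, 0.402, 0.403, 0.403],
    "value": [4.569, 4.569, 4.571, 4.574, 4.574, 4.576],
})

# Add 0.5 ms to every other row of the time column, starting at position 1.
df.iloc[1::2, df.columns.get_loc("time")] += 0.0005

# Promote the adjusted column to the index.
result = df.set_index("time")
print(result.index.is_unique)
```

This assumes the rows strictly alternate between the first and second sample of each millisecond; a dropped sample would shift the pattern.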

Ben Pap answered Oct 13 '22 00:10


If I understand correctly (using the sample data from gmds's answer):

import numpy as np

df.index += np.arange(len(df)) % 2 * 0.0005  # offsets are 0, 0.0005, 0, 0.0005, ...
df
        0
0.0000  0
0.0015  1
0.0020  2
0.0035  3
0.0040  4
0.0055  5
0.0060  6
0.0075  7
0.0080  8
0.0095  9
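
To make the modulo trick more concrete, the sketch below builds the offset array explicitly and applies it to an index with duplicated timestamps similar to the ones in the question. The sample values and the names `stamps` and `offsets` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mimicking the question: each 1 ms stamp appears twice.
stamps = np.repeat(np.array([1555677243.401, 1555677243.402, 1555677243.403]), 2)
df = pd.DataFrame({"value": [4.569, 4.569, 4.571, 4.574, 4.574, 4.576]}, index=stamps)

# np.arange(len(df)) % 2 alternates 0, 1, 0, 1, ...; scaling by 0.0005
# gives the per-row offset in seconds.
offsets = (np.arange(len(df)) % 2) * 0.0005
df.index = df.index + offsets

print(df.index.is_unique)
```

Because the offsets depend only on row position, this works for any even or odd number of rows without hard-coding the length, unlike a fixed repeat count.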

BENY answered Oct 13 '22 00:10


You can pull the index out, convert it to a Series, modify it, and put it back in (Indexes are immutable):

import pandas as pd

df = pd.DataFrame(list(range(10)), index=[x/ 1000 for x in range(10)])

new_index = df.index.to_series()
new_index[::2] += 0.0005
result = df.set_index(new_index)
print(result)

Output:

        0
0.0005  0
0.0010  1
0.0025  2
0.0030  3
0.0045  4
0.0050  5
0.0065  6
0.0070  7
0.0085  8
0.0090  9
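
Note that the snippet above shifts the rows at even positions, while the expected output in the question shifts the second row of each duplicated pair. A possible adaptation, sketched here on made-up sample data, uses `.iloc` so the slice is positional; plain `[]` slicing with integer bounds on a float index can be interpreted as label-based, which is easy to trip over.

```python
import pandas as pd

# Hypothetical sample data: each 1 ms stamp appears twice, as in the question.
df = pd.DataFrame(
    {"value": [4.569, 4.569, 4.571, 4.574]},
    index=[0.401, 0.401, 0.402, 0.402],
)

new_index = df.index.to_series()
# .iloc slices by position, so this touches rows 1, 3, ... regardless of the labels.
new_index.iloc[1::2] += 0.0005
result = df.set_index(new_index)
print(result)
```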

gmds answered Oct 12 '22 22:10