
Replace NaN with nearest value in a series of non-numeric object?

I'm using Pandas and Numpy and I'm trying to replace all NaN values in a Series like this one:

date                    a
2017-04-24 01:00:00  [1,0,0]
2017-04-24 01:20:00  [1,0,0]
2017-04-24 01:40:00  NaN
2017-04-24 02:00:00  NaN
2017-04-24 02:20:00  [0,1,0]
2017-04-24 02:40:00  [1,0,0]
2017-04-24 03:00:00  NaN
2017-04-24 03:20:00  [0,0,1]
2017-04-24 03:40:00  NaN
2017-04-24 04:00:00  [1,0,0]

with the nearest object (a NumPy array in this case). The desired result is:

date                    a
2017-04-24 01:00:00  [1,0,0]
2017-04-24 01:20:00  [1,0,0]
2017-04-24 01:40:00  [1,0,0]
2017-04-24 02:00:00  [0,1,0]
2017-04-24 02:20:00  [0,1,0]
2017-04-24 02:40:00  [1,0,0]
2017-04-24 03:00:00  [1,0,0]
2017-04-24 03:20:00  [0,0,1]
2017-04-24 03:40:00  [0,0,1]
2017-04-24 04:00:00  [1,0,0]

Does someone know an efficient method to do it? Many thanks.

Alessandro asked Jul 18 '17 21:07



1 Answer

Drop the nulls, then fill the gaps back in with reindex, using method='nearest':

df.set_index('date').a.dropna().reindex(df.date, method='nearest').reset_index()

                 date          a
0 2017-04-24 01:00:00  [1, 0, 0]
1 2017-04-24 01:20:00  [1, 0, 0]
2 2017-04-24 01:40:00  [1, 0, 0]
3 2017-04-24 02:00:00  [0, 1, 0]
4 2017-04-24 02:20:00  [0, 1, 0]
5 2017-04-24 02:40:00  [1, 0, 0]
6 2017-04-24 03:00:00  [0, 0, 1]
7 2017-04-24 03:20:00  [0, 0, 1]
8 2017-04-24 03:40:00  [1, 0, 0]
9 2017-04-24 04:00:00  [1, 0, 0]
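
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The frame construction is my own assumption about how the data looks (plain Python lists stand in for the NumPy arrays, and the timestamps are rebuilt with date_range); only the dropna/reindex chain comes from the one-liner above.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data: ten timestamps, 20 minutes apart.
dates = pd.date_range("2017-04-24 01:00:00", periods=10, freq="20min")

# Plain lists stand in for the NumPy arrays; np.nan marks the missing rows.
values = [
    [1, 0, 0], [1, 0, 0], np.nan, np.nan, [0, 1, 0],
    [1, 0, 0], np.nan, [0, 0, 1], np.nan, [1, 0, 0],
]
df = pd.DataFrame({"date": dates, "a": values})

# Keep only the rows that hold a value, indexed by timestamp.
known = df.set_index("date")["a"].dropna()

# Reindex onto the full list of timestamps; each missing timestamp
# picks up the value at the nearest timestamp that is still present.
result = known.reindex(df["date"], method="nearest").reset_index()

print(result)
```

Note that reindex with a method needs a sorted (monotonic) index, which holds here. Also, pandas breaks tied distances by preferring the larger index value, which is why 03:00 and 03:40 take the later neighbour in the output above rather than the earlier one shown in the question's desired result.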
piRSquared answered Sep 29 '22 05:09