Removing time series with only zero values from a data frame

Question

I have a data frame with multiple time series identified by uniquer id's. I would like to remove any time series that have only 0 values.

The data frame looks as follows,

id   date          value
AAA  2010/01/01    9
AAA  2010/01/02    10
AAA  2010/01/03    8
AAA  2010/01/04    4
AAA  2010/01/05    12
B    2010/01/01    0
B    2010/01/02    0
B    2010/01/03    0
B    2010/01/04    0
B    2010/01/05    0
CCC  2010/01/01    45
CCC  2010/01/02    46
CCC  2010/01/03    0
CCC  2010/01/04    0
CCC  2010/01/05    40

I want any time series with only 0 values to be removed so that the data frame look as follows,

id   date          value
AAA  2010/01/01    9
AAA  2010/01/02    10
AAA  2010/01/03    8
AAA  2010/01/04    4
AAA  2010/01/05    12
CCC  2010/01/01    45
CCC  2010/01/02    46
CCC  2010/01/03    0
CCC  2010/01/04    0
CCC  2010/01/05    40

This is a follow up to a previous question that was answered with a really great solution using the data.tables package.

R efficiently removing missing values from the start and end of multiple time series in 1 data frame

Matt Dowle · Accepted Answer

If dat is a data.table, then this is easy to write and read :

dat[,.SD[any(value!=0)],by=id]

.SD stands for Subset of Data. This answer explains .SD very well.

Picking up on Gabor's nice use of ave, but without repeating the same variable name (DF) three times, which can be a source of typo bugs if you have a lot of long or similar variable names, try :

dat[ ave(value!=0,id,FUN=any) ]

The difference in speed between those two may be dependent on several factors including: i) number of groups ii) size of each group and iii) the number of columns in the real dat.

G. Grothendieck · Answer

Try this. No packages are used.

DF[ ave(DF$value != 0, DF$id, FUN = any), ]

Removing time series with only zero values from a data frame

Tags:

dataframe

r

data.table

time-series

sizeight

2 Answers

Matt Dowle

G. Grothendieck

Recent Activity

Donate For Us

Removing time series with only zero values from a data frame

Tags:

dataframe

r

data.table

time-series

sizeight

2 Answers

Matt Dowle

G. Grothendieck

Related questions

Recent Activity

Donate For Us