Find first non-zero occurrence in dataframe

Tags:

r

I have a time series of Sales by Account ID. To calculate average growth, I need to extract the first month with non-zero sales for each ID. Because accounts may have been established at different times, I need to identify, for each account, the first month in which Sales > 0.

The row index would be enough for me to pass to a function that calculates growth. So I expect the following results by Account ID:

54 - [1]
87 - [4]
95 - [2]

I tried `apply(df$Sales,2,match,x>0)` but this doesn't work.

Any pointers? Alternatively, is there an easier way to compute CAGR with this dataset?

Thanks in advance!

CalendarMonth   ID  Sales
8/1/2008    54  6692.60274
9/1/2008    54  6476.712329
10/1/2008   54  6692.60274
11/1/2008   54  6476.712329
12/1/2008   54  11098.60822
7/1/2008    87  0
8/1/2008    87  0
9/1/2008    87  0
10/1/2008   87  18617.94155
11/1/2008   87  18017.36279
12/1/2008   87  18617.94155
1/1/2009    87  18617.94155
2/1/2009    87  16816.20527
7/1/2008    95  0
8/1/2008    95  8015.956284
9/1/2008    95  0
10/1/2008   95  8015.956284
11/1/2008   95  6309.447514
12/1/2008   95  6519.762431
1/1/2009    95  6519.762431
asked Dec 09 '12 09:12 by user1100825


1 Answer

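Would this help?

tapply(df$Sales, df$ID, function(a) head(which(a > 0), 1))

Here df is your data frame above. For each ID, the result is the position, within that ID's rows, of the first Sales value greater than zero, so it assumes the rows are in chronological order within each ID.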
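As a quick check, the tapply call can be run on a rebuilt copy of the question's data. This is only a minimal sketch: the Sales values are rounded to two decimals for brevity, and the name first_idx is introduced here for illustration.

```r
# Rebuild the question's data (Sales rounded to two decimals).
# Rows are assumed to be in chronological order within each ID.
df <- data.frame(
  ID = c(rep(54, 5), rep(87, 8), rep(95, 7)),
  Sales = c(6692.60, 6476.71, 6692.60, 6476.71, 11098.61,
            0, 0, 0, 18617.94, 18017.36, 18617.94, 18617.94, 16816.21,
            0, 8015.96, 0, 8015.96, 6309.45, 6519.76, 6519.76)
)

# which(a > 0) lists the positions of positive sales within one ID;
# head(..., 1) keeps only the first of them.
first_idx <- tapply(df$Sales, df$ID, function(a) head(which(a > 0), 1))
print(first_idx)
```

The result is named by ID and should match the indices expected in the question (1, 4 and 2 for IDs 54, 87 and 95). If an ID never has positive sales, which() returns an empty vector for it and tapply falls back to returning a list rather than a simple vector.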

If you want the entire row and not just the index, this might help (it returns a list with one single-row data frame per ID):

lapply(unique(df$ID), function(a) head(subset(df, ID == a & Sales > 0), 1))
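The list returned by lapply can be stacked into one data frame, and the same first-non-zero idea can feed a growth calculation. The sketch below is an assumption-laden illustration, not a definitive CAGR method: growth_from_first_sale is a hypothetical helper, it treats growth as the compound rate per month between the first non-zero month and the last month, the Sales values are rounded, and the rows are assumed to be in chronological order within each ID.

```r
# Rebuild the question's data (Sales rounded to two decimals).
df <- data.frame(
  CalendarMonth = c("8/1/2008", "9/1/2008", "10/1/2008", "11/1/2008", "12/1/2008",
                    "7/1/2008", "8/1/2008", "9/1/2008", "10/1/2008", "11/1/2008",
                    "12/1/2008", "1/1/2009", "2/1/2009",
                    "7/1/2008", "8/1/2008", "9/1/2008", "10/1/2008", "11/1/2008",
                    "12/1/2008", "1/1/2009"),
  ID = c(rep(54, 5), rep(87, 8), rep(95, 7)),
  Sales = c(6692.60, 6476.71, 6692.60, 6476.71, 11098.61,
            0, 0, 0, 18617.94, 18017.36, 18617.94, 18617.94, 16816.21,
            0, 8015.96, 0, 8015.96, 6309.45, 6519.76, 6519.76)
)

# First non-zero row per ID, stacked into a single data frame.
first_rows <- do.call(
  rbind,
  lapply(unique(df$ID), function(a) head(subset(df, ID == a & Sales > 0), 1))
)
print(first_rows)

# Hypothetical helper: compound growth per month from the first
# non-zero sale to the last observation of one account.
growth_from_first_sale <- function(s) {
  i <- which(s > 0)
  if (length(i) == 0) return(NA_real_)  # account never had sales
  s <- s[i[1]:length(s)]                # drop the leading zero months
  n <- length(s) - 1                    # number of monthly steps
  if (n < 1) return(NA_real_)           # a single month has no growth rate
  (s[length(s)] / s[1])^(1 / n) - 1
}

growth <- tapply(df$Sales, df$ID, growth_from_first_sale)
print(growth)
```

To annualize under the same assumptions, the exponent 1 / n could be replaced with 12 / n. Only the first and last values enter the formula, so zero months in the middle of an account's history (as for ID 95) do not affect the result.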
answered Oct 14 '22 22:10 by A_K