
Difference between asfreq and resample

Tags: python, pandas

Can someone please explain the difference between the asfreq and resample methods in pandas? When should one use which?

asked Aug 05 '13 14:08 by psykeedelik



1 Answer

resample is more general than asfreq. For example, using resample I can pass an arbitrary function to perform binning over a Series or DataFrame object in bins of arbitrary size. asfreq is a concise way of changing the frequency of a DatetimeIndex object. It also provides padding functionality.
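To make the contrast concrete, here is a minimal sketch on made-up data (the series, the bin size and the helper `value_range` are all assumptions for illustration): resample reduces each bin with whatever function you give it, while asfreq only picks the values that sit on the new timestamps.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one value per minute for an hour.
idx = pd.date_range("2013-08-05 00:00", periods=60, freq="min")
ts = pd.Series(np.arange(60, dtype=float), index=idx)


def value_range(x):
    """Illustrative reducer: spread of the values inside one bin."""
    return x.max() - x.min()


# resample: group into 15-minute bins, then reduce each bin with the function.
binned = ts.resample("15min").apply(value_range)

# asfreq: keep only the values found at the 15-minute timestamps, no aggregation.
picked = ts.asfreq("15min")
```

Both results have four rows, but `binned` summarizes all 15 samples per bin and `picked` discards 14 of them.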

As the pandas documentation says, asfreq is a thin wrapper around a call to date_range + a call to reindex. See here for an example.
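A rough sketch of that equivalence, with made-up dates and values: build the target index with date_range, reindex onto it, and compare with what asfreq returns.

```python
import pandas as pd

# Hypothetical series observed every third business day.
idx = pd.to_datetime(["2010-01-01", "2010-01-06", "2010-01-11"])
ts = pd.Series([1.0, 2.0, 3.0], index=idx)

# Step 1: a regular business-day index spanning the original range.
new_index = pd.date_range(ts.index[0], ts.index[-1], freq="B")

# Step 2: conform the data to it; dates with no observation become NaN.
by_hand = ts.reindex(new_index)

# asfreq does both steps in one call.
via_asfreq = ts.asfreq("B")

same = by_hand.equals(via_asfreq)
```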

An example of resample that I use in my daily work is computing the number of spikes of a neuron in 1 second bins by resampling a large boolean array where True means "spike" and False means "no spike". I can do that as easily as large_bool.resample('S', how='sum'), or large_bool.resample('s').sum() in recent pandas versions, where the how argument no longer exists. Kind of neat!
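A small runnable version of the spike-count idea, written for recent pandas and using invented data (the 100 ms sampling step and the spike positions are assumptions, not the real recording):

```python
import numpy as np
import pandas as pd

# Hypothetical recording: 5 seconds sampled every 100 ms.
idx = pd.date_range("2013-08-05", periods=50, freq="100ms")
spikes = np.zeros(50, dtype=bool)
spikes[[3, 7, 12, 31, 32, 33, 49]] = True  # True means "spike"
large_bool = pd.Series(spikes, index=idx)

# Summing booleans counts the True values, so this is spikes per 1 second bin.
counts = large_bool.resample("1s").sum()
```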

asfreq can be used when you want to change a DatetimeIndex to have a different frequency while retaining the same values at the current index.
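For the padding side of asfreq, a short sketch with made-up values: the original observations stay where they were, and the `method` argument decides what goes into the newly created slots.

```python
import pandas as pd

# Hypothetical series observed every third business day.
idx = pd.to_datetime(["2010-01-01", "2010-01-06", "2010-01-11"])
ts = pd.Series([1.0, 2.0, 3.0], index=idx)

# New business-day slots are left as NaN by default.
daily = ts.asfreq("B")

# With method="ffill" each new slot repeats the last known value.
padded = ts.asfreq("B", method="ffill")

# The values at the original timestamps are unchanged in both results.
unchanged = daily.loc[ts.index].tolist() == ts.tolist()
```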

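Here's an example where they are equivalent (the session is from an older pandas version; in later versions datetools is gone and resample needs an explicit aggregation or .asfreq() call):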

    In [6]: dr = date_range('1/1/2010', periods=3, freq=3 * datetools.bday)

    In [7]: raw = randn(3)

    In [8]: ts = Series(raw, index=dr)

    In [9]: ts
    Out[9]:
    2010-01-01   -1.948
    2010-01-06    0.112
    2010-01-11   -0.117
    Freq: 3B, dtype: float64

    In [10]: ts.asfreq(datetools.BDay())
    Out[10]:
    2010-01-01   -1.948
    2010-01-04      NaN
    2010-01-05      NaN
    2010-01-06    0.112
    2010-01-07      NaN
    2010-01-08      NaN
    2010-01-11   -0.117
    Freq: B, dtype: float64

    In [11]: ts.resample(datetools.BDay())
    Out[11]:
    2010-01-01   -1.948
    2010-01-04      NaN
    2010-01-05      NaN
    2010-01-06    0.112
    2010-01-07      NaN
    2010-01-08      NaN
    2010-01-11   -0.117
    Freq: B, dtype: float64
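For anyone running a recent pandas, the same comparison could be sketched roughly as follows. The frequency strings "3B" and "B" stand in for the datetools objects, and the seeded random values are placeholders rather than the numbers shown in the session.

```python
import numpy as np
import pandas as pd

# Three observations, three business days apart.
dr = pd.date_range("1/1/2010", periods=3, freq="3B")
raw = np.random.default_rng(0).standard_normal(3)
ts = pd.Series(raw, index=dr)

# Direct frequency conversion.
converted = ts.asfreq("B")

# resample now returns a Resampler object; .asfreq() on it upsamples without aggregating.
upsampled = ts.resample("B").asfreq()

equivalent = converted.equals(upsampled)
```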

As far as when to use either: it depends on the problem you have in mind...care to share?

answered Oct 11 '22 14:10 by Phillip Cloud