This weekend I decided to try my hand at some Scala and Clojure. I'm proficient with object-oriented programming, so Scala was easy to pick up as a language, but I wanted to try out functional programming. This is where it got hard.
I just can't seem to get my head into a mode of writing functions. As an expert functional programmer, how do you approach a problem?
Given a list of values and a defined period of summation, how would you generate a new list of the simple moving average of the list?
For example: Given the list values (2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0), and the period 4, the function should return: (0.0, 0.0, 0.0, 4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)
After spending a day mulling it over, the best I could come up with in Scala was this:
def simpleMovingAverage(values: List[Double], period: Int): List[Double] = {
  (for (i <- 1 to values.length)
    yield if (i < period) 0.00 else values.slice(i - period, i).reduceLeft(_ + _) / period).toList
}
I know this is horribly inefficient; I'd much rather do something like:
where n < period: ma(n) = 0
where n = period: ma(n) = sum(value(1) to value(n)) / period
where n > period: ma(n) = ma(n - 1) - (value(n - period) / period) + (value(n) / period)
Now that would be easily done in an imperative style, but I can't for the life of me work out how to express that functionally.
Interesting problem. I can think of many solutions, with varying degrees of efficiency. Having to add stuff repeatedly isn't really a performance problem, but let's assume it is. Also, the zeroes at the beginning can be prepended later, so let's not worry about producing them. If the algorithm provides them naturally, fine; if not, we correct it later.
Starting with Scala 2.8, the following would give the result for n >= period by using sliding to get a sliding window of the List:
// sliding returns an Iterator, so convert to a List before prepending the zeros
def simpleMovingAverage(values: List[Double], period: Int): List[Double] =
  List.fill(period - 1)(0.0) ::: (values sliding period map (_.sum) map (_ / period)).toList
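To make the sliding-window idea concrete, here is a minimal sketch run on the question's data. The object name `SlidingSketch` and the `windows` and `averages` values are illustrative names, not part of the original answer.

```scala
object SlidingSketch {
  val values = List(2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0)
  val period = 4

  // sliding yields an Iterator of windows, so materialise it with toList
  val windows: List[List[Double]] = values.sliding(period).toList

  // One average per full window; the leading zeros are not produced here
  val averages: List[Double] = windows.map(_.sum / period)

  def main(args: Array[String]): Unit = {
    println(windows.take(2)) // List(List(2.0, 4.0, 7.0, 6.0), List(4.0, 7.0, 6.0, 3.0))
    println(averages)        // List(4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)
  }
}
```

Each window sums all of its elements from scratch, which is the repeated work discussed next.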
Nevertheless, although this is rather elegant, it doesn't have the best performance possible, because it doesn't take advantage of already computed additions. So, speaking of them, how can we get them?
Let's say we write this:
values sliding 2 map (_.sum)
We have the sum of each pair of adjacent elements. Let's try to use this result to compute the moving sum of 4 elements. The above formula made the following computation:
from d1, d2, d3, d4, d5, d6, ... to (d1+d2), (d2+d3), (d3+d4), (d4+d5), (d5+d6), ...
So if we take each element and add it to the second next element, we get the moving sum for 4 elements:
(d1+d2)+(d3+d4), (d2+d3)+(d4+d5), (d3+d4)+(d5+d6), ...
Taking res to be the previous result, materialised as a List, we may do it like this:
res zip (res drop 2) map Function.tupled(_+_)
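As a quick check of the pairing trick, the sketch below builds the pair sums for a short list and compares the combined result with summing windows of 4 directly. The names `PairSumSketch`, `fourSums` and `direct` are made up for the illustration.

```scala
object PairSumSketch {
  val values = List(2.0, 4.0, 7.0, 6.0, 3.0, 8.0)

  // Sums of adjacent pairs: (d1+d2), (d2+d3), ...
  // toList matters here: an Iterator cannot be zipped with a dropped copy of itself
  val res: List[Double] = values.sliding(2).map(_.sum).toList

  // Add each pair sum to the pair sum two positions later
  val fourSums: List[Double] = res.zip(res.drop(2)).map { case (a, b) => a + b }

  // Reference result: sum every window of 4 from scratch
  val direct: List[Double] = values.sliding(4).map(_.sum).toList

  def main(args: Array[String]): Unit = {
    println(res)      // List(6.0, 11.0, 13.0, 9.0, 11.0)
    println(fourSums) // List(19.0, 20.0, 24.0)
  }
}
```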
We could then compute the moving sum for 8 elements, and so on. Well, there is a well-known algorithm to compute things that follow such a pattern. It's best known for its use in computing the power of a number. It goes like this:
def power(n: Int, e: Int): Int = e match {
  case 0 => 1
  case 1 => n
  case 2 => n * n
  case odd if odd % 2 == 1 => power(n, (odd - 1)) * n
  case even => power(power(n, even / 2), 2)
}
So, let's apply it here:
def movingSum(values: List[Double], period: Int): List[Double] = period match {
  case 0 => throw new IllegalArgumentException
  case 1 => values
  // sliding returns an Iterator, so convert it to match the declared List result
  case 2 => (values sliding 2 map (_.sum)).toList
  case odd if odd % 2 == 1 =>
    values zip movingSum(values drop 1, (odd - 1)) map Function.tupled(_+_)
  case even =>
    val half = even / 2
    val partialResult = movingSum(values, half)
    partialResult zip (partialResult drop half) map Function.tupled(_+_)
}
So, here's the logic. Period 0 is invalid, period 1 is equal to the input, period 2 is sliding window of size 2. If greater than that, it may be even or odd.
If odd, we add each element to the movingSum of the next (odd - 1) elements. For example, if 3, we add each element to the movingSum of the next 2 elements.
If even, we compute the movingSum for n / 2, then add each element to the one n / 2 steps afterwards.
With that definition, we can then go back to the problem and do this:
def simpleMovingAverage(values: List[Double], period: Int): List[Double] = List.fill(period - 1)(0.0) ::: (movingSum(values, period) map (_ / period))
There's a slight inefficiency with regard to the use of :::, but it's O(period), not O(values.size). It can be made more efficient with a tail recursive function. And, of course, the definition of "sliding" I provided is horrendous performance-wise, but there will be a much better definition of it on Scala 2.8. Note that we can't make an efficient sliding method on a List, but we can do it on an Iterable.
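One way to gain confidence in the recursive definition is to compare it against the plain sliding-window sum for several periods. The sketch below restates movingSum in method-call syntax so that it is self-contained. The object name `MovingSumCheck`, the helper `naiveSum`, and rejecting negative periods are my additions, and it assumes the input has at least `period` elements.

```scala
object MovingSumCheck {
  // Recursive moving sum, halving the period where possible.
  // Assumes values.length >= period; shorter inputs are not handled here.
  def movingSum(values: List[Double], period: Int): List[Double] = period match {
    case p if p <= 0 => throw new IllegalArgumentException("period must be positive")
    case 1 => values
    case 2 => values.sliding(2).map(_.sum).toList
    case odd if odd % 2 == 1 =>
      // element i plus the (odd - 1)-sum starting at element i + 1
      values.zip(movingSum(values.drop(1), odd - 1)).map { case (a, b) => a + b }
    case even =>
      val half = even / 2
      val partial = movingSum(values, half)
      // half-sum at i plus half-sum at i + half
      partial.zip(partial.drop(half)).map { case (a, b) => a + b }
  }

  // Reference implementation: sum every window from scratch
  def naiveSum(values: List[Double], period: Int): List[Double] =
    values.sliding(period).map(_.sum).toList

  val values = List(2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0)

  def main(args: Array[String]): Unit = {
    for (p <- 1 to 6) println(s"$p: ${movingSum(values, p) == naiveSum(values, p)}")
  }
}
```

The equality check is exact here only because the sample values are small whole numbers. With arbitrary doubles the two versions add in a different order, so a tolerance-based comparison would be safer.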
Having said all that, I'd go with the very first definition, and optimize only if a critical path analysis pinpointed this as a big deal.
To conclude, let's consider how I went about the problem. We have a moving average problem. A moving average is the sum of a moving "window" on a list, divided by the size of that window. So, first, I try to get a sliding window, sum everything on it, and then divide by the size.
The next problem was to avoid repetition of already computed additions. In this case, I went to the smallest addition possible, and tried to figure out how to compute bigger sums reusing such results.
Finally, let's try to solve the problem the way you figured it, by adding and subtracting from the previous result. Getting the first average is easy:
def movingAverage(values: List[Double], period: Int): List[Double] = {
  val first = (values take period).sum / period
Now we make two lists. First, the list of elements to be subtracted. Next, the list of elements to be added:
  val subtract = values map (_ / period)
  val add = subtract drop period
We can combine these two lists by using zip. This method will only produce as many elements as the smaller list has, which avoids the problem of subtract being bigger than necessary:
val addAndSubtract = add zip subtract map Function.tupled(_ - _)
We finish by composing the result with a fold:
  val res = (addAndSubtract.foldLeft(first :: List.fill(period - 1)(0.0)) {
    (acc, add) => (add + acc.head) :: acc
  }).reverse
which is the answer to be returned. The whole function looks like this:
def movingAverage(values: List[Double], period: Int): List[Double] = {
  val first = (values take period).sum / period
  val subtract = values map (_ / period)
  val add = subtract drop period
  val addAndSubtract = add zip subtract map Function.tupled(_ - _)
  val res = (addAndSubtract.foldLeft(first :: List.fill(period - 1)(0.0)) {
    (acc, add) => (add + acc.head) :: acc
  }).reverse
  res
}
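A related way to reuse earlier additions, which is not the approach above but a common alternative, is to build running totals with scanLeft and subtract totals that are `period` positions apart. This is a minimal sketch under that assumption; the object name `PrefixSumSketch` and the padding rule for short inputs are my own choices.

```scala
object PrefixSumSketch {
  def movingAverage(values: List[Double], period: Int): List[Double] = {
    require(period > 0, "period must be positive")
    // prefix(i) is the sum of the first i values, so it has values.length + 1 entries
    val prefix = values.scanLeft(0.0)(_ + _)
    // Sum of the window ending at index i is prefix(i + 1) - prefix(i + 1 - period)
    val windowSums = prefix.drop(period).zip(prefix).map { case (hi, lo) => hi - lo }
    // Pad with zeros so the result has the same length as the input, as in the question
    List.fill(math.min(period - 1, values.length))(0.0) ::: windowSums.map(_ / period)
  }

  def main(args: Array[String]): Unit = {
    val values = List(2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0)
    println(movingAverage(values, 4))
    // List(0.0, 0.0, 0.0, 4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)
  }
}
```

Every element is added once and subtracted once, but running totals over long inputs can accumulate floating-point error, so this trades a little accuracy for speed.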
I know Clojure better than Scala, so here goes. As I write this the other Clojure entry here is imperative; that's not really what you're after (and isn't idiomatic Clojure). The first algorithm that comes to my mind is repeatedly taking the requested number of elements from the sequence, dropping the first element, and recurring.
The following works on any kind of sequence (vector or list, lazy or not) and gives a lazy sequence of averages, which could be helpful if you're working on a list of indefinite size. Note that it takes care of the base case by implicitly returning nil if there aren't enough elements in the list to consume.
(defn moving-average [values period]
  (let [first (take period values)]
    (if (= (count first) period)
      (lazy-seq
        (cons (/ (reduce + first) period)
              (moving-average (rest values) period))))))
Running this on your test data yields
user> (moving-average '(2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0) 4)
(4.75 5.0 6.0 7.25 8.0 8.25 6.5)
It doesn't give "0" for the first few elements in the sequence, though that could easily be handled (somewhat artificially).
The easiest thing of all is to see the pattern and be able to bring to mind an available function that fits the bill. partition gives a lazy view of portions of a sequence, which we can then map over:
(defn moving-average [values period]
  (map #(/ (reduce + %) period) (partition period 1 values)))
Someone asked for a tail recursive version; tail recursion vs. laziness is a bit of a tradeoff. When your job is building up a list then making your function tail recursive is usually pretty simple, and this is no exception: just build up the list as an argument to a subfunction. We'll accumulate to a vector instead of a list because otherwise the list will be built up backwards and will need to be reversed at the end.
(defn moving-average [values period]
  (loop [values values, period period, acc []]
    (let [first (take period values)]
      (if (= (count first) period)
        (recur (rest values) period (conj acc (/ (reduce + first) period)))
        acc))))
loop is a way to make an anonymous inner function (sort of like Scheme's named let); recur must be used in Clojure to eliminate tail calls. conj is a generalized cons, appending in the manner natural for the collection: the beginning of lists and the end of vectors.