
Amortized complexity in layman's terms?

Can someone explain amortized complexity in layman's terms? I've been having a hard time finding a precise definition online, and I don't know exactly how it relates to the analysis of algorithms. Anything useful, even an external reference, would be highly appreciated.

asked Feb 26 '13 by Bob John



2 Answers

Amortized complexity is the total expense of a sequence of operations, divided by the number of operations in the sequence -- that is, the average expense per operation, guaranteed over the whole sequence.

The idea is to guarantee the total expense of the entire sequence, while permitting individual operations to be much more expensive than the amortized cost.
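
In its simplest form (the so-called aggregate method), this can be written as:

    amortized cost per operation = T(n) / n

where T(n) is the worst-case total cost of any sequence of n operations.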

Example:
The behavior of C++ std::vector<>. When push_back() would grow the vector beyond its allocated capacity, the vector doubles its allocation.

So a single push_back() may take O(N) time to execute (as the contents of the array are copied to the new memory allocation).

However, because the size of the allocation was doubled, the next N-1 calls to push_back() will each take O(1) time to execute. So, the total of N operations will still take O(N) time; thereby giving push_back() an amortized cost of O(1) per operation.
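
To make the counting concrete, here is a small stand-alone simulation (my own sketch, not part of the original answer) of the doubling policy; it tallies how many element copies N pushes incur in total:

    #include <cstdio>
    #include <cstddef>

    // Simulate N push_backs under a doubling growth policy, counting the
    // element copies that the reallocations cost in total.
    int main() {
        const std::size_t N = 1000000;
        std::size_t capacity = 1, size = 0, copies = 0;
        for (std::size_t i = 0; i < N; ++i) {
            if (size == capacity) {   // buffer is full: reallocate
                copies += size;       // every element moves to the new buffer
                capacity *= 2;        // double the allocation
            }
            ++size;                   // the append itself is O(1)
        }
        // copies = 1 + 2 + 4 + ... < 2*N, so the cost per push is O(1).
        std::printf("%zu pushes cost %zu copies (%.2f per push)\n",
                    N, copies, (double)copies / N);
    }

Running it shows roughly one copy per push, which is exactly the amortized O(1) claim.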


Unless otherwise specified, amortized complexity is an asymptotic worst-case guarantee for any sequence of operations. This means:

  • Just as with non-amortized complexity, the big-O notation used for amortized complexity ignores both fixed initial overhead and constant performance factors. So, for the purpose of evaluating big-O amortized performance, you can generally assume that any sequence of amortized operations will be "long enough" to amortize away a fixed startup expense. Specifically, for the std::vector<> example, this is why you don't need to worry about whether you will actually encounter N additional operations: the asymptotic nature of the analysis already assumes that you will.

  • Besides arbitrary length, amortized analysis makes no assumptions about the sequence of operations whose cost you are measuring -- it is a worst-case guarantee over any possible sequence of operations. No matter how badly the operations are chosen (say, by a malicious adversary!), an amortized analysis guarantees that a long enough sequence of operations cannot cost more, in total, than the sum of their amortized costs. This is why (unless a qualifier specifically says otherwise) "probability" and "average case" are not relevant to amortized analysis -- any more than they are to an ordinary worst-case big-O analysis!

answered Sep 29 '22 by comingstorm


In an amortized analysis, the time required to perform a sequence of data-structure operations is averaged over all the operations performed... Amortized analysis differs from average-case analysis in that probability is not involved; an amortized analysis guarantees the average performance of each operation in the worst case.

(from Cormen et al., "Introduction to Algorithms")

That might be a bit confusing, since it says both that the time is averaged and that it's not an average-case analysis. So let me try to explain this with a financial analogy (indeed, "amortized" is a word most commonly associated with banking and accounting).

Suppose that you are operating a lottery. (Not buying a lottery ticket, which we'll get to in a moment, but operating the lottery itself.) You print 100,000 tickets, which you will sell for 1 currency unit each. One of those tickets will entitle the purchaser to 40,000 currency units.

Now, assuming you can sell all the tickets, you stand to earn 60,000 currency units: 100,000 currency units in sales, minus the 40,000 currency unit prize. For you, the value of each ticket is 0.60 currency units, amortized over all the tickets. This is a reliable value; you can bank on it. If you get tired of selling the tickets yourself, and someone comes along and offers to sell them for 0.30 currency units each, you know exactly where you stand.

For the lottery purchaser, the situation is different. The purchaser has an expected loss of 0.60 currency units when they purchase a lottery ticket. But that's probabilistic: the purchaser might buy ten lottery tickets every day for 30 years (a bit more than 100,000 tickets) without ever winning. Or they might spontaneously buy a single ticket one day, and win 39,999 currency units.
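
Spelling out the arithmetic behind the two viewpoints:

    operator (amortized; guaranteed over the whole sequence of sales):
        profit per ticket = (100,000 - 40,000) / 100,000 = 0.60

    purchaser (average case; probabilistic):
        expected loss per ticket = 1 - 40,000 / 100,000 = 0.60

The two numbers agree, but only the operator's is a guarantee.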

Applied to data structure analysis, we're talking about the first case, where we amortize the cost of some data structure operation (say, insert) over all the operations of that kind. Average-case analysis deals with the expected value of a stochastic operation (say, search), where we cannot compute the total cost of all the operations, but we can provide a probabilistic analysis of the expected cost of a single one.

It's often stated that amortized analysis applies to situations where a high-cost operation is rare, and that's often the case. But not always. Consider, for example, the so-called "banker's queue", which is a first-in-first-out (FIFO) queue made out of two stacks. (It's a classic functional data structure; you can build cheap LIFO stacks out of immutable singly-linked nodes, but cheap FIFOs are not so obvious.) The operations are implemented as follows:

put(x):
    Push x on the right-hand stack.

y = get():
    If the left-hand stack is empty:
        Pop each element off the right-hand stack and
        push it onto the left-hand stack. This effectively
        reverses the right-hand stack onto the left-hand stack.
    Pop and return the top element of the left-hand stack.

Now, I claim that the amortized cost of put and get is O(1), assuming that I start and end with an empty queue. The analysis is simple: I always put onto the right-hand stack, and get from the left-hand stack. So aside from the If clause, each put is a push, and each get is a pop, both of which are O(1).

I don't know how many times I will execute the If clause -- it depends on the pattern of puts and gets -- but I know that every element moves exactly once from the right-hand stack to the left-hand stack. So the total cost over the entire sequence of n puts and n gets is: n pushes, n pops, and n moves, where a move is a pop followed by a push. In other words, the 2n operations (n puts and n gets) result in 2n pushes and 2n pops. So the amortized cost of a single put or get is one push and one pop.
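
For concreteness, here is a minimal C++ sketch of the structure (my own illustration; the class and member names are invented for the example):

    #include <stack>
    #include <stdexcept>

    // Banker's queue: a FIFO built from two LIFO stacks.
    template <typename T>
    class BankersQueue {
        std::stack<T> in;   // right-hand stack: receives put()s
        std::stack<T> out;  // left-hand stack: serves get()s
    public:
        void put(const T& x) { in.push(x); }  // always O(1)

        T get() {
            if (out.empty()) {
                // Reverse the in-stack onto the out-stack. Each element
                // makes this move exactly once in its lifetime, which is
                // what makes get() amortized O(1).
                while (!in.empty()) {
                    out.push(in.top());
                    in.pop();
                }
            }
            if (out.empty()) throw std::runtime_error("queue is empty");
            T x = out.top();
            out.pop();
            return x;
        }
    };

The while loop is the If clause from the pseudocode: it runs rarely, but each element passes through it exactly once, which is where the amortized O(1) bound comes from.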

Note that banker's queues are called that precisely because of the amortized complexity analysis (and the association of the word "amortized" with finance). Banker's queues are the answer to what used to be a common interview question, although I think it's now considered too well-known: come up with a queue which implements the following three operations in amortized O(1) time (a sketch of one solution follows the list):

1) Get and remove the oldest element of the queue,

2) Put a new element onto the queue,

3) Find the value of the current maximum element.
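
One classic solution (sketched below with invented names; not part of the original answer) extends the two-stack queue by storing, alongside each element, the maximum of that element and everything beneath it in its stack. The queue's maximum is then the larger of the two stacks' tracked maxima:

    #include <algorithm>
    #include <stack>
    #include <utility>

    // FIFO queue supporting put(), get(), and max(), all amortized O(1).
    // Each stack entry is (value, max of that value and everything below it).
    class MaxQueue {
        std::stack<std::pair<int, int>> in, out;

        static void push_with_max(std::stack<std::pair<int, int>>& s, int x) {
            int m = s.empty() ? x : std::max(x, s.top().second);
            s.push({x, m});
        }
    public:
        void put(int x) { push_with_max(in, x); }

        // Precondition: the queue is not empty (kept minimal for the sketch).
        int get() {
            if (out.empty())
                while (!in.empty()) {  // same amortized move as before
                    push_with_max(out, in.top().first);
                    in.pop();
                }
            int x = out.top().first;
            out.pop();
            return x;
        }

        // Precondition: the queue is not empty.
        int max() const {
            if (in.empty())  return out.top().second;
            if (out.empty()) return in.top().second;
            return std::max(in.top().second, out.top().second);
        }
    };

Because each push records the running maximum of its own stack, max() is a plain O(1) lookup, and the amortized analysis of put() and get() is unchanged.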

answered Sep 29 '22 by rici