Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding why Floyd's tortoise and hare algorithm works when applied to an array of integers

I was trying to solve this leetcode problem https://leetcode.com/problems/find-the-duplicate-number/ using my own implementation of the tortoise and hare algorithm which resulted in an infinite loop when given the following array of integers:

[3,1,3,4,2]

Only after tracing through my algorithm was I able to see that the slow and fast runners never take on the two duplicate values at the same time. Here is my algorithm in pseudocode:

initialize fast and slow runners to 0

while(true)

   move fast runner two indices forward
   move slow runner one index forward

   if arr[fast] == arr[slow] and fast != slow
      return arr[fast] // this is the duplicate

Now, I'm sure someone who is skilled in discrete mathematics would have been able to intuitively know that this approach would not have lead to the correct solution without first having to trace through an example like I had to do.

What inferences or observations could I have made that would have lead me to see that this algorithm was not going to work? I'd like to know how one could intuitively identity a flaw in this logic through a series of logical statements. In other words, what's the explanation for why the two runners will never find the duplicates in this example? I feel like it may have something to do with counting, but I do not have a very strong background in discrete.

And to clarify, I have looked at the correct implementation so I do know what the correct way to solve it is. I just thought that this way would have worked too similar to applying it to linked lists, where you'd move the fast runner two nodes up and the slow runner one node up. Thank you for your help.

like image 866
krabinowitz Avatar asked Oct 27 '20 19:10

krabinowitz


People also ask

Why does Floyd's tortoise and hare algorithm work?

Floyd's cycle finding algorithm or Hare-Tortoise algorithm is a pointer algorithm that uses only two pointers, moving through the sequence at different speeds. This algorithm is used to find a loop in a linked list. It uses two pointers one moving twice as fast as the other one.

What is the time complexity of Floyd's cycle finding algorithm?

According to some online sources I referred the runtime complexity of Floyd's cycle detection algo is O(n).


1 Answers

Floyd's tortoise algorithm works when you're detecting a cycle in a linked list. It relies on the fact that if both pointers are moving at a different pace, the gap between them will keep on increasing to a limit, after which it'll be reset if a cycle exists.
In this case, the algorithm does find a cycle, since both pointers converge to the index 0 after some iterations. However, you're not looking to detect a cycle here; you're trying to find a duplicate. That's why this gets stuck in infinite recursion: it is meant to detect a cycle (which it correctly does), but not detect duplicates in its basic implementation.

To clarify, here's a sample linked list created on your sample array.

3 -> 1 -> 3 -> 4 -> 2
'--<----<----<----<-'

If you run Floyd's algorithm, you find that the cycle will get detected at index 0, since both pointers will converge there. It works by checking if fast and slow point to the same location and not if they have the same values of nodes (fast==slow isn't the same as fast.value==slow.value).

You are attempting to check duplicates by comparing the value on the nodes, and checking if the nodes don't point to the same location. That is actually the flaw, since Floyd's algorithm works to check if both pointers point to the same location in order to detect a cycle.
You can read this simple, informative proof to improve your intuition as to why the pointers will converge.

like image 185
Abhinav Mathur Avatar answered Sep 25 '22 12:09

Abhinav Mathur