Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How could a linked list achieve O(n log n) sorting time?

Tags:

I'm curious, in the first place, why std::list and std::forward_list include sorting functions as member functions, unlike every other standard library container. But what's more interesting to me is that both CPPReference and CPlusPlus claim that this sorting is done in O(n log n) time.

I can't even imagine how one would sort a container without random access to elements. So I threw together a test, using forward_list to make it as difficult as possible.

#include <chrono> #include <cstdint> #include <deque> #include <forward_list> #include <iostream> #include <random>  using std::endl; using namespace std::chrono;  typedef nanoseconds::rep length_of_time; constexpr int TEST_SIZE = 25000;   class Stopwatch {     public:         void start_timing();         void end_timing();         length_of_time get_elapsed_time() const;     private:         time_point<high_resolution_clock> start;         time_point<high_resolution_clock> end;         length_of_time elapsed_time = 0; };   void Stopwatch::start_timing() {     start = high_resolution_clock::now(); }   void Stopwatch::end_timing() {     end = high_resolution_clock::now();     auto elapsed = end - start;     auto elapsed_nanoseconds = duration_cast<nanoseconds>(elapsed);     elapsed_time = elapsed_nanoseconds.count(); }   length_of_time Stopwatch::get_elapsed_time() const {     return elapsed_time; }   std::mt19937_64 make_random_generator() {     using namespace std::chrono;     auto random_generator = std::mt19937_64();     auto current_time = high_resolution_clock::now();     auto nanos = duration_cast<nanoseconds>(             current_time.time_since_epoch()).count();     random_generator.seed(nanos);     return random_generator; }   int main() {     Stopwatch timer;     std::deque<length_of_time> times;     auto generator = make_random_generator();     for (int i = 1; i <= TEST_SIZE; i++) {         std::forward_list<uint64_t> container;         for (int j = 1; j <= i; j++) {             container.push_front(generator());         }         timer.start_timing();         container.sort();         timer.end_timing();         times.push_back(timer.get_elapsed_time());         container.clear();     }      for (const auto& time: times) {         std::cout << time << endl;     } } 

The numbers this program output gave the following graph:

forward list sorting time

Which does indeed look like O(n log n) growth (although the spikes at each third of the way across are interesting). How does the library do this? Perhaps copy to a container that supports sorting, sort that, and copy back?

like image 208
EMBLEM Avatar asked Apr 28 '15 03:04

EMBLEM


People also ask

What is the time complexity of sorting a linked list?

The slow random-access performance of a linked list makes other algorithms (such as quicksort) perform poorly, and others (such as heapsort) completely impossible. Since worst case time complexity of Merge Sort is O(nLogn) and Insertion sort is O(n^2), merge sort is preferred.

Can linked list be used for sorting?

Merge sort is often preferred for sorting a linked list. The slow random-access performance of a linked list makes some other algorithms (such as quicksort) perform poorly, and others (such as heapsort) completely impossible.

Which sorting method is best for linked list?

Generally speaking, merge sort is best suited for linked lists. This is due to the nature of the algorithm requiring less random access of memory. Quicksort can be fast but unreliable. Quicksort for arrays is a better option than for linked lists; the lookup times of arrays are faster than for linked lists.

What is the time complexity of inserting a node in a linked list?

The task is to insert the given elements at the middle position in the linked list one after another. Each insert operation should take O(1) time complexity.


2 Answers

Linked lists can be sorted in O(n log n) using Mergesort.

Interestingly, since linked lists already have the appropriate structure, sorting a linked list with Mergesort only requires O(1) extra space.

The fact that this requires a specialized algorithm specifically tuned for the list structure is also the reason sort is a member function of the list, rather than a separate function.


As for how it works - all you need is the merge operation. The merge operation takes two lists. You look at the heads of both lists, and remove the smallest head and append it to your result list. You keep doing this until all heads have been merged into the big list - done.

Here's a sample merge operation in C++:

struct Node {     Node* next;     int val; };  Node* merge(Node* a, Node* b) {     Node fake_head(nullptr, 0);      Node* cur = &fake_head;     while (a && b) {         if (a->val < b->val) { cur->next = a; a = a->next; }         else                 { cur->next = b; b = b->next; }         cur = cur->next;     }      cur->next = a ? a : b;     return fake_head.next; } 
like image 135
orlp Avatar answered Sep 28 '22 03:09

orlp


Example code of a bottom up merge sort using an array of pointers to lists where array[i] points to a list of size 2^i (except last pointer points to a list of unlimited size). This is how the HP / Microsoft standard template library implements std::list::sort.

#define NUMLISTS 32                     /* number of lists */  typedef struct NODE_{ struct NODE_ * next; int data;                               /* could be any comparable type */ }NODE;  NODE * MergeLists(NODE *, NODE *);  NODE * SortList(NODE *pList) { NODE * aList[NUMLISTS];                 /* array of lists */ NODE * pNode; NODE * pNext; int i;     if(pList == NULL)                   /* check for empty list */         return NULL;     for(i = 0; i < NUMLISTS; i++)       /* zero array */         aList[i] = NULL;     pNode = pList;                      /* merge nodes into aList[] */     while(pNode != NULL){         pNext = pNode->next;         pNode->next = NULL;         for(i = 0; (i < NUMLISTS) && (aList[i] != NULL); i++){             pNode = MergeLists(aList[i], pNode);             aList[i] = NULL;         }         if(i == NUMLISTS)             i--;         aList[i] = pNode;         pNode = pNext;     }     pNode = NULL;                       /* merge array into one list */     for(i = 0; i < NUMLISTS; i++)         pNode = MergeLists(aList[i], pNode);     return pNode; }  NODE * MergeLists(NODE *pSrc1, NODE *pSrc2) { NODE *pDst = NULL;                      /* destination head ptr */ NODE **ppDst = &pDst;                   /* ptr to head or prev->next */     while(1){         if(pSrc1 == NULL){             *ppDst = pSrc2;             break;         }         if(pSrc2 == NULL){             *ppDst = pSrc1;             break;         }         if(pSrc2->data < pSrc1->data){  /* if src2 < src1 */             *ppDst = pSrc2;             pSrc2 = *(ppDst = &(pSrc2->next));             continue;         } else {                        /* src1 <= src2 */             *ppDst = pSrc1;             pSrc1 = *(ppDst = &(pSrc1->next));             continue;         }     }     return pDst; } 

Another but slower way to merge sort a list is similar to a 4 tape sort (all sequential access). The initial list is split into two lists. Each list is considered to be a stream of runs, where the initial run size is 1. In this example, counters are used to keep track of run boundaries so it's a bit more complicated and slower than the array of pointers method. Runs from two input lists are merged, alternating between two output lists. After each merge pass, the run size is doubled, the direction of the merge is changed so what were output lists become input lists and vice versa. The sort is done when all runs end up on just one of the output lists. If stability is not required, then run boundaries could be defined as any node followed by an out of order node, and this would take advantage of natural ordering with the original list.

NODE * SortList(NODE * pList) { NODE *pSrc0; NODE *pSrc1; NODE *pDst0; NODE *pDst1; NODE **ppDst0; NODE **ppDst1; int cnt;      if(pList == NULL)                   /* check for null ptr */         return NULL;     if(pList->next == NULL)             /* if only one node return it */         return pList;     pDst0 = NULL;                       /* split list */     pDst1 = NULL;     ppDst0 = &pDst0;     ppDst1 = &pDst1;     while(1){         *ppDst0 = pList;         pList = *(ppDst0 = &pList->next);         if(pList == NULL)             break;         *ppDst1 = pList;         pList = *(ppDst1 = &pList->next);         if(pList == NULL)             break;     }     *ppDst0 = NULL;     *ppDst1 = NULL;     cnt = 1;                            /* init run size */     while(1){         pSrc0 = pDst0;                  /* swap merge direction */         pSrc1 = pDst1;         pDst0 = NULL;         pDst1 = NULL;         ppDst0 = &pDst0;         ppDst1 = &pDst1;         while(1){                       /* merge a set of runs */             if(MergeRuns(&ppDst0, &pSrc0, &pSrc1, cnt))                 break;             if(MergeRuns(&ppDst1, &pSrc0, &pSrc1, cnt))                 break;         }         cnt <<= 1;                      /* bump run size */         if(pDst1 == NULL)               /* break if done */             break;     }     return pDst0; }         int MergeRuns(NODE ***pppDst, NODE **ppSrc0, NODE **ppSrc1, int cnt) { NODE **ppDst = *pppDst; NODE *pSrc0  = *ppSrc0; NODE *pSrc1  = *ppSrc1; int cnt0, cnt1;      cnt0 = cnt;     cnt1 = cnt;     if(pSrc0 == NULL){                      /* if end data src0 */         *ppDst = NULL;         *pppDst = ppDst;         return(1);     }     if(pSrc1 == NULL){                      /* if end data src1 */         do{                                 /*   copy rest of src0 */             *ppDst = pSrc0;             pSrc0 = *(ppDst = &(pSrc0->next));         }while(pSrc0);         *ppDst = NULL;         *pppDst = ppDst;         return(1);     }     while(1){         if(pSrc1->data < pSrc0->data){      /* if src1 < src0 */             *ppDst = pSrc1;                 /*  move src1 */             pSrc1 = *(ppDst = &(pSrc1->next));             if(pSrc1 != NULL && --cnt1)     /*  if not end run1, continue */                 continue;             do{                             /*    copy run0 */                 *ppDst = pSrc0;                 pSrc0 = *(ppDst = &(pSrc0->next));             }while(pSrc0 != NULL && --cnt0);             break;         } else {                            /* else src0 <= src1 */             *ppDst = pSrc0;                 /*  move src0 */             pSrc0 = *(ppDst = &(pSrc0->next));             if(pSrc0 != NULL && --cnt0)     /*  if not end run0, continue */                 continue;             do{                             /*    copy run1 */                 *ppDst = pSrc1;                 pSrc1 = *(ppDst = &(pSrc1->next));             }while(pSrc1 != NULL && --cnt1);             break;         }     }     *ppSrc0 = pSrc0;                        /* update ptrs, return */     *ppSrc1 = pSrc1;     *ppDst  = NULL;     *pppDst = ppDst;     return(0); } 
like image 23
rcgldr Avatar answered Sep 28 '22 01:09

rcgldr