 

What is an appropriate sort algorithm for an embedded system?

I'm developing software for an embedded system and need to implement a sorting routine, but I'm having trouble choosing an optimal solution. My requirements are as follows:

  1. Because this is a very memory-limited system, space complexity is a primary factor.
  2. Because the number of elements to sort will generally be small, and the sorting will happen only occasionally, time complexity is not necessarily a primary factor.
  3. A stable algorithm is a requirement for my application.
  4. Because this is an embedded system, code size is a factor.
  5. There is no guarantee that the data will initially be in a nearly-sorted order.

I've considered the following algorithms:

  • bubble sort (yes, even though I'm ashamed to say it)
  • gnome sort
  • insertion sort
  • in-place merge sort (though it seems to me that this is better suited to linked lists than arrays?)

While the answer (for my exact circumstances) may very well be, "uh, duh, it doesn't really matter, use bubble sort for all we care", that answer is not very useful. In general, what sort algorithms are useful on embedded systems?

asked Sep 09 '11 by Mark



3 Answers

Insertion sort is fine too: it runs fast in practice, is stable, and is in-place. It's closely related to gnome sort; insertion sort is faster in practice, but gnome sort's code is a bit smaller and it uses less auxiliary space (not in terms of big O; the difference is in the constant).
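
For illustration, here's a minimal sketch of both (my own code, not from the question; the function names are hypothetical and plain int arrays are assumed). Both use strict comparisons, so equal elements are never moved past each other, which is what makes them stable.

#include <stddef.h>

/* insertion sort: save the current element, shift larger elements
   right, then drop the saved value into the gap; the strict '>'
   never moves equal elements past each other, so it's stable */
static void insertion_sort (int *a, size_t n)
{
    size_t i, j;
    int key;

    for (i = 1; i < n; i++) {
        key = a[i];
        for (j = i; j > 0 && a[j-1] > key; j--)
            a[j] = a[j-1];
        a[j] = key;
    }
}

/* gnome sort: the same idea expressed with a single index and
   plain swaps -- slightly less code and no saved key variable,
   at the cost of more element moves */
static void gnome_sort (int *a, size_t n)
{
    size_t i = 1;
    int tmp;

    while (i < n) {
        if (i == 0 || a[i-1] <= a[i]) {
            i++;
        } else {
            tmp = a[i];
            a[i] = a[i-1];
            a[i-1] = tmp;
            i--;
        }
    }
}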

edit: yeah, I messed up and got them reversed (I probably shouldn't answer questions before I've had my coffee). This answer previously said that insertion sort has less code and space than gnome sort.

answered Oct 26 '22 by harold


Don't be ashamed of bubble sort; it has its place. If your data set is small, it's easy to code up, and it's stable if you do it right (never swap equal elements).

It can also be blindingly fast if your data is mostly sorted, especially if you alternate directions on each pass (the bidirectional variant is sometimes called a cocktail or shaker sort). I know you said the data isn't initially near-sorted; I'm talking about the possibility that it becomes so if you sort and then persist it. In either case, if the data set is small, it doesn't really matter if it's totally unsorted.
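
As a minimal sketch of that alternating-direction idea (my own code, not part of the original answer; the function name is hypothetical), the strict '<' comparisons keep it stable:

#include <stddef.h>

/* bidirectional ("cocktail") bubble sort: each iteration bubbles
   the largest remaining element up to the top, then the smallest
   remaining element down to the bottom, so nearly-sorted input
   finishes in very few passes */
static void cocktail_sort (int *a, size_t n)
{
    size_t lo = 0, hi = n, i;
    int tmp, swapped = 1;

    while (swapped) {
        swapped = 0;
        for (i = lo + 1; i < hi; i++) {      /* forward pass */
            if (a[i] < a[i-1]) {
                tmp = a[i]; a[i] = a[i-1]; a[i-1] = tmp;
                swapped = 1;
            }
        }
        if (!swapped)
            break;
        hi--;
        swapped = 0;
        for (i = hi - 1; i > lo; i--) {      /* backward pass */
            if (a[i] < a[i-1]) {
                tmp = a[i]; a[i] = a[i-1]; a[i-1] = tmp;
                swapped = 1;
            }
        }
        lo++;
    }
}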

This is especially the case if, as you mention in a comment on another answer, your data set size is around 11. Short of an algorithm deliberately designed to be horrendous, any sort will handle such a size fast enough.

If your environment doesn't offer a stable sort, I'd just go with the bubble sort, given your constraints and properties.


In fact, using the following program along with the time utility, I found that the CPU times for qsort and bubble sort only start to diverge once the element count gets up to 10,000.

And, even then, the bubble sort took less than half a second. Unless you're going to be doing many sorts per second, that will be irrelevant.

#include <stdio.h>
#include <stdlib.h>

static int data[10000];
#define SZDATA (sizeof (*data))
#define NMDATA (sizeof (data) / sizeof (*data))

/* comparison function for qsort */
int compfn (const void *a, const void *b) {
    if (*((int*)a) > *((int*)b))
        return 1;
    if (*((int*)a) < *((int*)b))
        return -1;
    return 0;
}

int main (void) {
    int i, tmp, swapped, count;

    /* populate with a repeating, unsorted pattern */
    for (i = 0; i < NMDATA; i++)
        data[i] = (i * 3) % 11;

    if (0) {    /* change to if (1) to time qsort instead */
        qsort (data, NMDATA, SZDATA, compfn);
    } else {
        /* bubble sort: each pass bubbles the largest remaining
           element to the end; the strict '<' means equal elements
           are never swapped, keeping the sort stable */
        swapped = 1;
        count = NMDATA;
        while (swapped) {
            swapped = 0;
            for (i = 1; i < count; i++) {
                if (data[i] < data[i-1]) {
                    tmp = data[i];
                    data[i] = data[i-1];
                    data[i-1] = tmp;
                    swapped = 1;
                }
            }
            count --;
        }
    }

    /* uncomment to verify the array is sorted */
    //for (i = 0; i < NMDATA; i++)
        //printf ("%10d\n", data[i]);

    return 0;
}

By varying the size of the data array and flipping the if (0) selector, I got the following results (in milliseconds), five runs for each case:

Data size | qsort | bubble
----------+-------+-------
      100 |    61 |     76
          |    76 |     76
          |    77 |     61
          |    61 |     60
          |    61 |     61  avg qsort = 67, bubble = 67
          |       |

     1000 |    77 |     93
          |    61 |     45
          |    76 |     77
          |    77 |     76
          |    76 |     77  avg qsort = 73, bubble = 74
          |       |
    10000 |    92 |    414
          |    77 |    413
          |    61 |    413
          |    76 |    405
          |    61 |    421  avg qsort = 73, bubble = 413

I suspect the bubble sort is faster for small data sets because it avoids the per-comparison function calls that qsort makes.

The thing to take away from this is that, for smaller data sets, the efficiency of the algorithm often doesn't matter, since big-O behaviour only becomes relevant as data sets grow larger.

However, this test was done in my environment and yours may vary considerably. I would suggest performing the same measurements in your environment - implement a bubble sort and only consider moving to a more complex algorithm if and when it becomes necessary.
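
If the time utility isn't available on your target, one option is a rough in-program measurement with the standard clock() function. A sketch under that assumption (clock() availability and resolution vary by platform, so treat this as illustrative):

#include <stdio.h>
#include <time.h>

/* convert a clock() interval to milliseconds;
   resolution depends on CLOCKS_PER_SEC */
static double elapsed_ms (clock_t start, clock_t end)
{
    return 1000.0 * (double) (end - start) / CLOCKS_PER_SEC;
}

/* usage sketch, wrapped around the sort in main():
     clock_t t0 = clock ();
     ...run the sort...
     clock_t t1 = clock ();
     printf ("sort took %.1f ms\n", elapsed_ms (t0, t1));
*/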


At the suggestion of a commenter, I re-ran the tests using srand(42) followed by rand() to populate the array elements. In that case, the results were only slightly worse for the bubble sort, 642 vs 413 milliseconds for 10,000 elements and 82 vs 74 milliseconds for 1,000.
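
For reference, that re-run presumably amounts to replacing the patterned initialisation in main() with something like this (my reconstruction from the description above; <stdlib.h> is already included):

    /* replaces the (i * 3) % 11 initialisation */
    srand (42);
    for (i = 0; i < NMDATA; i++)
        data[i] = rand ();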

Given the constraints in the question (small element count, infrequent sorting, stability requirement, low space complexity), I still prefer the simplicity of bubble sort to any of the more complex algorithms.

Still, remember my earlier advice: you need to time this in your own environment. The results may be vastly different. You can use the code I've provided as a baseline for doing that. Seriously, if whatever method you choose takes less than a second, it's probably more than adequate for your purposes.

answered Oct 26 '22 by paxdiablo


Whether or not a system is embedded does not really matter. What matters are the very factors you list: code size constraints, available RAM, speed needed, and number of elements.

With what you have outlined, a bubble sort is a perfectly acceptable solution: it is small, predictable, easy on memory, and very easy to implement and debug. I saw a proof in the early 1980s concluding that bubble sort is time-optimal for n ≤ 11. It's possible that modern quicksorts alter that slightly, but I doubt the break-even point has moved much.

To make an unstable sort algorithm stable, add a sort key containing the original position. Refer to the secondary key only if the primary is a tie.
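
For example, a minimal sketch of that idea (the struct and function names are hypothetical) using the standard qsort, which by itself is not guaranteed to be stable:

#include <stdlib.h>

/* pair each value with its original position */
struct keyed {
    int value;   /* primary sort key */
    int pos;     /* original position: secondary key, breaks ties */
};

static int keyed_cmp (const void *a, const void *b)
{
    const struct keyed *ka = a;
    const struct keyed *kb = b;

    if (ka->value != kb->value)
        return (ka->value > kb->value) - (ka->value < kb->value);
    /* primary keys tie: fall back to original position, which
       preserves the input order of equal elements */
    return (ka->pos > kb->pos) - (ka->pos < kb->pos);
}

/* usage sketch: fill pos with 0..n-1 before sorting
     for (i = 0; i < n; i++) items[i].pos = i;
     qsort (items, n, sizeof *items, keyed_cmp);
*/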

answered Oct 26 '22 by wallyk