Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best algorithm to sort exams

Tags:

sorting

I am a grader for a statistics course and have a series of paper homework assignments given to me in random order. Part of my job is to alphabetize them. I have been using a method similar to quick-sort, but other graders have used different methods. I want an efficient sorting method, with justification, for when I have a "large" number of exams, with justification provided.. Here are some specifics I have leveraged:

  • I have a roster which contains an alphabetized list of all the names I should see.
  • I don't care to get the names more alphabetized than just the first letter. For example, I am fine if "Smith, John" comes before "Salk, Jonas" .
  • I will never have to sort more than 300 objects.

My method thus far has been to find the median last letter (ie: if there are 60 papers, pick the last name letter corresponding to the 30th person) of the class roster, treat it as a pivot point, and put all letters above the median in one pile, and all the letters below in another. If a letter is the same as the median, I place it in the median pile. I now do the same thing on the above/below-median piles. When the piles are small enough that there will only be three or four letters in a stack, I make one stack for each letter, then fold the stacks into a master stack, alphabetically.

Are there any algorithms specifically designed for alphabetizing, or something that is more efficient on average than my method? One method that seemed to do okay was to make a stack for each letter (26 piles, worst case), but this consumes so much space that it is not feasible for one desk.

like image 658
Christopher Aden Avatar asked Mar 16 '12 16:03

Christopher Aden


People also ask

Which algorithm is used for sorting?

While there are a large number of sorting algorithms, in practical implementations a few algorithms predominate. Insertion sort is widely used for small data sets, while for large data sets an asymptotically efficient sort is used, primarily heapsort, merge sort, or quicksort.

Which is the easiest sorting algorithm?

Bubble Sort is the simplest sorting algorithm that works by repeatedly swapping the adjacent elements if they are in the wrong order.

Which sorting algorithm is best if the list is already sorted?

Insertion Sort If the data is nearly sorted or when the list is small as it has a complexity of O(N2) and if the list is sorted a minimum number of elements will slide over to insert the element at its correct location. This algorithm is stable and it has fast running case when the list is nearly sorted.


2 Answers

This is a great question! We conducted a small experiment to come closer to an answer.

Our set-up consisted of

  • 3 sorters (A, B and C).

  • 3 stacks of 40 student problem sets (one for each sorter). The number of sheets of a problem set ranged from 1 to 5. The sheets were stapled and had student names written on the top of the first page.

  • 3 sorting algorithms to sort the stacks alphabetically:

    • Insertion: Take top item from unsorted pile and insert into correct position in sorted pile. Fanning out the sorted pile is allowed.
    • Bucket: Sort each item into one of five buckets (A-E, F-J, K-O, P-T, U-Z). Then sort each bucket using insertion sort. Combine sorted buckets.
    • Merge: Divide items into 10 piles. Sort each pile using insertion sort. Put 10 sorted piles into 5 pairs. Merge each pair by repeatedly looking at the top items of the pair and putting the alphabetically higher one on the bottom of the pair's resulting pile. After merging 10 piles into 5, merge 2 of the 5 piles, so that there are 4 piles left. Then, repeatedly merge pairwise until a single sorted pile remains.
  • Measurements:

    • Time until completion of the sorting algorithm.
    • The number of misplaced items (measured by other sorter).
  • The order of sorting algorithms was randomized.

  • Every new round the problem set stacks were exchanged among sorters and shuffled for several minutes.

  • Sorters A and B each did 9 rounds, sorter C did 3 rounds.

  • A sheet with the alphabet and bucket sort cutoffs was put up on each sorter's table.

Here is a picture of our set-up.

Experimental set-up (including sorters A, B and C and beautiful sunset)

And here are the results.

Experimental results

Two conclusions are immediate.

  1. The relatively complex merge sort algorithm preformed poorly. Merge sorts consistently took 57 to 125% longer than within sorter bucket / insertion sort averages with no obvious accuracy gains.

We speculate that the initial step of first dividing the stack of problem sets into 10 piles may contribute to merge sort's lackluster results. Future researchers may find that merge-like algorithms combined with more efficient set-up procedures are effective.

  1. Although bucket and insertion sort both performed well, bucket sort was 13 to 25% faster than insertion sort within sorters. This difference corresponds to roughly a minute of time saved for each 40-problem-set sort.

We speculate that the relative efficiency of bucket sort would improve as the number of problem sets to sort grows beyond 40 and that insertion sort would dominate for stacks of 30 or fewer, although more testing is needed. There were no clear differences in accuracy between bucket and insertion sorts.

Lastly, we note that there are important individual differences in sorting ability among our test subjects. Sorter B consistently outperformed sorters A and C by on average 39 and 101 seconds respectively. This suggests that although the sorting procedure employed is important to sorting speed, ability may explain at least as large a share of the variance in individual outcomes. Exploring what makes Germans such fantastic sorters is a promising area for future research.

like image 136
robust Avatar answered Sep 17 '22 14:09

robust


I was looking around at some websites that were talking about algorithms for humans to use, and one that I saw was doing a sort of insertion sort, where you put the one at hand into the pile by putting it directly where it's correct ordering should be.

The inefficiency of this would probably be from having to scan through the pile to find the location as the pile gets bigger, so I'm thinking that to adjust for this, you can add a tag or something that would act as an index for a specific alphabetical location. Since you don't care about the alphabetical ordering aside from the first letter, this would basically put your insertion cost at O(1)

This is just a thought I had while thinking about it, so I have no actually tried it myself, and am unable to say with regards to how effective it would be with sufficiently large piles. But I think that it should work fairly well, as the tags would give you instant access to the location you want to insert.

like image 36
MysticXG Avatar answered Sep 17 '22 14:09

MysticXG