Java mergesort, should the "merge" step be done with queues or arrays?

Tags:

java

algorithm

This is not homework: I don't have money for school, so I am teaching myself while working shifts at a tollbooth on the highway (long nights with few customers).

I was trying to implement a simple "mergesort" by thinking first (stretching my brain a little, if you like, for some actual learning) and only then looking at the solution in the manual I am using: "2008-08-21 | The Algorithm Design Manual | Springer | by Steven S. Skiena | ISBN-1848000693".

I came up with a solution that implements the "merge" step using an array as a buffer; I am pasting it below. The author uses queues, so I wonder:

  • Should queues be used instead?
  • What are the advantages of one method vs. the other? (Presumably his method is better, as he is a top algorist and I am a beginner, but I can't quite pinpoint its strengths. Help me, please.)
  • What are the tradeoffs/assumptions that governed his choice?

Here is my code (I am including my implementation of the splitting function as well for the sake of completeness but I think we are only reviewing the merge step here; I do not believe this is a Code Review post by the way as my questions are specific to just one method and about its performance in comparison to another):

package exercises;
public class MergeSort {
  // Merges the sorted runs values[leftStart, midPoint) and
  // values[midPoint, rightEnd) back into values, using a temporary buffer.
  // Assumes both runs are non-empty, which mergeSort guarantees.
  private static void merge(int[] values, int leftStart, int midPoint,
      int rightEnd) {
    int intervalSize = rightEnd - leftStart;
    int[] mergeSpace = new int[intervalSize];
    int nowMerging = 0;
    int pointLeft = leftStart;
    int pointRight = midPoint;
    // Take the smaller head element until one run is exhausted.
    do {
      if (values[pointLeft] <= values[pointRight]) {
        mergeSpace[nowMerging] = values[pointLeft];
        pointLeft++;
      } else {
        mergeSpace[nowMerging] = values[pointRight];
        pointRight++;
      }
      nowMerging++;
    } while (pointLeft < midPoint && pointRight < rightEnd);
    // Exactly one run still has elements: append them, then copy back.
    int fillFromPoint = pointLeft < midPoint ? pointLeft : pointRight;
    System.arraycopy(values, fillFromPoint, mergeSpace, nowMerging,
        intervalSize - nowMerging);
    System.arraycopy(mergeSpace, 0, values, leftStart, intervalSize);
  }
  public static void mergeSort(int[] values) {
    mergeSort(values, 0, values.length);
  }
  private static void mergeSort(int[] values, int start, int end) {
    int intervalSize = end - start;
    if (intervalSize < 2) {
      return;
    }
    boolean isIntervalSizeEven = intervalSize % 2 == 0;
    int splittingAdjustment = isIntervalSizeEven ? 0 : 1;
    int halfSize = intervalSize / 2;
    int leftStart = start;
    int rightEnd = end;
    int midPoint = start + halfSize + splittingAdjustment;
    mergeSort(values, leftStart, midPoint);
    mergeSort(values, midPoint, rightEnd);
    merge(values, leftStart, midPoint, rightEnd);
  }
}

Here is the reference solution from the textbook: (it's in C so I am adding the tag)

merge(item_type s[], int low, int middle, int high)
{
  int i; /* counter */
  queue buffer1, buffer2; /* buffers to hold elements for merging */
  init_queue(&buffer1);
  init_queue(&buffer2);
  for (i=low; i<=middle; i++) enqueue(&buffer1,s[i]);
  for (i=middle+1; i<=high; i++) enqueue(&buffer2,s[i]);
  i = low;
  while (!(empty_queue(&buffer1) || empty_queue(&buffer2))) {
    if (headq(&buffer1) <= headq(&buffer2))
      s[i++] = dequeue(&buffer1);
    else
      s[i++] = dequeue(&buffer2);
  }
  while (!empty_queue(&buffer1)) s[i++] = dequeue(&buffer1);
  while (!empty_queue(&buffer2)) s[i++] = dequeue(&buffer2);
}
Robottinosino asked Aug 21 '12 20:08

1 Answer

Abstractly, a queue is just some object that supports the enqueue, dequeue, peek, and is-empty operations. It can be implemented in many different ways (using a circular buffer, using linked lists, etc.).
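To make the "abstract object" point concrete, here is a minimal sketch of those four operations as a Java interface, backed by a fixed-capacity circular buffer. The names `IntQueue` and `CircularIntQueue` are made up for illustration; they are not from the book or from the JDK.

```java
import java.util.NoSuchElementException;

/** Hypothetical queue-of-ints abstraction: just the four operations named above. */
interface IntQueue {
    void enqueue(int value);
    int dequeue();
    int peek();
    boolean isEmpty();
}

/** One possible implementation (an assumption, not the book's): a circular buffer. */
final class CircularIntQueue implements IntQueue {
    private final int[] slots;
    private int head = 0; // index of the oldest stored element
    private int size = 0; // number of stored elements

    CircularIntQueue(int capacity) {
        slots = new int[capacity];
    }

    @Override
    public void enqueue(int value) {
        if (size == slots.length) {
            throw new IllegalStateException("queue is full");
        }
        // The tail position wraps around to the start of the array.
        slots[(head + size) % slots.length] = value;
        size++;
    }

    @Override
    public int dequeue() {
        int value = peek(); // throws if the queue is empty
        head = (head + 1) % slots.length;
        size--;
        return value;
    }

    @Override
    public int peek() {
        if (size == 0) {
            throw new NoSuchElementException("queue is empty");
        }
        return slots[head];
    }

    @Override
    public boolean isEmpty() {
        return size == 0;
    }
}
```

A linked-list version could implement the same interface, and code written against `IntQueue` would not need to change.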

Logically speaking, the merge algorithm is easiest to describe in terms of queues. You begin with two queues holding the values to merge together, then repeatedly apply peek, is-empty, and dequeue operations on those queues to reconstruct a single sorted sequence.
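As a rough illustration of that description, here is a sketch of the merge step in Java using `java.util.ArrayDeque` as the queue. The class name `QueueMergeSketch` is hypothetical, and the bounds follow the half-open convention of the question's code (`[leftStart, midPoint)` and `[midPoint, rightEnd)`) rather than the inclusive bounds in the book's C version.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class QueueMergeSketch {
    /** Merges sorted values[leftStart, midPoint) and values[midPoint, rightEnd) in place. */
    static void merge(int[] values, int leftStart, int midPoint, int rightEnd) {
        // Load each sorted half into its own queue.
        Queue<Integer> left = new ArrayDeque<>();
        Queue<Integer> right = new ArrayDeque<>();
        for (int i = leftStart; i < midPoint; i++) {
            left.add(values[i]);
        }
        for (int i = midPoint; i < rightEnd; i++) {
            right.add(values[i]);
        }

        // Repeatedly move the smaller head element back into the array.
        int out = leftStart;
        while (!left.isEmpty() && !right.isEmpty()) {
            // Using <= prefers the left half on ties, which keeps the merge stable.
            if (left.peek() <= right.peek()) {
                values[out++] = left.remove();
            } else {
                values[out++] = right.remove();
            }
        }

        // At most one of these loops does any work: drain whichever queue is left.
        while (!left.isEmpty()) {
            values[out++] = left.remove();
        }
        while (!right.isEmpty()) {
            values[out++] = right.remove();
        }
    }
}
```

Note that `Queue<Integer>` boxes every element, which is one place where the queue version may cost more than a plain `int[]` buffer.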

In your implementation using arrays, you are effectively doing the same thing as if you were using queues. You have just chosen to implement those queues using arrays. Neither approach is necessarily "better" or "worse" than the other. Using queues makes the high-level operation of the merge algorithm clearer, but might introduce some inefficiency (though it's hard to say for certain without benchmarking). Using arrays might be slightly more efficient (again, you should test this!), but might obscure the high-level operation of the algorithm. From Skiena's point of view, using queues might be better because it makes the high-level structure of the algorithm clear. From your point of view, arrays might be better because of the performance concerns.
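One way to see that equivalence is to wrap the index bookkeeping in a tiny queue-like view over the array. The sketch below is only an illustration: `SliceQueue` and `ArrayAsQueueMerge` are invented names, and the merge uses the same temporary buffer and half-open bounds as the code in the question.

```java
/** Hypothetical read-only queue view over array[start, end); no elements are copied. */
final class SliceQueue {
    private final int[] array;
    private final int end;
    private int head; // plays the role of pointLeft / pointRight in the question

    SliceQueue(int[] array, int start, int end) {
        this.array = array;
        this.head = start;
        this.end = end;
    }

    boolean isEmpty() {
        return head >= end;
    }

    int peek() {
        return array[head];
    }

    int dequeue() {
        return array[head++];
    }
}

final class ArrayAsQueueMerge {
    /** Merges sorted values[leftStart, midPoint) and values[midPoint, rightEnd). */
    static void merge(int[] values, int leftStart, int midPoint, int rightEnd) {
        SliceQueue left = new SliceQueue(values, leftStart, midPoint);
        SliceQueue right = new SliceQueue(values, midPoint, rightEnd);
        int[] mergeSpace = new int[rightEnd - leftStart];
        int out = 0;

        // Same peek / is-empty / dequeue loop as the queue formulation.
        while (!left.isEmpty() && !right.isEmpty()) {
            mergeSpace[out++] = left.peek() <= right.peek()
                    ? left.dequeue()
                    : right.dequeue();
        }
        while (!left.isEmpty()) {
            mergeSpace[out++] = left.dequeue();
        }
        while (!right.isEmpty()) {
            mergeSpace[out++] = right.dequeue();
        }

        // values is only read until this point, so the views stay valid.
        System.arraycopy(mergeSpace, 0, values, leftStart, mergeSpace.length);
    }
}
```

Read this way, `pointLeft` and `pointRight` in the question are the head pointers of two array-backed queues, and the comparison loop is the queue merge with the method calls written out by hand.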

Hope this helps!

templatetypedef answered Oct 23 '22 14:10