Using
StringComparer comparer1 = StringComparer.Ordinal;
instead of
IComparable v IComparable w comparer1.Compare(v, w)
Solved the runtime issue.
I've done some Benchmarks on sorting algorithms (e.g. Quicksort, Mergesort) in Java and C#.
I used Java 7 and the .NET Framework 4.5 for implementing and executing my algorithms. It showed that all Algorithms could achieve better runtimes using Java.
Some example runtimes for Quicksort:
C#
Java
Then I've done measures using profiling tools: C# used 75% of the runtime for String comparisons (i.e. CompareTo), whereas Java only used 2% of the runtime for comparisons.
Why are String comparisons so expensive in C# and not in Java?
EDIT: I've also tested the C# sort function Arrays.sort(INPUT) it could achieve about 3000 ms for n=1000000, so i don't think the code is the problem. But anyway:
Here's the code for Quicksort:
public class Quicksort { public static void sort(IComparable[] a) { sort(a, 0, a.Length - 1); } private static void sort(IComparable[] a, int lo, int hi) { if (hi <= lo) return; int j = partition(a, lo, hi); sort(a, lo, j-1); sort(a, j+1, hi); } private static int partition(IComparable[] a, int lo, int hi) { int i = lo; int j = hi + 1; IComparable v = a[lo]; while (true) { while (less(a[++i], v)) if (i == hi) break; while (less(v, a[--j])) if (j == lo) break; if (i >= j) break; exch(a, i, j); } exch(a, lo, j); return j; } public static IComparable select(IComparable[] a, int k) { if (k < 0 || k >= a.Length) { throw new Exception("Selected element out of bounds"); } Rnd.Shuffle(a); int lo = 0, hi = a.Length - 1; while (hi > lo) { int i = partition(a, lo, hi); if (i > k) hi = i - 1; else if (i < k) lo = i + 1; else return a[i]; } return a[lo]; } private static bool less(IComparable v, IComparable w) { return (v.CompareTo(w) < 0); } private static void exch(Object[] a, int i, int j) { Object swap = a[i]; a[i] = a[j]; a[j] = swap; } }
Quicksort is measured then as follows:
Stopwatch.Restart(); Quicksort.sort(stringArray); Stopwatch.Stop();
EDIT2: Somebody wanted to see the Java version. It's exactly the same i just use Comparable instead of IComparable and Array.length instead of Array.Length
The compareTo() method compares two strings lexicographically. The comparison is based on the Unicode value of each character in the strings. The method returns 0 if the string is equal to the other string.
BTW SSE makes string compare a LOT faster than one-character-at-a-time, but it can never reach the speed of an integer compare. ...whereas int comparisons can be done in significantly fewer instructions.
Using String. equals() :In Java, string equals() method compares the two given strings based on the data/content of the string. If all the contents of both the strings are same then it returns true. If any character does not match, then it returns false.
compareTo() method compares two Integer objects numerically. This method returns the value 0 if this Integer is equal to the argument Integer, a value less than 0 if this Integer is numerically less than the argument Integer and a value greater than 0 if this Integer is numerically greater than the argument Integer.
so i don't think the code is the problem
I beg to differ. You're not comparing the same things in both cases. Admittedly it doesn't help that you also haven't shown us how you generate the strings etc, but Java and .NET perform different default string comparisons.
From Java's String.compareTo
:
Compares two strings lexicographically.
And from .NET's String.CompareTo
:
This method performs a word (case-sensitive and culture-sensitive) comparison using the current culture.
It's no surprise that this is slower than doing a purely lexicographic comparison.
It's easy to see this when just comparing .NET with itself...
using System; using System.Diagnostics; using System.Text; class Test { static void Main() { string[] source = GenerateRandomStrings(); string[] workingSpace = new string[source.Length]; CopyAndSort(source, workingSpace); Stopwatch sw = Stopwatch.StartNew(); for (int i = 0; i < 1000; i++) { CopyAndSort(source, workingSpace); } sw.Stop(); Console.WriteLine("Elapsed time: {0}ms", (long) sw.Elapsed.TotalMilliseconds); } static string[] GenerateRandomStrings() { Random rng = new Random(); string[] ret = new string[10000]; for (int i = 0; i < ret.Length; i++) { ret[i] = GenerateRandomString(rng); } return ret; } static string GenerateRandomString(Random rng) { char[] chars = new char[30]; for (int i = 0; i < chars.Length; i++) { chars[i] = (char) rng.Next('A', 'z' + 1); } return new string(chars); } static void CopyAndSort(string[] original, string[] workingSpace) { Array.Copy(original, 0, workingSpace, 0, original.Length); Array.Sort(workingSpace); // Array.Sort(workingSpace, StringComparer.Ordinal); } }
Try it yourself, varying the CopyAndSort
method based on whether you specify an ordinal string comparer or not. It's much faster with the ordinal comparer, at least on my box.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With