Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are String comparisons (CompareTo) faster in Java than in C#?

Tags:

EDIT3:

Using

StringComparer comparer1 = StringComparer.Ordinal; 

instead of

IComparable v IComparable w comparer1.Compare(v, w) 

Solved the runtime issue.


I've done some Benchmarks on sorting algorithms (e.g. Quicksort, Mergesort) in Java and C#.

I used Java 7 and the .NET Framework 4.5 for implementing and executing my algorithms. It showed that all Algorithms could achieve better runtimes using Java.

Some example runtimes for Quicksort:

C#

  1. n=1000000 4433 ms
  2. n=2000000 10047 ms

Java

  1. n=1000000 1311 ms
  2. n=2000000 3164 ms

Then I've done measures using profiling tools: C# used 75% of the runtime for String comparisons (i.e. CompareTo), whereas Java only used 2% of the runtime for comparisons.

Why are String comparisons so expensive in C# and not in Java?

EDIT: I've also tested the C# sort function Arrays.sort(INPUT) it could achieve about 3000 ms for n=1000000, so i don't think the code is the problem. But anyway:

Here's the code for Quicksort:

public class Quicksort {  public static void sort(IComparable[] a) {     sort(a, 0, a.Length - 1); }  private static void sort(IComparable[] a, int lo, int hi) {      if (hi <= lo) return;     int j = partition(a, lo, hi);     sort(a, lo, j-1);     sort(a, j+1, hi); }  private static int partition(IComparable[] a, int lo, int hi) {     int i = lo;     int j = hi + 1;     IComparable v = a[lo];     while (true) {           while (less(a[++i], v))             if (i == hi) break;          while (less(v, a[--j]))             if (j == lo) break;            if (i >= j) break;          exch(a, i, j);     }      exch(a, lo, j);      return j; }   public static IComparable select(IComparable[] a, int k) {     if (k < 0 || k >= a.Length) {         throw new Exception("Selected element out of bounds");     }     Rnd.Shuffle(a);     int lo = 0, hi = a.Length - 1;     while (hi > lo) {         int i = partition(a, lo, hi);         if      (i > k) hi = i - 1;         else if (i < k) lo = i + 1;         else return a[i];     }     return a[lo]; }   private static bool less(IComparable v, IComparable w) {     return (v.CompareTo(w) < 0); }      private static void exch(Object[] a, int i, int j) {     Object swap = a[i];     a[i] = a[j];     a[j] = swap; }  } 

Quicksort is measured then as follows:

Stopwatch.Restart(); Quicksort.sort(stringArray); Stopwatch.Stop(); 

EDIT2: Somebody wanted to see the Java version. It's exactly the same i just use Comparable instead of IComparable and Array.length instead of Array.Length

like image 783
Mad A. Avatar asked Mar 05 '13 00:03

Mad A.


People also ask

Why we use compareTo method in Java?

The compareTo() method compares two strings lexicographically. The comparison is based on the Unicode value of each character in the strings. The method returns 0 if the string is equal to the other string.

Is string comparison faster than integer comparison?

BTW SSE makes string compare a LOT faster than one-character-at-a-time, but it can never reach the speed of an integer compare. ...whereas int comparisons can be done in significantly fewer instructions.

What happens when you compare strings in Java?

Using String. equals() :In Java, string equals() method compares the two given strings based on the data/content of the string. If all the contents of both the strings are same then it returns true. If any character does not match, then it returns false.

How does compareTo work with numbers in Java?

compareTo() method compares two Integer objects numerically. This method returns the value 0 if this Integer is equal to the argument Integer, a value less than 0 if this Integer is numerically less than the argument Integer and a value greater than 0 if this Integer is numerically greater than the argument Integer.


1 Answers

so i don't think the code is the problem

I beg to differ. You're not comparing the same things in both cases. Admittedly it doesn't help that you also haven't shown us how you generate the strings etc, but Java and .NET perform different default string comparisons.

From Java's String.compareTo:

Compares two strings lexicographically.

And from .NET's String.CompareTo:

This method performs a word (case-sensitive and culture-sensitive) comparison using the current culture.

It's no surprise that this is slower than doing a purely lexicographic comparison.

It's easy to see this when just comparing .NET with itself...

using System; using System.Diagnostics; using System.Text;  class Test {     static void Main()     {         string[] source = GenerateRandomStrings();         string[] workingSpace = new string[source.Length];          CopyAndSort(source, workingSpace);         Stopwatch sw = Stopwatch.StartNew();         for (int i = 0; i < 1000; i++)         {             CopyAndSort(source, workingSpace);         }         sw.Stop();         Console.WriteLine("Elapsed time: {0}ms",                            (long) sw.Elapsed.TotalMilliseconds);     }      static string[] GenerateRandomStrings()     {         Random rng = new Random();         string[] ret = new string[10000];         for (int i = 0; i < ret.Length; i++)         {             ret[i] = GenerateRandomString(rng);         }         return ret;     }      static string GenerateRandomString(Random rng)     {         char[] chars = new char[30];         for (int i = 0; i < chars.Length; i++)         {             chars[i] = (char) rng.Next('A', 'z' + 1);         }         return new string(chars);     }      static void CopyAndSort(string[] original, string[] workingSpace)     {         Array.Copy(original, 0, workingSpace, 0, original.Length);         Array.Sort(workingSpace);         // Array.Sort(workingSpace, StringComparer.Ordinal);     } } 

Try it yourself, varying the CopyAndSort method based on whether you specify an ordinal string comparer or not. It's much faster with the ordinal comparer, at least on my box.

like image 84
Jon Skeet Avatar answered Sep 27 '22 18:09

Jon Skeet