Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What constitutes 'array access' in the context of algorithms?

Below is an LSD Radix sort implementation in Java from a textbook to sort an array of strings where each string contains exactly W characters.

I want to count the number of array accesses during runtime. I've read that LSD sort is supposed to require n * c array accesses where n is the number of strings and c the amount of characters in each string. However, the algorithm below accesses more than one array several times. If I increment a counter at each of these I'll end up with a significant factor of nc.

So what exactly constitutes 'array access' in the context of algorithms? Is there only one instance of array access that is considered more significant that I should count here, or is this example in fact an inefficient implementation that uses more array access than necessary?

 public int lsdSort(String[] array, int W) {
  int access = 0;
  // Sort a[] on leading W characters.
  int N = array.length;
  String[] aux = new String[N];
  for (int d = W-1; d >= 0; d--)
  { // Sort by key-indexed counting on dth char.
    int[] count = new int[R+1]; // Compute frequency counts.
    for (int i = 0; i < N; i++) {
        count[array[i].charAt(d) + 1]++;
    }
    for (int r = 0; r < R; r++) {
        // Transform counts to indices.
        count[r+1] += count[r];
    }
    for (int i = 0; i < N; i++) {
        // Distribute.
        aux[count[array[i].charAt(d)]++] = array[i];

    }  
    for (int i = 0; i < N; i++) // Copy back.
        array[i] = aux[i];
  }

  return access;
  }
like image 787
Lawyerson Avatar asked Dec 28 '22 12:12

Lawyerson


1 Answers

I've read that LSD sort is supposed to require n times c array accesses where n is the number of strings and c the amount of characters in each string.

Are you sure you haven't read that it's O(nc)? That's not the same thing at all. It's big-O notation. The point isn't to determine the exact number of array accesses - it's to talk about how it grows (or rather, the limit of how it can grow) as n or c increase. In this case it increases linearly - if you increase n by a factor of 1000, you would only expect the total cost to grow by a factor of 1000 too... whereas if it were an O(n2c) algorithm instead, it might grow by a factor of 1,000,000. (Strictly speaking any O(nc) algorithm is also O(n2c) due to it only being a limit, but let's not get into that.)

like image 137
Jon Skeet Avatar answered Feb 20 '23 08:02

Jon Skeet