Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the best algorithm for removing every kth character in a string of length N?

I know there is a naive algorithm that is order N and I'm about convinced that's the only one to use. Is there any other that is:

  1. Asymptotically better
  2. Pipelineable i.e. RAW,WAR friendly
  3. Multithreadable.

I'm sure there is one for (1) but I'm not so sure about (2) and (3). If you also want to mention why this is a good interview question. I'd love to know that as well.

like image 352
West Avatar asked Nov 10 '22 05:11

West


1 Answers

The obvious method is easy to do in-place

void remove_every_kth(char *s, size_t len, int k)
{
    // UNTESTED CODE, there might be an off-by-one or a wrong loop boundary
    if (k < len)
        return;

    const char *endp = s + len;
    size_t offset = 1;
    // we skip the s[i] = s[i] memmove at offset=0.
    for (s+=k-1 ; s + offset < endp-(k-1) ; s+=k-1) {
        // every iteration copies k-1 characters, and advances s by k-1
        memmove(s, s+offset, k-1);
        offset++;
    }
    size_t lastchunk = endp - (s+offset);  // some number (less than k) of bytes left in the input
    memmove(s, s+offset, lastchunk);
    s[lastchunk] = '\0';
}
// equivalently, we could have kept a pointer to the read position,
// like const char* read = s+offset;
// and incremented it by k, while incrementing s by k-1


    // for (int i=0 ; i < k ; i++)  // implicit length string
    //    if (!s[i]) return;    // check for length < k

Since k is constant, you can calculate where to find the input character for any output position. out[i] = in[i + i/k]. There's nothing data-dependent, so this is certainly multithreadable if you don't need to do it in-place, and you have the length of the string in advance. Just farm out the necessary memcpy calls to multiple threads. (I wrote the simple version with memmove instead of a char-pointer loop to make this more obvious, as well as for much better performance with medium to large k. It probably sucks for small k.)

For multithreaded in-place, there's something to gain if k is large, so that even towards the end of a long string, the source and destination of most of the copying is within the same chunk. Each work unit does:

  • don't touch the first offset = chunk_number * chunk_size / k bytes, the previous work unit needs to read them.
  • save the second offset bytes to a temp array.
  • memmove(chunk + offset, chunk + offset*2, chunk_size - offset) (i.e. do the memmove for all the bytes that aren't needed by the previous work unit).
  • spin-wait for the previous chunk to be marked as done by the thread handling it. (Prob. with a separate data structure, because just watching the data at the last overlapping position won't work. It might be overwritten with the same value.)
  • copy from the temp buffer to the beginning of the chunk, where the data belongs
  • mark the work chunk as completed.

For small k, in-place multithread is futile, because most of the bytes in a chunk need to be overwritten with bytes from later chunks. (very large chunks help some.)

like image 136
Peter Cordes Avatar answered Nov 14 '22 21:11

Peter Cordes