Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is the execution time of this unique string function reduced from the naive O(n^2) approach?

Tags:

The generic algorithm to deduce if a string contains all unique characters (and that does not use any other data structures) says to go through the string, iterating each letter against the entire string searching for a match. This approach is O(n^2).

The approach below (written in C) uses an offset for the iterating over the string part, since for example in a short string there is no reason to test the last character with the first character as the first character already did that.

My question is this: Is the runtime of the algorithm then O(n!) or something like O(nlogn)?

#include <stdio.h>  int strunique(const char *str) {     size_t offset = 1;     char *scout = (char *)str, *start;      for (; *scout != '\0'; ++scout, ++offset)         for (start = (char *)str + offset; *start != '\0'; ++start)             if (*start == *scout)                 return 0;      return 1; }  int main(void) {     printf("%d\n", strunique("uniq"));     printf("%d\n", strunique("repatee"));      return 0; } 
like image 643
quickerque Avatar asked May 21 '15 00:05

quickerque


2 Answers

In addition to other answers, I'd like to point out that the problem can be solved in O(1) without additional memory, and without modifying the contents of the input string.

First, do strnlen(str, 256). If you get more than 255, return 0. By the pigeonhole principle, some character must occur more than once. This operation takes only O(1) since we examine only a bounded prefix of the string.

Now, the string is shorter than a constant (256), so use any naive algorithm to finish in only O(1) additional time.

like image 107
Atsby Avatar answered Sep 22 '22 22:09

Atsby


No, it's still O(n^2). You just slightly improved the constant. You still have to make two loops- basically the naive count the loops way of measuring big O time should tell you this.

Also, there is no such thing as O(n+1/2n). Big O notation is to give you an idea of the order of magnitude something should take. n+1/2n= 1.5n. Since big O drops all constant factors, that would just be n.

You can beat O(n^2) without extra memory though. If nothing else you can sort the strings by ascii value (nlog(n) time) and then walk the array looking for dupes (n time) for O(n+nlogn)=O(nlogn) time. There's probably other tricks as well.

Note that the sorting approach may not give better runtime though- the naive way has a best case runtime of 1, while a sort first algorithm has to sort, so it has a best case of nlogn. So best big-O time may not be the best choice.

like image 45
Gabe Sechan Avatar answered Sep 19 '22 22:09

Gabe Sechan