Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding unique elements in an string array in C

C bothers me with its handling of strings. I have a pseudocode like this in my mind:

char *data[20]; 

char *tmp; int i,j;

for(i=0;i<20;i++) {
  tmp = data[i]; 
  for(j=1;j<20;j++) 
  {
    if(strcmp(tmp,data[j]))
      //then except the uniqueness, store them in elsewhere
  }
}

But when i coded this the results were bad.(I handled all the memory stuff,little things etc.) The problem is in the second loop obviously :D. But i cannot think any solution. How do i find unique strings in an array.

Example input : abc def abe abc def deg entered unique ones : abc def abe deg should be found.

like image 462
LuckySlevin Avatar asked May 10 '10 16:05

LuckySlevin


1 Answers

You could use qsort to force the duplicates next to each other. Once sorted, you only need to compare adjacent entries to find duplicates. The result is O(N log N) rather than (I think) O(N^2).

Here is the 15 minute lunchtime version with no error checking:

  typedef struct {
     int origpos;
     char *value;
  } SORT;

  int qcmp(const void *x, const void *y) {
     int res = strcmp( ((SORT*)x)->value, ((SORT*)y)->value );
     if ( res != 0 )
        return res;
     else
        // they are equal - use original position as tie breaker
        return ( ((SORT*)x)->origpos - ((SORT*)y)->origpos );
  }

  int main( int argc, char* argv[] )
  {
     SORT *sorted;
     char **orig;
     int i;
     int num = argc - 1;

     orig = malloc( sizeof( char* ) * ( num ));
     sorted = malloc( sizeof( SORT ) * ( num ));

     for ( i = 0; i < num; i++ ) {
        orig[i] = argv[i + 1];
        sorted[i].value = argv[i + 1];
        sorted[i].origpos = i;
        }

     qsort( sorted, num, sizeof( SORT ), qcmp );

     // remove the dups (sorting left relative position same for dups)
     for ( i = 0; i < num - 1; i++ ) {
        if ( !strcmp( sorted[i].value, sorted[i+1].value ))
           // clear the duplicate entry however you see fit
           orig[sorted[i+1].origpos] = NULL;  // or free it if dynamic mem
        }

     // print them without dups in original order
     for ( i = 0; i < num; i++ )
        if ( orig[i] )
           printf( "%s ", orig[i] );

     free( orig );
     free( sorted );
  }
like image 192
Mark Wilkins Avatar answered Oct 12 '22 22:10

Mark Wilkins