C - Segmentation Fault with strcmp?

I appear to be getting a segmentation fault somewhere with the strcmp function. I'm still very new to C and I can't see why it gives me the error.

int linear_probe(htable h, char *item, int k){
  int p;
  int step = 1;
  do {
    p = (k + step++) % h->capacity;
  }while(h->keys[p] != NULL && strcmp(h->keys[p], item) != 0);
  return p;
}

gdb:

Program received signal SIGSEGV, Segmentation fault.
0x0000003a8e331856 in __strcmp_ssse3 () from /lib64/libc.so.6

(gdb) frame 1
#1  0x0000000000400ea6 in linear_probe (h=0x603010, item=0x7fffffffde00 "ksjojf", k=-1122175319) at htable.c:52

Edit: insertion code and htable struct

int htable_insert(htable h, char *item){
  unsigned int k = htable_word_to_int(item);
  int p = k % h->capacity;

  if(NULL == h->keys[p]){
    h->keys[p] = (char *)malloc(strlen(item)+1);
    strcpy(h->keys[p], item);
    h->freqs[p] = 1;
    h->num_keys++;
    return 1;
  }

  if(strcmp(h->keys[p], item) == 0){
    return ++h->freqs[p];
  }

  if(h->num_keys == h->capacity){
    return 0;
  }

  if(h->method == LINEAR_P) p = linear_probe(h, item, k);
  else p = double_hash(h, item, k);

  if(NULL == h->keys[p]){
    h->keys[p] = (char *)malloc(strlen(item)+1);
    strcpy(h->keys[p], item);
    h->freqs[p] = 1;
    h->num_keys++;
    return 1;
  }else if(strcmp(h->keys[p], item) == 0){
    return ++h->freqs[p]; 
  }
  return 0;
}

struct htablerec{
  int num_keys;
  int capacity;
  int *stats;
  char **keys;
  int *freqs;
  hashing_t method;
};

Thanks

Edit: valgrind - me entering random values to add to table

sdkgj
fgijdfh
dfkgjgg
jdf
kdjfg
==25643== Conditional jump or move depends on uninitialised value(s)
==25643==    at 0x40107E: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643== 
fdkjb
kjdfg
kdfg
nfdg
lkdfg
oijfd
kjsf
vmf
kjdf
kjsfg
fjgd
fgkjfg
==25643== Invalid read of size 8
==25643==    at 0x400E0E: linear_probe (htable.c:51)
==25643==    by 0x401095: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643==  Address 0x4c342a0 is not stack'd, malloc'd or (recently) free'd
==25643== 
==25643== Invalid read of size 8
==25643==    at 0x400E2B: linear_probe (htable.c:51)
==25643==    by 0x401095: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643==  Address 0x4c342a0 is not stack'd, malloc'd or (recently) free'd
==25643== 
==25643== Invalid read of size 1
==25643==    at 0x4A06C51: strcmp (mc_replace_strmem.c:426)
==25643==    by 0x400E3C: linear_probe (htable.c:51)
==25643==    by 0x401095: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643==  Address 0x210 is not stack'd, malloc'd or (recently) free'd
==25643== 
==25643== 
==25643== Process terminating with default action of signal 11 (SIGSEGV)
==25643==  Access not within mapped region at address 0x210
==25643==    at 0x4A06C51: strcmp (mc_replace_strmem.c:426)
==25643==    by 0x400E3C: linear_probe (htable.c:51)
==25643==    by 0x401095: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643==  If you believe this happened as a result of a stack
==25643==  overflow in your program's main thread (unlikely but
==25643==  possible), you can try to increase the size of the
==25643==  main thread stack using the --main-stacksize= flag.
==25643==  The main thread stack size used in this run was 8388608.
==25643== 
==25643== HEAP SUMMARY:
==25643==     in use at exit: 1,982 bytes in 28 blocks
==25643==   total heap usage: 28 allocs, 0 frees, 1,982 bytes allocated
==25643== 
==25643== LEAK SUMMARY:
==25643==    definitely lost: 0 bytes in 0 blocks
==25643==    indirectly lost: 0 bytes in 0 blocks
==25643==      possibly lost: 0 bytes in 0 blocks
==25643==    still reachable: 1,982 bytes in 28 blocks
==25643==         suppressed: 0 bytes in 0 blocks
==25643== Rerun with --leak-check=full to see details of leaked memory
==25643== 
==25643== For counts of detected and suppressed errors, rerun with: -v
==25643== Use --track-origins=yes to see where uninitialised values come from
==25643== ERROR SUMMARY: 7 errors from 4 contexts (suppressed: 6 from 6)
Segmentation fault (core dumped)

static unsigned int htable_word_to_int(char *word){
  unsigned int result = 0;
  while(*word != '\0'){
    result = (*word++ + 31 * result);
  }
  return result;
}
asked Jan 19 '23 19:01 by rtheunissen


1 Answer

Apart from the possibility that the values in your htable may be invalid pointers (i.e., neither NULL nor a pointer to a decent C string), you have a serious problem of encountering an infinite loop if it contains neither a NULL nor the string you're looking for.

For the immediate problem, try changing the code to:

#define FLUSH fflush (stdout); fsync (fileno (stdout))

int linear_probe (htable h, char *item, int k) {
    int pos = k;
    do {
        pos = (pos + 1) % h->capacity;
        printf ("========\n");                    FLUSH;
        printf ("inpk: %d\n",   k);               FLUSH;
        printf ("posn: %d\n",   pos);             FLUSH;
        printf ("cpct: %d\n",   h->capacity);     FLUSH;
        printf ("keyp: %p\n",   (void *) h->keys[pos]);    FLUSH;  /* %p expects a void * */
        printf ("keys: '%s'\n", h->keys[pos] ? h->keys[pos] : "(null)");    FLUSH;  /* NULL with %s is undefined */
        printf ("item: '%s'\n", item);            FLUSH;
        printf ("========\n");                    FLUSH;
    } while ((pos != k)
          && (h->keys[pos] != NULL)
          && (strcmp (h->keys[pos], item) != 0));
    return pos;
}

Those debug statements should give you an indication as to what's going wrong.


Since you're getting:

inpk: -2055051140
posn: -30
cpct: 113
keyp: 0x100000001

right before the crash, it's evident that someone is passing in a bogus value for k. In C, the % operator applied to a negative dividend yields a zero or negative result (guaranteed since C99, where division truncates toward zero; implementation-defined in C89), so you're getting a negative value for pos as well. And since indexing h->keys[-30] is undefined behaviour, all bets are off.
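
To make the arithmetic concrete, here is a minimal sketch, assuming C99 or later semantics for % and the capacity of 113 from the debug output. The helper name wrap_index is hypothetical and not part of the original code; it only illustrates one way of forcing a remainder back into the valid index range.

```c
/* Hypothetical helper (not in the original code): map any int onto
 * the range [0, capacity). Assumes capacity > 0. */
int wrap_index(int k, int capacity) {
    int r = k % capacity;   /* C99: the result takes the sign of k */
    if (r < 0)
        r += capacity;      /* shift a negative remainder into range */
    return r;
}
```

With the values above, (-2055051140 + 1) % 113 is -30, which matches the posn line, whereas wrap_index(-2055051139, 113) gives 83, a valid index.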

Either find and fix the code that's passing in that bogus value (probably an uninitialised variable) or protect your function by changing:

int pos = k;

into:

int pos;
if ((k < 0) || (k >= h->capacity))
    k = 0;
pos = k;

at the start of your function. I'd actually do both but then I'm pretty paranoid :-)


And, based on yet another update (the hash key calculation): if you generate an unsigned int and then blindly use that as a signed int, you've got a good chance of getting negative values:

#include <stdio.h>

int main (void) {
    unsigned int x = 0xffff0000U;
    int y = x;
    printf ("%u %d\n", x, y);
    return(0);
}

On a typical system with a 32-bit two's-complement int, this outputs:

4294901760 -65536

My suggestion is to use unsigned integers for values that are clearly meant to be unsigned.
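
As a rough sketch of what that could look like for the probe itself, the version below keeps the hash and every index unsigned and also bounds the loop so a full table cannot spin forever. The name linear_probe_u and the choice to pass keys and capacity directly (rather than an htable) are assumptions made only to keep the example standalone; returning capacity as a "not found, table full" marker is likewise an illustrative convention, not something from the original code.

```c
#include <stddef.h>
#include <string.h>

/* Sketch only: keys and capacity mirror the fields of the htable struct
 * in the question. Returns the slot that holds item, or the first empty
 * slot found, or capacity if every probed slot is occupied by other keys.
 * Assumes capacity > 0. */
unsigned int linear_probe_u(char **keys, unsigned int capacity,
                            const char *item, unsigned int k) {
    unsigned int start = k % capacity;  /* unsigned %, never negative */
    unsigned int step;

    /* Probe at most capacity - 1 other slots, then give up. */
    for (step = 1; step < capacity; step++) {
        unsigned int p = (start + step) % capacity;
        if (keys[p] == NULL || strcmp(keys[p], item) == 0)
            return p;
    }
    return capacity;  /* table full and item not present */
}
```

A caller would compare the result against capacity before indexing keys, which replaces the separate num_keys == capacity check with a guarantee that the loop terminates.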

answered Jan 28 '23 13:01 by paxdiablo