I am trying to create a library that handles big integer arithmetic. Big integers are stored in a struct:
typedef struct BigInt BigInt;
struct BigInt
{
    uint32_t size;
    uint32_t *data;
};
The first member is a uint32_t containing the length of the number (in 32-bit words), and the second member is a pointer to the actual number data (stored in two's complement). I have written a simple toHex(BigInt *a) function that allocates memory, writes the hexadecimal value of the big integer into that string, and returns its address.
In main, I have the following:
int main(int argc, char *argv[])
{
    char *ap, *bp;
    BigInt *a = fromUInt32(0x7fffffff), *b = fromUInt32(1), *c = fromUInt32(0x80000000);
    _add(a, b);
    ap = toHex(a);
    bp = toHex(c);
    printf("%s\n", ap);
    printf("%s\n%s\n", ap, bp);
    printf("%s\n%s\n", ap, bp);
    free(ap);
    free(bp);
    deleteBigInt(a);
    deleteBigInt(b);
    deleteBigInt(c);
}
which, curiously enough, prints
0000000080000000
0
0000000080000000
0000000080000000
0000000080000000
So the second printf statement prints something different for ap than the first and third printf statements do. It seems the first printf statement is correct, and the second one is messing up. I have stepped through my code with GDB, and after the evaluation of toHex, ap points to the string "0000000080000000", terminated by a null byte.
I am completely baffled. As far as I can see, the possibilities are:
1. I have run into undefined behaviour for some weird reason.
2. In _add I call a routine written in x86 assembly code; there may be an error in it (but I do adhere to GCC's calling conventions by preserving esi, edi, ebx, ebp, and esp).
3. There is a bug in printf, which seems very unlikely.
Also I have an obvious "memory leak" (quoted because the opinions on what a memory leak exactly is seem to differ) by not freeing the memory allocated by toHex, but this should not matter.
My toHex function was requested by Sourav Ghosh, and is as follows:
char numToHex[] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };
char *toHex(BigInt *a)
{
    char *result, *ptr;
    // allocate enough space for 8 characters for each uint32_t and 1 terminating 0
    ptr = result = malloc(a->size * 8 + 1);
    // loop over the uint32_t's stored in a->data
    // (there are a->size of them)
    for (uint32_t i = 0; i < a->size; i++)
        // parse 8 blocks of 4 bits
        for (uint32_t j = 0; j < 8; j++)
            // grab the right bits and convert them to a hex digit
            *(ptr++) = numToHex[(a->data[i] >> ((7 - j) * 4)) & 0xf];
    // add a terminating zero byte
    *ptr = 0;
    return result;
}
I have isolated this weird behaviour in a program of ~100 lines of C + ~70 lines of assembly. Compiling can be done with
nasm -f elf -s <AssemblyName>.asm
gcc <CFile>.c <AssemblyName>.o -o <OutputProgram> -m32 -std=c99 -g
The code is uncommented and meant for people who want to inspect the behaviour for themselves.
EDIT: Jan Spurny and Matt McNabb urged me to use Valgrind. Valgrind says:
Invalid read of size 1
   at 0x40A5685: vfprintf (vfprintf.c:1655)
   by 0x40AA7FE: printf (printf.c:34)
   by 0x4075904: (below main) (libc-start.c:260)
 Address 0x42121af is 1 bytes before a block of size 17 alloc'd
   at 0x40299D8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
   by 0x804887D: toHex (weird.c:107)
   by 0x8048565: main (weird.c:30)
But this doesn't make sense, as I set result to the return value of malloc in toHex, and didn't change anything after that. My bet now is that some register is getting corrupted in the assembly function.
Edit2: After checking with GDB, I can see that no registers are corrupted. I am still clueless.
The reduce function has a bug:
while (i < a->size && !(a->data[i])) i++;
if (a->data[i] & SIGNBIT) i--;
If the loop ends because the i < a->size condition fails (that is, i has reached a->size), then a->data[i] accesses out of bounds, causing undefined behaviour. The other branch of reduce has the same problem.
There's a bug in the _add function (although this is not triggered in your test case):
void *k = realloc(a->data, b->size * 4);
memmove((void *)(a->data + displacement), (void *)a->data, a->size * 4);
// ....other code using `a->data`
After realloc, a->data becomes indeterminate, so it causes undefined behaviour to use it. This could explain your symptoms, as a future allocation might re-use the same freed block which a->data is still pointing to.
Maybe you meant to also have a line a->data = k; after this?
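As a rough sketch of the usual pattern: keep the realloc result in a temporary, check it, and only then update a->data. The helper name growBigInt is hypothetical, and zero-filling the new leading words is an assumption that only suits non-negative values (a negative two's complement value would need sign extension instead).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct BigInt BigInt;
struct BigInt
{
    uint32_t size;
    uint32_t *data;
};

/* Hypothetical helper: grow a->data to newSize words, keeping the existing
 * words at the least significant end. Returns 0 on success, -1 on failure. */
static int growBigInt(BigInt *a, uint32_t newSize)
{
    assert(a != NULL && newSize > 0 && newSize >= a->size);

    uint32_t *k = realloc(a->data, (size_t)newSize * sizeof *k);
    if (k == NULL)
        return -1;              /* a->data is untouched and still valid */
    a->data = k;                /* from here on, only use the new pointer */

    uint32_t displacement = newSize - a->size;
    /* Shift the old words to the end of the enlarged buffer. */
    memmove(a->data + displacement, a->data, (size_t)a->size * sizeof *k);
    /* Assumption: the value is non-negative, so pad with zero words. */
    memset(a->data, 0, (size_t)displacement * sizeof *k);
    a->size = newSize;
    return 0;
}
```

Using sizeof *k instead of a literal 4 also keeps the byte counts in step with the element type.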
To get good help with debugging your code it would be great if you could do the following:
1. Check the return value of the *alloc-family functions and exit if NULL is returned. Otherwise you get undefined behaviour (it's not reliable to expect a segfault).
2. Test newAddress to check it actually returns what you expected in your test case.
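For the allocation check, one common approach is a small wrapper that exits on failure. The name xmalloc below is a hypothetical helper, not something from the posted code; this is a sketch, not the only way to handle allocation errors.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: behaves like malloc, but never returns NULL.
 * On failure it reports the error and terminates the program. */
static void *xmalloc(size_t n)
{
    /* malloc(0) may legitimately return NULL, so require a real size. */
    assert(n > 0);

    void *p = malloc(n);
    if (p == NULL) {
        fprintf(stderr, "out of memory (%zu bytes requested)\n", n);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

In toHex, for example, the allocation could then be written as ptr = result = xmalloc(a->size * 8 + 1); so a failed allocation stops the program immediately instead of writing through a null pointer.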