This week I wrote an extension for the printf family of functions that accepts %b to print binary. For that, I used the function register_printf_specifier().
Now I wonder if I can do the same for the scanf family of functions, so that it accepts binary input and writes it into a variable.
Is there any extension that allows me to do that?
TL;DR: No. At least not when using glibc.
I downloaded a recent glibc version:
% wget https://ftp.gnu.org/gnu/glibc/glibc-2.29.tar.gz
% tar -xzf glibc-2.29.tar.gz
Then I searched the tree for a scanf-family function that came to mind, in this case vfscanf:
% find | grep "vfscanf"
From experience I know that the real implementations usually live in files named -internal, but I looked through the whole output anyway:
./stdio-common/iovfscanf.c
./stdio-common/isoc99_vfscanf.c
./stdio-common/vfscanf-internal.c
./stdio-common/vfscanf.c
./sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c
./sysdeps/ieee754/ldbl-opt/nldbl-isoc99_vfscanf.c
./sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c
I decided to check ./stdio-common/vfscanf.c, which turned out to be just a thin wrapper around the internal function:
% cat ./stdio-common/vfscanf.c
int
___vfscanf (FILE *s, const char *format, va_list argptr)
{
  return __vfscanf_internal (s, format, argptr, 0);
}
Next, I looked through vfscanf-internal.c and found the format parser:
% cat ./stdio-common/vfscanf-internal.c | head -n 1390 | tail -n 20
            }
          break;
        case L_('x'): /* Hexadecimal integer. */
        case L_('X'): /* Ditto. */
          base = 16;
          goto number;
        case L_('o'): /* Octal integer. */
          base = 8;
          goto number;
        case L_('u'): /* Unsigned decimal integer. */
          base = 10;
          goto number;
        case L_('d'): /* Signed decimal integer. */
          base = 10;
          flags |= NUMBER_SIGNED;
          goto number;
Then I looked at the end of the file and found the last case labels of that switch:
% cat ./stdio-common/vfscanf-internal.c | tail -n 60
                  ++done;
                }
            }
          break;
        case L_('p'): /* Generic pointer. */
          base = 16;
          /* A PTR must be the same size as a `long int'. */
          flags &= ~(SHORT|LONGDBL);
          if (need_long)
            flags |= LONG;
          flags |= READ_POINTER;
          goto number;
        default:
          /* If this is an unknown format character punt. */
          conv_error ();
        }
    }
  /* The last thing we saw int the format string was a white space.
     Consume the last white spaces. */
  if (skip_space)
    {
      do
        c = inchar ();
      while (ISSPACE (c));
      ungetc (c, s);
    }
 errout:
  /* Unlock stream. */
  UNLOCK_STREAM (s);
  scratch_buffer_free (&charbuf.scratch);
  if (__glibc_unlikely (done == EOF))
    {
      if (__glibc_unlikely (ptrs_to_free != NULL))
        {
          struct ptrs_to_free *p = ptrs_to_free;
          while (p != NULL)
            {
              for (size_t cnt = 0; cnt < p->count; ++cnt)
                {
                  free (*p->ptrs[cnt]);
                  *p->ptrs[cnt] = NULL;
                }
              p = p->next;
              ptrs_to_free = p;
            }
        }
    }
  else if (__glibc_unlikely (strptr != NULL))
    {
      free (*strptr);
      *strptr = NULL;
    }
  return done;
}
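That is the code that ends the function. In other words, the conversion specifiers of the scanf family are hard-coded as case labels of one switch statement in vfscanf-internal.c, and any unknown conversion character lands in the default label, which reports a conversion error. Nothing in the code shown looks up user-registered handlers, so you can't register a new one without patching the glibc source itself (which of course won't be portable).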
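If the goal is only to get a binary number into a variable, one workaround that stays within standard C is to let scanf collect the digits with a scanset and convert them afterwards with strtoul() in base 2. The following is a minimal sketch, not a drop-in replacement for a %b conversion: parse_binary is a hypothetical helper name (it is not part of glibc), and it assumes the number has at most 64 binary digits.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not a glibc function: parses a run of '0'/'1'
   characters at the start of `text` (after optional whitespace) and
   stores the value in *out.  Returns 1 on success, 0 if no binary
   digits were found or the value does not fit in unsigned long.  */
int
parse_binary (const char *text, unsigned long *out)
{
  char digits[65];  /* Assumption: at most 64 digits, plus the terminator.  */

  /* The leading space skips whitespace; %64[01] accepts only the
     characters '0' and '1' and stops at the first other character.  */
  if (sscanf (text, " %64[01]", digits) != 1)
    return 0;

  errno = 0;
  unsigned long value = strtoul (digits, NULL, 2);  /* Base 2 conversion.  */
  if (errno == ERANGE)
    return 0;

  *out = value;
  return 1;
}
```

Calling parse_binary ("1011", &v) stores 11 in v and returns 1. The same " %64[01]" format should work with scanf() or fscanf() when reading from a stream instead of a string.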