
GCC warns about gettid() syscall wrapper, with glibc 2.30-8

The man page and two SO posts (#1, #2) all suggest that gettid() was added in glibc 2.30. According to ldd --version I am using glibc 2.30-8, but gcc still complains: warning: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Wimplicit-function-declaration]. I can ignore the warning, and the program runs fine.

The header I tried to use with gettid() was <sys/types.h>, following the man page. Did I miss something?

Calling syscall(SYS_gettid) with header <sys/syscall.h> triggers no warning from gcc.

QnA asked May 23 '20 23:05



1 Answer

Note: The wrapper is so new that it shouldn't be relied upon without conditional compilation, and it can probably never be relied upon unconditionally.

See below.


Until February 2019, this is what the manpage said:

From: man gettid:

Note: There is no glibc wrapper for this system call; see NOTES.

And, from NOTES:

Glibc does not provide a wrapper for this system call; call it using syscall(2).

See the remainder of the NOTES section for more details. The reason is that glibc wants you to use something pthread_self-compatible instead.

It's a glibc quirk/philosophy that I, personally, don't agree with, because it is sometimes necessary to use the real Linux tid value.


The new way is [as of February 2019, per Joseph]:

#define _GNU_SOURCE
#include <unistd.h>

to get a gettid declaration.

But ...

This can never be relied upon [without a #if/#ifdef] because the gettid syscall has been around since Linux 2.4.11 (2001), so many installed glibc versions know the syscall but lack the wrapper.

Thus, it has taken glibc more than 15 years to add a wrapper function!?!? :-( IMO, too little, too late.

Code that uses the new method breaks backwards compatibility with every distro published in that period, because their glibc has no gettid wrapper.

So, ignoring this and continuing to use syscall is the way to go. It's the simplest way to maintain forward/backward compatibility.
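As a minimal sketch of the syscall approach, assuming a Linux system (the helper name raw_gettid is hypothetical, not something glibc or the kernel provides):

```c
#define _GNU_SOURCE         /* makes unistd.h declare syscall() */
#include <unistd.h>         /* syscall(), getpid() */
#include <sys/syscall.h>    /* SYS_gettid */
#include <sys/types.h>      /* pid_t */

/* raw_gettid is a hypothetical name, chosen so it cannot collide with
   the gettid() that glibc 2.30+ declares in unistd.h. */
static pid_t
raw_gettid(void)
{
    /* syscall() returns long; the kernel tid fits in pid_t */
    return (pid_t) syscall(SYS_gettid);
}
```

In the main thread, raw_gettid() returns the same value as getpid(). In any other thread it returns that thread's own kernel tid.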

We could create the wrapper, but it's more complex [something like]:

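/* Sketch only: __GLIBC_PREREQ must already be defined at this point
   (glibc defines it in <features.h>). */
#if __GLIBC_PREREQ(2,30)
/* glibc 2.30+: unistd.h declares gettid() when _GNU_SOURCE is defined */
#define _GNU_SOURCE
#include <unistd.h>

#else

/* older glibc: provide our own wrapper around the raw syscall */
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

pid_t
gettid(void)
{

    return syscall(SYS_gettid);
}
#endif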

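In fact, it's even more complex than that because we have to ensure we have a definition of __GLIBC_PREREQ [and __GLIBC__ and __GLIBC_MINOR__] before we start including things. glibc defines these in <features.h>, which is only pulled in once some glibc header has been included.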

And, getting that right is problematic. So, one would probably be forced to always do:

#define _GNU_SOURCE
#include <unistd.h>

And, then add the #if afterwards. Or, whatever the actual arcane way would be [which I can't personally test because I'm on a system that uses the old method].
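One possible shape for "include first, test afterwards" is sketched below. It assumes that glibc's headers define __GLIBC__ and __GLIBC_PREREQ once any of them has been included, and it sidesteps the naming problem by using its own names: portable_gettid and HAVE_GLIBC_GETTID are both hypothetical, introduced only for this illustration.

```c
#define _GNU_SOURCE         /* must come before the first #include */
#include <unistd.h>         /* gettid() on glibc 2.30+, syscall() */
#include <sys/syscall.h>    /* SYS_gettid */
#include <sys/types.h>      /* pid_t */

/* Only now are the glibc version macros visible. The nested #if keeps
   __GLIBC_PREREQ(...) from being evaluated where it is not defined. */
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 30)
#  define HAVE_GLIBC_GETTID 1   /* hypothetical macro name */
# endif
#endif

/* Hypothetical wrapper: uses the library function when the headers
   declare it, and falls back to the raw syscall otherwise. */
static pid_t
portable_gettid(void)
{
#ifdef HAVE_GLIBC_GETTID
    return gettid();
#else
    return (pid_t) syscall(SYS_gettid);
#endif
}
```

Because the wrapper is not itself called gettid, it compiles whether or not unistd.h declares one. The cost is that callers have to use the new name.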

It's messy and [IMO] not worth the trouble.


And, it's even worse.

Because the old method has been around for so long, many apps have already defined their own gettid wrapper function. So, the new declaration may collide with this.

So, without doing anything, a developer could recompile their [previously] working code and the build will now error out.

To prevent this, unistd.h should probably only declare gettid if the user has put some #define before including it (similar to _GNU_SOURCE) like:

#define _DECLARE_GETTID
#include <unistd.h>

And, because the new method breaks 15 years of compatibility, the manpage needs to document this fact and explain the old method. Developers trying to write cross-platform code, across multiple kernel versions and multiple glibc versions, need to know about this.


UPDATE:

"There's very good reason it was not added before - it was not a supported abstraction for application use."

That still doesn't give them license to break backwards compatibility.

And, the missing functionality had to be synthesized for special apps (e.g. those that worked closely with a given device driver).

"For a long time ..."

15+ years ...

"... glibc was entertaining the idea of possible thread models where the kernel tid would not be an invariant for the thread lifetime."

The kernel knows nothing of pthread_t. It only knows about tid. That is how an application has to communicate with the kernel (e.g. tgkill).
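To make the tgkill point concrete, here is a small sketch that addresses a thread by its kernel tid. It calls the raw syscall (so it does not depend on a libc wrapper) and sends signal 0, which delivers nothing and only asks the kernel to run its checks. The helper name can_signal_thread is hypothetical.

```c
#define _GNU_SOURCE
#include <unistd.h>         /* syscall(), getpid() */
#include <sys/syscall.h>    /* SYS_tgkill, SYS_gettid */
#include <sys/types.h>      /* pid_t */

/* Hypothetical helper: returns 1 if the kernel accepts a signal-0 probe
   for thread `tid` in thread group `tgid`, and 0 otherwise (no such
   thread, invalid id, or no permission to signal it). */
static int
can_signal_thread(pid_t tgid, pid_t tid)
{
    /* both ids are kernel ids; a pthread_t cannot be passed here */
    return syscall(SYS_tgkill, tgid, tid, 0) == 0;
}
```

For example, can_signal_thread(getpid(), (pid_t) syscall(SYS_gettid)) probes the calling thread itself.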

There is nothing in the kernel's model that would change a tid midstream, any more than it would change a pid on a process in the middle of its execution.

That's because the kernel treats a tid just like a pid for purposes of scheduling. There is a task struct [indexed/identified by pid/tid] for each process and each thread.

To do otherwise, would require a major overhaul of all kernel schedulers, task/thread hierarchies, etc. So, glibc couldn't possibly do this alone. And, to what end?

After the clone syscall to create a thread [with the option to use the same address space], it's largely an invisible distinction, and a thread gets equal weight when being scheduled [based on scheduler and task priority].

That is one of the great benefits of the kernel's approach: that tasks/threads are first class citizens when being scheduled [unlike WinX].

Side note: I had to program a realtime embedded H/W H.264 video encoder and had to deal with the precursor to NPTL [namely LinuxThreads], and it caused serious issues regarding throughput and latency. Fortunately, NPTL came along before we had to ship. The difference was night and day.

"It was only after relatively recent discussions that it was deemed ok."

The kernel [and/or POSIX] define the semantics, not glibc. glibc gets to follow and implement, not dictate.

And, because the glibc people were being recalcitrant [and obnoxious], and took 15 years to decide [the obvious], IMO, they've lost any bargaining rights on this matter.

Craig Estey answered Oct 03 '22 13:10
