
Code sequences for TLS on ARM

The ELF Handling For Thread-Local Storage document gives assembly sequences for the various models (local exec/initial exec/general dynamic) for various architectures. But not ARM -- is there anywhere I can see such code sequences for ARM? I'm working on a compiler and want to generate code that will operate properly with the platform linkers (both program and dynamic).

For clarity, let's assume an ARMv7 CPU and a pretty new kernel and glibc (say 3.13+ / 2.19+), but I'd also be interested in what has to change for older hw/sw if that's easy to explain.

mwhudson asked Apr 23 '15 08:04


1 Answer

I'm not sure exactly what you are after. However, the assembler sequence to read the thread pointer (for ARMv6+ and a capable kernel) is:

mrc p15, 0, rX, c13, c0, 3  @ read the user read-only thread ID register (TLS pointer)

This register is called TPIDRURO in the ARM manuals (the c13, c0, 2 encoding is TPIDRURW, the user read/write register, which Linux does not use for TLS). Your TLS tables/structures must be reachable from this value (it is a pointer). Reading the register with mrc is faster, but if the hardware lacks it (the kernel does not report HWCAP_TLS), you can call the kernel helper described below, which works on all ARM CPUs supported by Linux.

The intent of address 0xffff0fe8 seems to be that you can execute those 4 bytes instead of issuing the above instruction directly (with rX == r0), in case the instruction differs on some machine.


How the thread pointer is obtained depends on the CPU type. The kernel provides a helper in the vector page at 0xffff0fe0 (see entry-armv.S); if the hardware has no TLS register, the value is kept in the process/thread structure instead. The helper is documented in kernel_user_helpers.txt.

Usage example:

#include <stdio.h>

/* Kernel-provided helper at a fixed address in the vector page. */
typedef void * (__kuser_get_tls_t)(void);
#define __kuser_get_tls (*(__kuser_get_tls_t *)0xffff0fe0)

void foo(void)
{
    void *tls = __kuser_get_tls();
    printf("TLS = %p\n", tls);
}

You set the TLS pointer with a syscall. clone is one way to set up a thread context. The thread_info holds all registers for a thread; it may share an mm (memory management, i.e. the process memory view) with other task_structs. That is, the thread_info has a tp_value for each created thread.

Here is a discussion of the ARM implementation. ELF, nptl, glibc and the Linux kernel are all involved (and are useful search terms for further investigation). The syscall for get_tls() was probably too expensive, so the current mainline has a vector page helper (mapped into all threads/processes).

Relevant glibc sources include tls-macros.h, tlsdesc.c, etc. Most likely a full, concise answer will depend on the versions of:

  1. Your ARM CPU.
  2. Your Linux kernel.
  3. Your glibc.
  4. Your compiler (and flags!).

artless noise answered Oct 03 '22 06:10