Setting performance aside and without going into CMSIS-RTOS specifics (which I don't know), you can allocate the space your variables need yourself, either on the heap or as a static or global variable. I would suggest an array of structures: when you create a thread, pass a pointer to the next unused structure to the thread function.

With a static or global variable, it helps to know how many threads run in parallel, so that you can bound the size of the preallocated memory.

EDIT: Added a sample TLS implementation based on pthreads:

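#include <pthread.h>
#include <stddef.h>

#define MAX_PARALLEL_THREADS 10

/* Per-thread data. The field is a placeholder: put here what each thread needs. */
struct tls_data {
    int id;
};

static pthread_t threads[MAX_PARALLEL_THREADS];
static struct tls_data tls_data[MAX_PARALLEL_THREADS];
static int tls_data_free_index = 0;

static void *worker_thread(void *arg) {
    /* Must not be static: every thread needs its own pointer. */
    struct tls_data *data = (struct tls_data *) arg;

    /* Code omitted. */
    (void) data;
    return NULL;
}

/* Not thread-safe: call from one thread only, or guard the index with a mutex. */
static int spawn_thread(void) {
    if (tls_data_free_index >= MAX_PARALLEL_THREADS) {
        // Consider increasing MAX_PARALLEL_THREADS
        return -1;
    }

    /* Prepare thread data - code omitted. */

    if (pthread_create(&threads[tls_data_free_index], NULL, worker_thread,
                       &tls_data[tls_data_free_index]) != 0) {
        return -1;
    }

    /* Claim the slot only once the thread really exists. */
    tls_data_free_index++;
    return 0;
}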
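
The heap option mentioned at the start of this answer could look roughly like the sketch below. It assumes POSIX threads, as in the sample above; `struct thread_ctx`, `spawn_worker` and `join_worker` are hypothetical names chosen for illustration, and the "work" is a stand-in.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread data; replace the fields with what your threads need. */
struct thread_ctx {
    int id;
    int result;
};

static void *worker(void *arg) {
    struct thread_ctx *ctx = arg;   /* private to this thread */
    ctx->result = ctx->id * 2;      /* stand-in for real work */
    return ctx;                     /* hand ownership back to the joiner */
}

/* Allocates a context and starts a thread on it; returns 0 on success, -1 on failure. */
static int spawn_worker(pthread_t *thread, int id) {
    struct thread_ctx *ctx = malloc(sizeof *ctx);
    if (ctx == NULL) {
        return -1;
    }
    ctx->id = id;
    ctx->result = 0;
    if (pthread_create(thread, NULL, worker, ctx) != 0) {
        free(ctx);
        return -1;
    }
    return 0;
}

/* Waits for the thread, returns its result and frees its context. */
static int join_worker(pthread_t thread) {
    void *ret = NULL;
    int result = -1;
    if (pthread_join(thread, &ret) == 0 && ret != NULL) {
        result = ((struct thread_ctx *) ret)->result;
        free(ret);
    }
    return result;
}
```

Compared with the static array, this needs no upper bound on the number of threads, at the cost of one allocation per thread and a clear rule for who frees the context (here, whoever joins the thread).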
Answer from smbear on Stack Overflow
🌐
GNU
gcc.gnu.org › onlinedocs › gcc › ARM-Options.html
ARM Options (Using the GNU Compiler Collection (GCC))
Specify the access model for the thread local storage pointer. The model ‘soft’ generates calls to __aeabi_read_tp.
🌐
GitHub
github.com › JetBrains › kotlin-native › issues › 785
undefined reference to `__aeabi_read_tp' · Issue #785 · JetBrains/kotlin-native
August 9, 2017 - I googled about __aeabi_read_tp a bit and find out that we can tell compiler not to use this instructions: -mtp parameter for gcc and "probabily" -femulated-tls parameter for clang.
Author   JetBrains
🌐
GitHub
github.com › ARM-software › abi-aa › blob › main › addenda32 › addenda32.rst
abi-aa/addenda32/addenda32.rst at main · ARM-software/abi-aa
(v2.01, r2.02) In (new) rtabi32-section4-3-5, we define __aeabi_read_tp() that returns the thread pointer denoted by $tp in Linux for Arm static (initial exec) model, of this specification.
Author   ARM-software
Top answer
1 of 3
3

2 of 3
2

I believe this is possible, but probably tricky.

Here's a paper describing how __thread or thread_local behaves in ELF images (though it doesn't cover the ARM architecture or AEABI):

https://www.akkadia.org/drepper/tls.pdf

The executive summary is:

  • The linker creates .tbss and/or .tdata sections in the resulting executable to provide a prototype image of the thread-local data needed for each thread.
  • At runtime, each thread control block (TCB) has a pointer to a dynamic thread vector (dtv in the paper) that locates the thread-local storage blocks for that thread. A block can be allocated and initialized lazily, the first time the thread accesses a thread-local variable in it. (In the paper that happens in __tls_get_addr(); __aeabi_read_tp() only returns the thread pointer.)
  • Initialization copies the prototype .tdata image and memsets the .tbss image into the allocated storage.
  • When source code accesses a thread-local variable, the compiler generates code that reads the thread pointer from __aeabi_read_tp() and performs the appropriate indirection to reach the storage for that variable.
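
To make the summary above concrete, here is a simplified, host-runnable model of it in plain C. It is only a sketch: `tdata_image`, `TBSS_SIZE`, `current_tp`, `read_tp`, `tls_block_create` and `tls_var` are hypothetical names, the images stand in for what the linker would provide, and real ABIs also reserve space for a control block next to the data, which is left out here.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the linker-provided images (hypothetical contents). */
static const int tdata_image[2] = { 42, 7 };   /* initialized TLS variables */
#define TBSS_SIZE (2 * sizeof(int))            /* zero-initialized TLS variables */
#define TLS_BLOCK_SIZE (sizeof tdata_image + TBSS_SIZE)

/* Thread pointer of the "current" thread. A real RTOS would update this
 * on every context switch. */
static unsigned char *current_tp;

/* Plays the role of __aeabi_read_tp(): return the current thread pointer. */
static void *read_tp(void) {
    return current_tp;
}

/* Builds one thread's TLS block: copy the .tdata image, zero the .tbss part. */
static unsigned char *tls_block_create(void) {
    unsigned char *block = malloc(TLS_BLOCK_SIZE);
    if (block != NULL) {
        memcpy(block, tdata_image, sizeof tdata_image);
        memset(block + sizeof tdata_image, 0, TBSS_SIZE);
    }
    return block;
}

/* What compiler-generated access code amounts to in this model:
 * thread pointer plus a fixed offset. */
static int *tls_var(size_t offset) {
    return (int *) ((unsigned char *) read_tp() + offset);
}
```

Switching `current_tp` between two blocks created with `tls_block_create()` gives each "thread" its own copy of the variable at a given offset, which is the whole point of the mechanism.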

The compiler and linker do all the work you'd expect, but you need to set up and return a "thread pointer" that is structured and filled in exactly the way the compiler expects, because the generated instructions follow those hops directly.

TLS variables can be accessed in several ways, as described in this write-up, which again may or may not fully apply to your compiler and architecture:

http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt

But the problems are roughly the same. When you have runtime-loaded libraries that may bring their own .tbss and .tdata sections, it gets more complicated. You have to expand the thread-local storage for any thread that suddenly tries to access a variable introduced by a library loaded after the storage for that thread was initialized. The compiler has to generate different access code depending on where the TLS variable is declared. You'd need to handle and test every case you want to support.

It's years later, so you have probably either solved this problem or moved on. Either way, it is (was) probably easiest to use your OS's TLS API directly.
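
As an illustration of "using the OS's TLS API directly", here is a minimal sketch that assumes the OS offers POSIX threads (an RTOS will have its own equivalent, which is not shown). The `pthread_key_*` and `pthread_once` calls are standard POSIX; `thread_counter` and `count_worker` are hypothetical names used only for this example.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t counter_key;
static pthread_once_t counter_key_once = PTHREAD_ONCE_INIT;

static void make_counter_key(void) {
    /* free() is called on each thread's value when that thread exits. */
    pthread_key_create(&counter_key, free);
}

/* Returns the calling thread's private counter, creating it on first use. */
static int *thread_counter(void) {
    pthread_once(&counter_key_once, make_counter_key);
    int *counter = pthread_getspecific(counter_key);
    if (counter == NULL) {
        counter = calloc(1, sizeof *counter);
        if (counter != NULL && pthread_setspecific(counter_key, counter) != 0) {
            free(counter);
            counter = NULL;
        }
    }
    return counter;
}

/* Hypothetical worker: bumps its own counter three times and reports the
 * final value through arg (-1 if the counter could not be allocated). */
static void *count_worker(void *arg) {
    int *out = arg;
    for (int i = 0; i < 3; i++) {
        int *counter = thread_counter();
        if (counter == NULL) {
            *out = -1;
            return NULL;
        }
        (*counter)++;
    }
    *out = *thread_counter();
    return NULL;
}
```

No compiler or linker support is needed with this approach: the storage is looked up through a key at run time, so it is slower than `__thread` but avoids having to provide `__aeabi_read_tp()` and the layout behind it.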

🌐
GitHub
github.com › ARM-software › abi-aa › blob › main › rtabi32 › rtabi32.rst
abi-aa/rtabi32/rtabi32.rst at main · ARM-software/abi-aa
In Addenda32 (section 'Linux for Arm static (initial exec) model'), the description of thread-local storage addressing refers to the thread pointer denoted by $tp but does not specify how to obtain its value. void *__aeabi_read_tp(void); /* return the value of $tp */
Author   ARM-software
🌐
Narkive
mono-list.ximian.narkive.com › iUcwG5dJ › undefined-reference-to-aeabi-read-tp
[Mono-list] Undefined reference to __aeabi_read_tp
#ifdef HAVE_AEABI_READ_TP void __aeabi_read_tp (void); #endif Now I am searching for a library which contains the implementation of this function. The list of symbols of any generated object in buildroot-target does not include the __aeabi_read_tp function. Does someone know how to solve this ...
🌐
Segger
kb.segger.com › Thread-Local_Storage
Thread-Local Storage - SEGGER Knowledge Base
In the case for Arm, there is no dedicated thread pointer. Instead the function __aeabi_read_tp is used by the compiler and required to be implemented.
🌐
GitHub
github.com › Infineon › clib-support › blob › master › README.md
clib-support/README.md at master · Infineon/clib-support
__aeabi_read_tp (FreeRTOS) _reclaim_reent · __iar_system_Mtxinit (FreeRTOS) __iar_system_Mtxlock (FreeRTOS) __iar_system_Mtxunlock (FreeRTOS) __iar_system_Mtxdst (FreeRTOS) __iar_file_Mtxinit · __iar_file_Mtxlock · __iar_file_Mtxunlock · __iar_file_Mtxdst ·
Author   Infineon
🌐
GitHub
github.com › llvm › llvm-project › issues › 31117
arm __aeabi_read_tp call does not honour -mlong-calls · Issue #31117 · llvm/llvm-project
Found while building LLDB 4.0.0 in the FreeBSD/arm base system. It's necessary to build with -mlong-calls, but Clang generates bl calls to __aeabi_read_tp.
Author   llvm
🌐
GitHub
github.com › llvm › llvm-project › issues › 51671
-mtp=cp15 generates __aeabi_read_tp in Thumb2 mode · Issue #51671 · llvm/llvm-project
Bugzilla Link 52329 Resolution FIXED Resolved on Nov 08, 2021 08:41 Version trunk OS Windows NT Blocks #4440 #51489 CC @arndb,@nickdesaulniers,@smithp35,@tstellar Fixed by commit(s) d7e089f ed38280 Extended Description This impacts the L...
Author   llvm
🌐
FreeBSD
lists.freebsd.org › pipermail › freebsd-arm › 2007-February › 000409.html
__aeabi_read_tp missing symbol
February 27, 2007 - > Googling so far has taught me that this is GCC's work and well the > following: > > `-mtp=NAME' > Specify the access model for the thread local storage pointer. > The valid models are `soft', which generates calls to > `__aeabi_read_tp', `cp15', which fetches the thread pointer from > `cp15' ...
🌐
Narkive
gcc.gcc.gnu.narkive.com › cO8Hk5uM › tls-support-on-arm
TLS support on ARM
If you want to use a slow implementation, write an assembly wrapper which saves additional registers. This might be the initial plan. But is this true? Without clobbering the registers r1-r3 the compiler generates something like this: ldr r3,[pc, #48] bl __aeabi_read_tp adds r7, r0, r3 ..
🌐
Openwall
openwall.com › lists › musl › 2017 › 08 › 31 › 9
musl - simplification of __aeabi_read_tp
September 1, 2017 - The interesting point is that neither __a_gettp_cp15() (only one instruction and a return) nor kuser_get_tls (according to the kernel spec) clobber any registers. The only reason for saving the registers is the indirection via the C-function __aeabi_read_tp_c(), where the compiler is allowed to clobber r0-r3.
🌐
GitHub
github.com › llvm › llvm-project › issues › 37742
Feature request: inline `__aeabi_read_tp` for ARMv7a ELF TLS · Issue #37742 · llvm/llvm-project
Bugzilla Link 38394 Version trunk OS Linux Extended Description Clang generates a call to an __aeabi_read_tp function to access an arm32 ELF TLS variable using the Initial-Exec or Local-Exec access models. GCC also generates a call to th...
Author   llvm
🌐
Openwall
openwall.com › lists › musl › 2017 › 08 › 31 › 10
musl - Re: simplification of __aeabi_read_tp
August 31, 2017 - If you look at the commit text for commit 29237f7f5c09c436825a7a12b68ab4143b0ebd1f which added the indirection through C code, one of the goals was to make the arm target fdpic-ready (to support shareable text on cortex-m). > diff --git a/src/thread/arm/__aeabi_read_tp.S > b/src/thread/arm/__aeabi_read_tp.S > new file mode 100644 > index 0000000..897b4f8 > --- /dev/null > +++ b/src/thread/arm/__aeabi_read_tp.S > @@ -0,0 +1,22 @@ > +.syntax unified > +.global __a_gettp_ptr > +.hidden __a_gettp_ptr > +.global __aeabi_read_tp > +.type __aeabi_read_tp,%function > +__aeabi_read_tp: > + > +#if ((__A
🌐
GitHub
github.com › TImada › libaeabi32
GitHub - TImada/libaeabi32: A library to provide __aeabi_* functions for aarch32
__aeabi_read_tp function to make GCC to use TPIDRURW(User Read/Write Thread Pointer ID Register) rather than TPIDRURO
Starred by 2 users
Forked by 2 users
Languages   C 94.3% | HTML 5.4%
🌐
GitHub
github.com › microsoft › checkedc-llvm › blob › master › test › CodeGen › ARM › aeabi-read-tp.ll
checkedc-llvm/test/CodeGen/ARM/aeabi-read-tp.ll at master · microsoft/checkedc-llvm
This is a *deprecated* repo that contains a version of LLVM that was being modified to support Checked C. We have moved to a single (mono) repo setup, following the lead of the LLVM community. See https://github.com/Microsoft/checkedc-clang instead. - checkedc-llvm/test/CodeGen/ARM/aeabi-read-tp....
Author   microsoft
🌐
Narkive
musl.openwall.narkive.com › 8x2d0frf › simplification-of-aeabi-read-tp
[musl] simplification of __aeabi_read_tp
Register saving code that avoids "pop {lr}" (which is not supported by ARMv6-M) and "pop {pc}" (which is not supported by ARMv4T) is very ugly, therefore I took a closer look at its internals and discovered the following: __aeabi_read_tp() calls __aeabi_read_tp_c() which inlines the function __pthread_self().
🌐
OSDev.org
forum.osdev.org › board index › operating system development › os design & theory
OSDev.org • View topic - Is thread local storage a good solution?
March 20, 2020 - OSwhatever wrote:How is it with x86, do you have the possibility to provide a function call in order to obtain the tp pointer for the init exec TLS area? I'm in kind of a luck since I'm working on ARM, there you have the option to either use a HW cp15 register or an ABI function call to __aeabi_read_tp in order to get the pointer.