When is CLREX actually needed on ARM Cortex M7?

I found a couple of places online stating that CLREX "must" be executed whenever an interrupt routine is entered, which I don't understand. The docs for CLREX state (numbering added for easier reference):

(1) Clears the local record of the executing processor that an address has had a request for an exclusive access.

(2) Use the CLREX instruction to return a closely-coupled exclusive access monitor to its open-access state. This removes the requirement for a dummy store to memory.

(3) It is implementation-defined whether CLREX also clears the global record of the executing processor that an address has had a request for an exclusive access.

I don't understand pretty much anything here.

I had the impression that writing something along the lines of the example in the docs was enough to guarantee atomicity:

    MOV r1, #0x1                ; load the ‘lock taken’ value
try:                            ;                          <---\
    LDREX r0, [LockAddr]        ; load the lock value          |
    CMP r0, #0                  ; is the lock free?            |
    STREXEQ r0, r1, [LockAddr]  ; try and claim the lock       |
    CMPEQ r0, #0                ; did this succeed?            |
    BNE try                     ; no - try again   ------------/
    ....                        ; yes - we have the lock

1. Why should the "local record" need to be cleared? I thought that LDREX/STREX were enough to guarantee atomic access to an address from several interrupts. For example, GCC for ARM compiles all C11 atomic functions using LDREX/STREX, and I don't see CLREX being emitted anywhere.
2. What "requirement for a dummy store" is the second paragraph referring to?
3. What is the difference between the global record and the local record? Is the global record needed for multi-core scenarios?
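
For comparison, here is a minimal C11 sketch of the same lock loop. The names `spin_lock_t`, `spin_try_lock`, `spin_lock` and `spin_unlock` are made up for illustration. The assumption, taken from the GCC observation above, is that the compare-exchange is lowered to an LDREX/STREX retry loop on ARMv7-M targets, with no CLREX involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, 1 = taken (same encoding as the assembly). */
typedef struct { atomic_int state; } spin_lock_t;

/* Try once to claim the lock; returns true if this caller now owns it. */
bool spin_try_lock(spin_lock_t *l)
{
    int expected = 0;  /* only claim the lock if it is currently free */
    return atomic_compare_exchange_strong_explicit(
        &l->state, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock is claimed (the "BNE try" loop). */
void spin_lock(spin_lock_t *l)
{
    while (!spin_try_lock(l)) {
        /* lock was taken, or the exclusive store failed: try again */
    }
}

/* Release the lock with a plain release store; no exclusive access needed. */
void spin_unlock(spin_lock_t *l)
{
    atomic_store_explicit(&l->state, 0, memory_order_release);
}
```

A lock declared as `spin_lock_t l = {0};` starts out free; `spin_try_lock(&l)` then succeeds once and fails until `spin_unlock(&l)` is called.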

349

asked Jul 03 '18 20:07

Lou

2 Answers

Taking (and paraphrasing) your three questions separately:

1. Why clear the access record?

When strict nesting of code is enforced, such as when you're working with interrupts, then CLREX is not usually required. However, there are cases where it's important. Imagine you're writing a context switch for a preemptive operating system kernel, which can asynchronously suspend a running task and resume another. Now consider the following pathological situation, involving two tasks of equal priority (A and B) manipulating the same shared resource using LDREX and STREX:

Task A      Task B
  ...
 LDREX
-------------------- context switch
             LDREX
             STREX   (succeeds)
              ...
             LDREX
-------------------- context switch
 STREX               (succeeds, and should not)
  ...

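Task A's STREX succeeds because Task B's final LDREX left the local monitor in the exclusive state, even though the value Task A loaded is now stale. The context switch must therefore issue a CLREX to avoid this.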
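
The interleaving above can be replayed on a host machine with a toy model of the local monitor. This is only a sketch under stated assumptions: the monitor is modelled as a single flag with no address tracking, and `monitor_t`, `sim_ldrex`, `sim_strex`, `sim_clrex` and `replay_interleaving` are hypothetical names, not real intrinsics. Task A tries to add 10 to a shared counter and Task B adds 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one core's local exclusive monitor (not real hardware):
   a single flag, no address tracking. */
typedef struct { bool exclusive; } monitor_t;

uint32_t sim_ldrex(monitor_t *m, const uint32_t *addr)
{
    m->exclusive = true;              /* monitor enters the exclusive state */
    return *addr;
}

/* Returns 0 on success and 1 on failure, like the STREX status value. */
int sim_strex(monitor_t *m, uint32_t *addr, uint32_t value)
{
    bool ok = m->exclusive;
    m->exclusive = false;             /* monitor returns to open either way */
    if (ok) {
        *addr = value;
    }
    return ok ? 0 : 1;
}

void sim_clrex(monitor_t *m)
{
    m->exclusive = false;             /* back to open, no memory access */
}

/* Replays the Task A / Task B interleaving on *counter (reset to 0 first).
   Returns the status of Task A's final STREX. */
int replay_interleaving(bool clrex_on_switch, uint32_t *counter)
{
    monitor_t mon = { false };
    *counter = 0;

    uint32_t a = sim_ldrex(&mon, counter);      /* Task A: LDREX            */
    if (clrex_on_switch) sim_clrex(&mon);       /* context switch A -> B    */

    uint32_t b = sim_ldrex(&mon, counter);      /* Task B: LDREX            */
    (void)sim_strex(&mon, counter, b + 1);      /* Task B: STREX (succeeds) */
    (void)sim_ldrex(&mon, counter);             /* Task B: second LDREX     */
    if (clrex_on_switch) sim_clrex(&mon);       /* context switch B -> A    */

    return sim_strex(&mon, counter, a + 10);    /* Task A: STREX, stale value */
}
```

In this model, without the CLREX Task A's store reports success and overwrites Task B's update, so the counter ends at 10 instead of 11. With the CLREX the store reports failure, the counter stays at 1, and Task A knows it has to retry its LDREX/STREX sequence.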

2. What 'requirement for a dummy store' is avoided?

If there were no CLREX instruction, the only way to relinquish the exclusive-access flag would be a dummy STREX. That involves a memory transaction, so it is slower than necessary when all you want to do is clear the flag.

3. Is the 'global record' for multi-core scenarios?

Yes. If you're using a single-core machine, there's only one record because there's only one CPU.

164

answered Dec 06 '22 18:12

cooperised

Actually, CLREX isn't needed for exceptions/interrupts on the M7; it appears to be included only for compatibility reasons. From the documentation (Version c):

CLREX enables compatibility with other ARM Cortex processors that have to force the failure of the store exclusive if the exception occurs between a load exclusive instruction and the matching store exclusive instruction in a synchronization operation. In Cortex-M processors, the local exclusive access monitor clears automatically on an exception boundary, so exception handlers using CLREX are optional.

So, since Cortex-M processors clear the local exclusive access flag on exception/interrupt entry/exit, this negates most (all?) of the use cases for CLREX.

With regard to your third question, as others have mentioned, you are correct in thinking that the global record is used in multi-core scenarios. There may still be use cases for CLREX on multi-core processors, depending on the implementation-defined effects on the local and global flags.

I can see why there is confusion around this, as the initial version of the M7 documentation doesn't include these sentences (not to mention the various other versions of more generic documentation on the ARM website). Even now, I cannot link directly to the latest revision: the page displays 'Version a' by default, and you have to change the version manually via a drop-down box (hopefully this will change in future).

Update

In response to comments, here is an additional documentation reference. This part of the manual describes the usage of these instructions outside of the specific instruction documentation, and it has been there since the first revision:

The processor removes its exclusive access tag if:

It executes a CLREX instruction.

It executes a STREX instruction, regardless of whether the write succeeds.

An exception occurs. This means the processor can resolve semaphore conflicts between different threads.

In a multiprocessor implementation:

Executing a CLREX instruction removes only the local exclusive access tag for the processor.

Executing a STREX instruction, or an exception, removes the local exclusive access tags for the processor.

Executing a STREX instruction to a Shareable memory region can also remove the global exclusive access tags for the processor in the system.
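
The single-processor rules quoted above can be sketched as a small host-side model. Everything here is an assumption made for illustration: `core_t` and the `core_*` functions are hypothetical, the tag is a single flag, and the "exception" is just a function call that drops the tag and lets a handler modify the shared word. The point is that an LDREX/STREX retry loop recovers on its own, without any CLREX in the handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-core model: one exclusive access tag. */
typedef struct { bool tagged; } core_t;

uint32_t core_ldrex(core_t *c, const uint32_t *p)
{
    c->tagged = true;
    return *p;
}

/* Rule: CLREX removes the tag. */
void core_clrex(core_t *c)
{
    c->tagged = false;
}

/* Rule: STREX removes the tag whether or not the write succeeds.
   Returns 0 on success, 1 on failure. */
int core_strex(core_t *c, uint32_t *p, uint32_t v)
{
    bool ok = c->tagged;
    c->tagged = false;
    if (ok) {
        *p = v;
    }
    return ok ? 0 : 1;
}

/* Rule: an exception removes the tag. The modelled handler also
   updates the shared word, which is what makes the stale store unsafe. */
void core_exception(core_t *c, uint32_t *shared)
{
    c->tagged = false;
    *shared += 100;
}

/* Thread-level atomic increment. An exception is injected between LDREX
   and STREX on the first attempt only. Returns the number of attempts. */
int increment_with_one_exception(core_t *c, uint32_t *shared)
{
    int attempts = 0;
    bool inject = true;
    for (;;) {
        attempts++;
        uint32_t v = core_ldrex(c, shared);
        if (inject) {
            core_exception(c, shared);   /* tag is dropped here */
            inject = false;
        }
        if (core_strex(c, shared, v + 1) == 0) {
            return attempts;             /* store accepted */
        }
        /* store rejected: reload and try again */
    }
}
```

Starting from a shared value of 0, the first STREX is rejected because the exception dropped the tag, and the second attempt stores 101, so both the handler's update and the thread's increment survive.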

answered Dec 06 '22 19:12

Graeme

Tags: atomic, arm, cortex-m, load-link-store-conditional
