What is the right way to create a constant pool for inline assembly?

The problem is that I have some inline assembly inside a C function. Something like

  ldr r7, =0xdeadbeef
  svc 0

If a literal pool isn't created explicitly (which is the case here), the assembler creates one at the end of the translation unit. Usually this is fine, but if the translation unit turns out to be really huge, this doesn't work, because the literal pool ends up too far from the ldr instruction (a PC-relative ldr in ARM state can reach only about ±4 KB).

So I wonder what the best way to handle the problem is. The most obvious way is to create a literal pool manually inside the inline assembly:

  ldr r7, =0xdeadbeef
  svc 0
  b 1f
  .ltorg
1:

or

  ldr r7, 1f
  svc 0
  b 2f
1:
  .word 0xdeadbeef
2:

Unfortunately, this leads to suboptimal code because of the redundant branch instruction. I don't expect the assembler to be clever enough to find an appropriate place for the constant pool inside the function. What I would like to do is create a constant pool at the end of the function. Is there any way to tell the compiler (gcc) to create a literal pool at the end of the function?

PS: I ended up using a movw/movt pair instead of constant pools. However, first, the movw/movt solution is slightly less portable than literal pools and, second, I simply wonder whether it is possible to use constant pools in inline assembly both reliably and efficiently.
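For reference, the movw/movt workaround mentioned in the PS might look roughly like the sketch below. The function name magic_via_movw_movt is made up for illustration, and the non-ARM branch is only there so the snippet compiles on other hosts; it is not part of the technique.

```c
#include <stdint.h>

/* Hypothetical helper: builds 0xdeadbeef in a register without a literal pool. */
uint32_t magic_via_movw_movt(void)
{
    uint32_t value;
#if defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7
    /* movw writes the low 16 bits and clears the top half,
       movt then fills in the top 16 bits. */
    __asm__("movw %0, #0xbeef\n\t"
            "movt %0, #0xdead"
            : "=r"(value));
#else
    /* Fallback for targets without movw/movt (assumption: plain C is acceptable). */
    value = 0xdeadbeefu;
#endif
    return value;
}
```

Since both halves are immediates encoded in the instructions themselves, nothing has to be placed within reach of a PC-relative load.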

Update: So, what is the best way to handle the problem?

To force the toolchain to create a constant pool after the function, one can put the function in a separate code section. This works because, at the end of a translation unit, the assembler generates a separate constant pool for each section.

In fact, though, the best way is to avoid loading constants into registers in inline assembly at all. It's better to let the compiler do that. In my case I eventually wrote code similar to
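A minimal sketch of the separate-section approach, assuming GCC on an ELF target. The section name .text.raw_call and the function name load_magic are invented for this example, and the non-ARM branch exists only so the snippet builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical section name; it should still be matched by a .text* pattern
   in the linker script. noinline keeps the asm from migrating into callers
   that live in other sections. */
#define OWN_SECTION __attribute__((section(".text.raw_call"), noinline))

OWN_SECTION uint32_t load_magic(void)
{
#if defined(__arm__)
    uint32_t value;
    /* The literal for this ldr goes into the pool of .text.raw_call,
       which the assembler emits at the end of that section. */
    __asm__ volatile("ldr %0, =0xdeadbeef" : "=r"(value));
    return value;
#else
    /* Non-ARM fallback so the sketch compiles on other hosts. */
    return 0xdeadbeefu;
#endif
}
```

As long as the section holds only this small function, the pool sits right after its code and stays within reach of the ldr.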

register int var asm("r7") = 0xdeadbeef;
asm volatile("svc 0\n" :: "r" (var));

asked Mar 03 '15 13:03

Nikolai

1 Answer

You can use -ffunction-sections, which places each function in its own section. As discussed in the question on -ffunction-sections, pair it with ld --gc-sections to remove unused code.

There is also the obvious option of splitting up the file.

A solution that should work is to use a naked function with an unused annotation, as it is never called. Place a single .ltorg in it and put both functions in a special section, .text.ltorg_kludge for instance. The linker script should use .text*, and functions in identical sub-sections are placed together. In some ways this is like splitting up the file, as the compiler will try to inline static functions.

You may rely on the compiler emitting functions as encountered in the source without a special section. However, I am not sure if this is standard or happenstance. Compilers may optimize better by emitting functions in some DAG ordering of the call hierarchy.

Aside: movw/movt is more efficient due to cache effects. It also works with Thumb-2 code on ARMv6T2 and above. I don't see portability as a big deal (as inline assembler is non-portable and you probably prefer performance over portability), but the question is relevant to ARMv4/5 users.
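The naked-function kludge could be sketched as follows, assuming GCC for ARM. The names load_magic_near_pool and ltorg_holder are hypothetical, and the holder is given external linkage here (rather than being a static function marked unused) so that the compiler does not discard it. The order in which the compiler emits the two functions is not guaranteed, so treat this as an illustration rather than a verified recipe.

```c
#include <stdint.h>

/* Both functions share one special section so they end up next to each other. */
#define KLUDGE_SECTION __attribute__((section(".text.ltorg_kludge")))

KLUDGE_SECTION __attribute__((noinline)) uint32_t load_magic_near_pool(void)
{
#if defined(__arm__)
    uint32_t value;
    /* Needs a literal pool entry somewhere within reach. */
    __asm__ volatile("ldr %0, =0xdeadbeef" : "=r"(value));
    return value;
#else
    /* Non-ARM fallback so the sketch compiles on other hosts. */
    return 0xdeadbeefu;
#endif
}

#if defined(__arm__)
/* Never called: its body is just a place for the assembler to dump
   pending literals. naked suppresses the prologue and epilogue. */
KLUDGE_SECTION __attribute__((naked)) void ltorg_holder(void)
{
    __asm__(".ltorg");
}
#endif
```

Because nothing executes ltorg_holder, no branch around the pool is needed, which is the point of the trick.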

I investigated the use of the R constraint from gcc machine constraints,

R
An item in the constant pool

However, a sample with gcc-4.8 gives the error "impossible constraint". Using alternative letters like C also gives the same error message. Inspection of the source constraints.md seems to indicate that the R constraint is a documentation-only feature. Unfortunate, as it sounds purpose-built to solve this issue.

It is possible to have the compiler load the value, but this may be suboptimal depending on the inline assembler. For example,

  unsigned x = 0xdeadbeef;  /* "+r" needs an lvalue, not a constant */
  asm(" add %0, %0, %1\n" : "+r" (x) : "r" (0xbaddeed0));

answered Oct 25 '22 13:10

artless noise

Tags: c, gcc, arm, inline-assembly
