I'm trying to disable/enable cache in Linux kernel space.
The code I use is
__asm__ __volatile__(
"pushw %eax\n\t" /*line 646*/
"movl %cr0,%eax\n\t"
"orl $0x40000000,%eax\n\t"
"movl %eax,%cr0\n\t"
"wbinvd\n\t"
"pop %eax");
When I compile, I get the following error messages:
memory.c: Assembler messages:
memory.c:645: Error: operand type mismatch for `push'
memory.c:646: Error: unsupported for `mov'
memory.c:648: Error: unsupported for `mov'
memory.c:650: Error: operand type mismatch for `pop'
make[4]: *** [memory.o] Error 1
My machine is Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz. 64bit machine.
Could anyone help me point out which part is incorrect and how I can fix it?
I'm guessing it's because of a mismatch between the instruction and the register, but I'm confused about how to fix it. :(
Thanks in advance!
Although most 32-bit registers persist on 64-bit architectures, they can no longer be pushed to or popped from the stack: in 64-bit mode, push and pop have no 32-bit operand-size form. Therefore, trying to push or pop %eax is an illegal operation. So if you wish to play with the stack, you must use %rax, which is the 64-bit equivalent of %eax.
The correct approach is to declare a clobber on %eax instead of saving/restoring it yourself. The compiler can probably do something more efficient than push/pop, like using different registers for any values that it wants to stay live. This also means you don't need different code for 64-bit to save/restore %rax instead.
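As a rough user-space sketch of the clobber idea (the CR0 code itself needs ring 0, so it can't be run directly), the hypothetical function add_via_eax below hard-codes %eax inside the asm and lists it as clobbered rather than pushing and popping it. The function name and the arithmetic are made up for illustration, and the non-x86 fallback is only there so the file still builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical example: adds two numbers using a hard-coded %eax scratch register. */
uint32_t add_via_eax(uint32_t a, uint32_t b)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t result;
    __asm__ volatile(
        "movl %[a], %%eax\n\t"      /* scratch = a */
        "addl %[b], %%eax\n\t"      /* scratch += b */
        "movl %%eax, %[result]"     /* copy scratch to the output operand */
        : [result] "=r" (result)
        : [a] "r" (a), [b] "r" (b)
        : "eax", "cc");             /* tell the compiler %eax and the flags are overwritten */
    return result;
#else
    return a + b;                   /* fallback so the sketch builds on non-x86 targets */
#endif
}
```

Because %eax is in the clobber list, the compiler keeps its own live values out of that register (or saves them itself), so no push/pop is needed and the same source builds with or without -m32.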
Note that pushq %rax / popq %rax would not be safe in user-space code on x86-64. There's no way to tell gcc that inline-asm clobbers the red-zone. It would be safe in kernel code, where the ABI doesn't use a red-zone, but again, it's still defeating the purpose of GNU C inline asm syntax.
There's an additional wrinkle here: mov %cr0, %eax isn't a valid 64-bit instruction. You have to use a 64-bit register.
Letting the compiler pick a register for us solves this problem, and gives the compiler more freedom, so it's better anyway. Declare a C variable with a type that's 64-bit in the x86-64 ABI and 32-bit in the i386 ABI, e.g. long, since this is for the Linux kernel ABI, not Windows, where long is always 32-bit. uintptr_t is another option that would work in the Linux kernel (but not in user space: x32 is long mode with 32-bit pointers).
// Sets CR0.CD (bit 30), which disables caching
void set_caching_x86(void) {
    long tmp; // mov to/from cr requires a 64bit reg in 64bit mode
    asm volatile(
        "mov %%cr0, %[tmp]\n\t" // Note the double-% when we want a literal % in the asm output
        "or $0x40000000, %[tmp]\n\t"
        "mov %[tmp], %%cr0\n\t"
        "wbinvd\n\t"
        : [tmp] "=r" (tmp) // outputs
        : // no inputs
        : // no clobbers. "memory" clobber isn't needed, this just affects performance, not contents
    );
}
This compiles and assembles to what we want, with or without -m32, as you can see on the Godbolt Compiler Explorer.
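If you want to sanity-check the type-width assumption behind long, a small sketch like this reports how wide long and pointers are for whatever ABI you compile it with. The helper names are hypothetical; on Linux I'd expect long to come out as 32 bits with -m32 and 64 bits without, but that assumes a typical GCC setup.

```c
#include <limits.h>

/* Hypothetical helper: width of `long` in bits for the ABI being compiled. */
unsigned long_width_bits(void)
{
    return (unsigned)(sizeof(long) * CHAR_BIT);
}

/* Hypothetical helper: width of a data pointer in bits. */
unsigned pointer_width_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}
```

If long_width_bits() doesn't match the register width of the mode you're targeting, the "r" operand won't be a full-width register and the mov to/from %cr0 won't assemble.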
When writing by hand, it's easier to let the operand-size be implied by the operands, instead of always using a suffix on the mnemonic. i.e. push %eax would have worked (in 32-bit mode), but would still have been worse than letting the compiler take care of it.
We could have used %k[tmp] to get %eax (or whatever) even in 64-bit mode, but that would zero out the upper 32 bits. Spending 1 byte on a REX prefix for the or instruction is worth it to be more future-proof for CPUs that might care what you write to the upper 32 bits of a control register.
The volatile makes sure the asm statement isn't optimized away, even if the output value is never used.
There are several problems with your inline assembly statement, most of which are indicated by the error messages.
The first error message, Error: operand type mismatch for `push', corresponds to the pushw %eax instruction. The error is a result of the fact that the operand size suffix you used, w, doesn't match the actual size of the operand, %eax. You've told it to use the instruction for pushing a 16-bit value on the stack but provided a 32-bit register as an operand. You could fix that by using pushw %ax, but that's not what you want. It would preserve only the lower 16 bits of the RAX register, not the entire register.
Another "obvious" fix would be to use pushl %eax, but there are two problems with that. First, in order to fix other problems you need to modify the entire RAX register, and that means you need to preserve all 64 bits of it, not just the lower 32 bits. The second is that there is no 32-bit PUSH instruction in 64-bit mode, so you're forced to use pushq %rax regardless.
The next two error messages are both Error: unsupported for `mov'. These error messages correspond to the movl %cr0,%eax and movl %eax,%cr0 instructions, and both are a result of the same problem. In 64-bit mode there's no 32-bit operand size version of these instructions. You need to use a 64-bit operand, so the fix is simply to use RAX instead of EAX. This is where all 64 bits of RAX get clobbered and why I said you needed to preserve the entire register.
The last error message is Error: operand type mismatch for `pop'. This is a result of a similar problem as the first. In this case you haven't used an operand size suffix, which means that the assembler will try to determine the operand size based on the operands. Since you've used a 32-bit operand, %eax, it uses a 32-bit operand size. However, just like with PUSH, there's no 32-bit POP instruction in 64-bit mode, so you can't use %eax either. In any case, since the PUSH instruction needs to be 64-bit, the POP instruction needs to be 64-bit to match, so the fix is to use popq %rax.
Finally, one problem that isn't indicated by an error message is that in 64-bit mode the size of CR0 is extended to 64 bits. While the extra 32 bits are currently reserved and must be set to zero, they could be defined in future processors. So the orl $0x40000000,%eax instruction should preserve the upper 32 bits. Unfortunately it doesn't: it clears the upper 32 bits of RAX, meaning that this instruction would also unintentionally clear any of those bits that future CPUs might give meaning to. So it should be replaced with orq $0x40000000,%rax.
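To make the difference concrete, here is a minimal plain-C model of the two OR variants. It doesn't touch CR0 at all; the function names are hypothetical, and any input with an upper bit set is a made-up value, since no current CPU defines those bits.

```c
#include <stdint.h>

#define CR0_CD UINT64_C(0x40000000)  /* bit 30, the constant used in the question */

/* Models orl $0x40000000,%eax: the 32-bit result is zero-extended into RAX. */
uint64_t set_cd_32bit(uint64_t cr0)
{
    uint32_t low = (uint32_t)cr0 | (uint32_t)CR0_CD;
    return (uint64_t)low;            /* upper 32 bits are lost */
}

/* Models orq $0x40000000,%rax: all 64 bits take part in the OR. */
uint64_t set_cd_64bit(uint64_t cr0)
{
    return cr0 | CR0_CD;
}
```

With a made-up input that has bit 32 set, set_cd_32bit drops that bit while set_cd_64bit keeps it, which is the behaviour the orq version is meant to guarantee.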
So the fixed sequence of instructions would be:
pushq %rax
movq %cr0, %rax
orq $0x40000000, %rax
movq %rax, %cr0
wbinvd
popq %rax
This isn't what I'm going to suggest using in your inline assembly however. It's possible to simplify it by letting GCC pick the register used. This way there's no need to preserve it. Here's what I would suggest instead:
long long dummy;
asm volatile ("movq %%cr0, %0\n\t"
"orq $0x40000000, %0\n\t"
"movq %0, %%cr0\n\t"
"wbinvd"
: "=r" (dummy) : :);
According to Intel (http://download.intel.com/products/processor/manual/325383.pdf), a word is 16 bits, so pushw expects a 16-bit operand. The register eax is 32 bits and, in 32-bit code, must be pushed using pushl. Edit: Are you assembling for 32-bit or 64-bit?