When using 64-bit Steel Bank Common Lisp on Windows for a trivial identity function:
(defun a (x)
  (declare (fixnum x))
  (declare (optimize (speed 3) (safety 0)))
  (the fixnum x))
I find the disassembly is given as:
* (disassemble 'a)
; disassembly for A
; Size: 13 bytes
; 02D7DFA6: 84042500000F20 TEST AL, [#x200F0000] ; safepoint
; no-arg-parsing entry point
; AD: 488BE5 MOV RSP, RBP
; B0: F8 CLC
; B1: 5D POP RBP
; B2: C3 RET
I understand that the lines:
mov rsp, rbp
pop rbp
ret
perform standard return from function operations, but I don't understand why there are the lines:
TEST AL, [#x200F0000] // My understanding is that this sets flags based on the bitwise AND of AL and the contents of memory at 0x200F0000
and
CLC // My understanding is that this clears the carry flag.
As the disassembler hints, the TEST instruction is a safepoint. It's used for synchronizing threads for the garbage collector. Safepoints are inserted in places where the compiler knows the thread is in a safe state for garbage collection to occur.
The form of the safepoint is defined in compiler/x86-64/macros.lisp:
#!+sb-safepoint
(defun emit-safepoint ()
  (inst test al-tn (make-ea :byte :disp sb!vm::gc-safepoint-page-addr)))
You are of course correct that the result of the operation is not used. In this case, SBCL is interested only in a side effect of the operation: if the page containing the address happens to be protected, the instruction generates a page fault. If the page is accessible, the instruction just wastes a very small amount of time. I should point out that this is probably much faster than explicitly checking a global variable.
On Windows, the C functions map_gc_page and unmap_gc_page in runtime/win32-os.c are used to map and unmap the page:
void map_gc_page()
{
    DWORD oldProt;
    AVER(VirtualProtect((void*) GC_SAFEPOINT_PAGE_ADDR, sizeof(lispobj),
                        PAGE_READWRITE, &oldProt));
}
void unmap_gc_page()
{
    DWORD oldProt;
    AVER(VirtualProtect((void*) GC_SAFEPOINT_PAGE_ADDR, sizeof(lispobj),
                        PAGE_NOACCESS, &oldProt));
}
Unfortunately I haven't been able to track down the page fault handler, but the general idea seems to be that when a collection is needed, unmap_gc_page will be called. Each thread will continue running until it hits one of these safepoints, at which point a page fault occurs. Presumably the page fault handler then pauses that thread; once all threads have been paused, garbage collection runs, map_gc_page is called again, and the threads are allowed to resume.
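To make the general idea concrete, here is a minimal POSIX sketch of a page-protection safepoint. It is not SBCL's code: it assumes a Unix-like system with mmap, mprotect and sigaction rather than the Windows VirtualProtect calls shown above, and every name in it (poll_page, safepoint_init, safepoint_arm, safepoint_poll, safepoint_hit_count) is made up for illustration. The handler only counts the hit and reopens the page, where a real runtime would presumably park the thread until the collector finishes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE                 /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* All names here are hypothetical; none of them come from the SBCL sources. */
static volatile unsigned char *poll_page;   /* plays the role of GC_SAFEPOINT_PAGE_ADDR */
static size_t poll_page_size;
static volatile sig_atomic_t safepoint_hits;

/* Runs when a thread polls while the page is protected. */
static void on_fault(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)poll_page;
    if (addr < base || addr >= base + poll_page_size)
        _exit(1);                   /* a genuine crash, not a safepoint */
    safepoint_hits++;
    /* A real runtime would pause the thread here until the GC is done.
       This sketch just reopens the page so the faulting load is retried. */
    mprotect((void *)poll_page, poll_page_size, PROT_READ);
}

/* Map the polling page (readable) and install the fault handler. */
int safepoint_init(void)
{
    poll_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, poll_page_size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    poll_page = (volatile unsigned char *)p;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0)  /* some systems report SIGBUS instead */
        return -1;
    return 0;
}

/* Counterpart of unmap_gc_page: make the next poll fault. */
void safepoint_arm(void)
{
    int rc = mprotect((void *)poll_page, poll_page_size, PROT_NONE);
    assert(rc == 0);
    (void)rc;
}

/* Counterpart of TEST AL, [page]: a one-byte load whose result is discarded. */
void safepoint_poll(void)
{
    unsigned char ignored = *poll_page;
    (void)ignored;
}

int safepoint_hit_count(void)
{
    return (int)safepoint_hits;
}
```

Calling safepoint_poll() before safepoint_arm() costs a single load. After safepoint_arm(), the next poll traps into on_fault exactly once, because the handler makes the page readable again before the load is retried. Calling mprotect from a signal handler works on common systems but is not on the POSIX list of async-signal-safe functions, so treat that part as a simplification.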
The credits file credits Anton Kovalenko with introducing this mechanism.
On Linux and Mac OS X, a different synchronization mechanism is used by default, which is why the instruction isn't generated on default builds for those platforms. (I'm not sure if the PowerPC ports use safepoints by default, but obviously they don't use x86 instructions).
On the other hand, I have no idea about the CLC instruction.
I know nothing about TEST AL, [#x200F0000], but I believe that CLC is for functions that return one value. The SBCL Internals Manual, in "Unknown-Values Returns", suggests that functions set the carry flag if they return multiple values, or clear the carry flag if they return one value.
I am running SBCL 1.1.14 on OpenBSD and x86-64. I can see CLC and STC if I disassemble a function that returns one value and a function that returns multiple values:
CL-USER> (disassemble (lambda () 100))
; disassembly for (LAMBDA ())
; Size: 16 bytes
; 04B36F64: BAC8000000 MOV EDX, 200 ; no-arg-parsing entry point
; 69: 488BE5 MOV RSP, RBP
; 6C: F8 CLC
; 6D: 5D POP RBP
; 6E: C3 RET
; 6F: CC0A BREAK 10 ; error trap
; 71: 02 BYTE #X02
; 72: 19 BYTE #X19 ; INVALID-ARG-COUNT-ERROR
; 73: 9A BYTE #X9A ; RCX
NIL
This one has CLC (clear carry) because it returns one value.
CL-USER> (disassemble (lambda () (values 100 200)))
; disassembly for (LAMBDA ())
; Size: 35 bytes
; 04B82BD4: BAC8000000 MOV EDX, 200 ; no-arg-parsing entry point
; D9: BF90010000 MOV EDI, 400
; DE: 488D5D10 LEA RBX, [RBP+16]
; E2: B904000000 MOV ECX, 4
; E7: BE17001020 MOV ESI, 537919511
; EC: F9 STC
; ED: 488BE5 MOV RSP, RBP
; F0: 5D POP RBP
; F1: C3 RET
; F2: CC0A BREAK 10 ; error trap
; F4: 02 BYTE #X02
; F5: 19 BYTE #X19 ; INVALID-ARG-COUNT-ERROR
; F6: 9A BYTE #X9A ; RCX
NIL
This one has STC (set carry) because it returns two values.
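As a rough illustration of why a caller would care about that flag, here is a small C model of the convention as I read it from the manual. It is only a sketch: lisp_return, return_one, return_many, received_count and primary_value are hypothetical names, the carry field stands in for the CPU carry flag, and the fixed four-slot array stands in for wherever the real values are passed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a function return; not an SBCL data structure. */
typedef struct {
    bool carry;        /* false (CLC): exactly one value; true (STC): `count` values */
    size_t count;      /* only meaningful when carry is set */
    long values[4];    /* stand-in for the locations that hold the values */
} lisp_return;

/* Single-value return: clear the flag, put the value in the first slot. */
lisp_return return_one(long v)
{
    lisp_return r = { .carry = false, .count = 0, .values = { v } };
    return r;
}

/* Multiple-value return: set the flag and record how many values follow.
   The model keeps at most four values. */
lisp_return return_many(const long *vs, size_t n)
{
    lisp_return r = { .carry = true, .count = n > 4 ? 4 : n, .values = { 0 } };
    for (size_t i = 0; i < r.count; i++)
        r.values[i] = vs[i];
    return r;
}

/* A caller that wants every value checks the flag before trusting `count`. */
size_t received_count(lisp_return r)
{
    return r.carry ? r.count : 1;
}

/* A caller that wants only the primary value reads the first slot. */
long primary_value(lisp_return r)
{
    return r.values[0];
}
```

In this model (lambda () 100) corresponds to return_one(100) and (lambda () (values 100 200)) to return_many with two values. A caller that only wants one value never has to inspect the count, which is presumably the point of signalling the common single-value case with a cheap flag.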