
Understanding the SBCL entry/exit assembly boilerplate code

BACKGROUND

I am using 64-bit Steel Bank Common Lisp on Windows. For a trivial identity function:

(defun a (x)
  (declare (fixnum x))
  (declare (optimize (speed 3) (safety 0)))
  (the fixnum x))

I find the disassembly is given as:

* (disassemble 'a)

; disassembly for A
; Size: 13 bytes
; 02D7DFA6:       84042500000F20   TEST AL, [#x200F0000]      ; safepoint
;                                                             ; no-arg-parsing entry point
;       AD:       488BE5           MOV RSP, RBP
;       B0:       F8               CLC
;       B1:       5D               POP RBP
;       B2:       C3               RET

I understand that the lines:

mov rsp, rbp
pop rbp
ret  

perform standard return from function operations, but I don't understand why there are the lines:

TEST AL, [#x200F0000]  // My understanding is that this sets flags based on the bitwise AND of AL and the contents of memory at 0x200F0000

and

CLC // My understanding is that this clears the carry flag.

QUESTIONS

  1. Why does SBCL generate a test instruction, but never use the flags?
  2. Why does SBCL clear the carry flag before returning from a function?
Peter de Rivaz asked Feb 18 '14 16:02


2 Answers

As the disassembler hints, the TEST instruction is a safepoint. It's used for synchronizing threads for the garbage collector. Safepoints are inserted in places where the compiler knows the thread is in a safe state for garbage collection to occur.

The form of the safepoint is defined in compiler/x86-64/macros.lisp:

#!+sb-safepoint
(defun emit-safepoint ()
  (inst test al-tn (make-ea :byte :disp sb!vm::gc-safepoint-page-addr)))

You are of course correct that the result of the operation is not used. SBCL is interested in a side effect of the instruction: if the page containing the address is protected, the read generates a page fault; if the page is accessible, the instruction just wastes a very small amount of time. I should point out that this is probably much, much faster than explicitly checking a global variable, since the common path needs no compare or branch.

On Windows, the C functions map_gc_page and unmap_gc_page in runtime/win32-os.c are used to map and unmap the page:

void map_gc_page()
{
    DWORD oldProt;
    AVER(VirtualProtect((void*) GC_SAFEPOINT_PAGE_ADDR, sizeof(lispobj),
                        PAGE_READWRITE, &oldProt));
}

void unmap_gc_page()
{
    DWORD oldProt;
    AVER(VirtualProtect((void*) GC_SAFEPOINT_PAGE_ADDR, sizeof(lispobj),
                        PAGE_NOACCESS, &oldProt));
}

Unfortunately I haven't been able to track down the page fault handler, but the general idea seems to be this: when a collection is needed, unmap_gc_page is called. Each thread continues running until it hits one of these safepoints, at which point a page fault occurs. Presumably the page fault handler then pauses that thread. Once all threads have been paused, garbage collection runs, map_gc_page is called again, and the threads are allowed to resume.
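The polling idea can be tried outside SBCL. What follows is only a minimal single-threaded sketch under stated assumptions, not SBCL's runtime code: it uses POSIX mmap/mprotect and a SIGSEGV handler instead of VirtualProtect and a Windows exception handler, and every name in it (poll_page, on_fault, request_stop, run_demo) is made up for illustration. It also calls mprotect from inside a signal handler, which works on common systems but is not on POSIX's list of async-signal-safe functions.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names throughout; this is not SBCL's runtime code. */
static volatile unsigned char *poll_page;   /* stands in for GC_SAFEPOINT_PAGE_ADDR */
static size_t poll_page_size;
static volatile sig_atomic_t stops_seen;    /* how many safepoints trapped */

/* Fault handler. A real runtime would park the thread here until the
   collection is over. The sketch only counts the stop and makes the page
   readable again, so the faulting read is retried and succeeds. */
static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    if ((unsigned char *)info->si_addr != (unsigned char *)poll_page)
        _exit(2);                           /* unrelated crash: give up */
    stops_seen++;
    mprotect((void *)poll_page, poll_page_size, PROT_READ);
}

/* The safepoint itself: one read whose value is thrown away,
   like TEST AL, [page] in the disassembly. */
static void safepoint(void)
{
    (void)*poll_page;
}

/* Maps the polling page and installs the fault handler. Returns 0 on success. */
int setup_safepoints(void)
{
    struct sigaction sa;
    long sz = sysconf(_SC_PAGESIZE);
    void *p;

    if (sz <= 0)
        return -1;
    poll_page_size = (size_t)sz;
    p = mmap(NULL, poll_page_size, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    poll_page = p;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0)  /* some systems report SIGBUS */
        return -1;
    return 0;
}

/* Collector side: ask running code to stop at its next safepoint. */
static int request_stop(void)
{
    return mprotect((void *)poll_page, poll_page_size, PROT_NONE);
}

/* Executes `polls` safepoints and requests a stop just before poll number
   `stop_at` (pass -1 for no request). Returns the number of stops seen. */
int run_demo(int polls, int stop_at)
{
    int i;

    if (poll_page == NULL && setup_safepoints() != 0)
        return -1;
    stops_seen = 0;
    for (i = 0; i < polls; i++) {
        if (i == stop_at)
            request_stop();
        safepoint();
    }
    return (int)stops_seen;
}
```

In this sketch the polls before the request cost one load each, and only the poll that follows request_stop traps. A real collector would keep the thread parked inside the handler instead of re-enabling the page straight away.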

The credits file credits Anton Kovalenko with introducing this mechanism.

On Linux and Mac OS X, a different synchronization mechanism is used by default, which is why the instruction isn't generated on default builds for those platforms. (I'm not sure whether the PowerPC ports use safepoints by default, but obviously they don't use x86 instructions.)

On the other hand, I have no idea about the CLC instruction.

Samuel Edwin Ward answered Oct 24 '22 07:10


I know nothing about TEST AL, [#x200F0000], but I believe that CLC is for functions that return one value. SBCL Internals Manual, "Unknown-Values Returns", suggests that functions set the carry flag if they return multiple values, or clear the carry flag if they return one value.
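To make the convention concrete, here is a small model of it in C. This is only a sketch of the idea as described above, not how SBCL is implemented: the real signal is the CPU carry flag, whereas the model uses a struct field, and all names (lisp_return, return_one, return_many, received_count, primary_value) are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an unknown-values return. In compiled code the
   "carry" field is the CPU carry flag, not a struct member. */
typedef struct {
    bool carry;          /* false: exactly one value (CLC); true: several (STC) */
    long first;          /* the single value, or the first of several */
    size_t count;        /* only meaningful when carry is set */
    const long *values;  /* only meaningful when carry is set */
} lisp_return;

/* Callee returning one value: clear carry, fill in just the value. */
static lisp_return return_one(long v)
{
    lisp_return r = { false, v, 0, NULL };
    return r;
}

/* Callee returning several values: set carry, pass a count and the values.
   With zero values, 0 stands in for the default primary value. */
static lisp_return return_many(const long *vals, size_t n)
{
    lisp_return r = { true, n > 0 ? vals[0] : 0, n, vals };
    return r;
}

/* Caller that accepts any number of values: one branch on the flag tells
   it whether the count field is worth reading at all. */
size_t received_count(lisp_return r)
{
    return r.carry ? r.count : 1;
}

/* Caller that only wants the primary value. */
long primary_value(lisp_return r)
{
    return r.first;
}

/* Stand-ins for (lambda () 100) and (lambda () (values 100 200)). */
lisp_return single_100(void)
{
    return return_one(100);
}

lisp_return pair_100_200(void)
{
    static const long pair[] = { 100, 200 };
    return return_many(pair, 2);
}
```

The point of the flag is that a function returning one value pays a single one-byte instruction, and a caller that can receive several values needs only one conditional jump to tell the two cases apart.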

I am running SBCL 1.1.14 with OpenBSD and x86-64. I can see CLC and STC if I disassemble a function that returns one value, and a function that returns multiple values:

CL-USER> (disassemble (lambda () 100))
; disassembly for (LAMBDA ())
; Size: 16 bytes
; 04B36F64:       BAC8000000       MOV EDX, 200               ; no-arg-parsing entry point
;       69:       488BE5           MOV RSP, RBP
;       6C:       F8               CLC
;       6D:       5D               POP RBP
;       6E:       C3               RET
;       6F:       CC0A             BREAK 10                   ; error trap
;       71:       02               BYTE #X02
;       72:       19               BYTE #X19                  ; INVALID-ARG-COUNT-ERROR
;       73:       9A               BYTE #X9A                  ; RCX
NIL

This one has CLC (clear carry) because it returns one value.

CL-USER> (disassemble (lambda () (values 100 200)))
; disassembly for (LAMBDA ())
; Size: 35 bytes
; 04B82BD4:       BAC8000000       MOV EDX, 200               ; no-arg-parsing entry point
;       D9:       BF90010000       MOV EDI, 400
;       DE:       488D5D10         LEA RBX, [RBP+16]
;       E2:       B904000000       MOV ECX, 4
;       E7:       BE17001020       MOV ESI, 537919511
;       EC:       F9               STC
;       ED:       488BE5           MOV RSP, RBP
;       F0:       5D               POP RBP
;       F1:       C3               RET
;       F2:       CC0A             BREAK 10                   ; error trap
;       F4:       02               BYTE #X02
;       F5:       19               BYTE #X19                  ; INVALID-ARG-COUNT-ERROR
;       F6:       9A               BYTE #X9A                  ; RCX
NIL

This one has STC (set carry) because it returns two values.
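A side note on the constants in the two listings: 100 is loaded as 200, 200 as 400, and the MOV ECX, 4 looks like a value count of 2. That is consistent with fixnums being stored shifted left by one bit. The sketch below assumes exactly that encoding, inferred only from these listings; the helper names are invented and nothing here is taken from SBCL's source.

```c
#include <stdint.h>

/* Assumption: fixnums carry one low tag bit, and that bit is 0. */
enum { FIXNUM_TAG_BITS = 1 };

/* Machine word that would hold the fixnum n under that assumption. */
int64_t tag_fixnum(int64_t n)
{
    return (int64_t)((uint64_t)n << FIXNUM_TAG_BITS);
}

/* Inverse of tag_fixnum. Division is exact because the low bit is 0. */
int64_t untag_fixnum(int64_t word)
{
    return word / (1 << FIXNUM_TAG_BITS);
}

/* A word with the low bit set is not a fixnum under this encoding. */
int is_fixnum_word(int64_t word)
{
    return (word & 1) == 0;
}
```

Under this reading, the remaining constant 537919511 (#x20100017) is odd, so it is not a fixnum. It is presumably a tagged pointer, perhaps NIL filling an unused return register, but that is a guess.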

George Koehler answered Oct 24 '22 08:10