In C, it is easy to manipulate memory and hardware registers, because concepts such as "address" and "volatile" are built into the language. Consequently, most OSs are written in the C family of languages. For example, I can copy an arbitrary function to an arbitrary location in memory, then call that location as a function (assuming the hardware doesn't stop me from executing data, of course; this would work on certain microcontrollers).
int hello_world()
{
printf("Hello, world!");
return 0;
}
int main()
{
unsigned char buf[1000];
memcpy(buf, (const void*)hello_world, sizeof buf);
int (*x)() = (int(*)())buf;
x();
}
However, I have been reading about the Open Genera operating system for certain dedicated Lisp machines. Wikipedia says:
Genera is written completely in Lisp; even all the low-level system code is written in Lisp (device drivers, garbage collection, process scheduler, network stacks, etc.)
I am completely new to Lisp, but this seems like a difficult thing to do: Common Lisp, from what I've seen, doesn't have good abstractions for the hardware it's running on. How would Common Lisp operating systems do something basic such as compile the following trivial function, write its machine code representation to memory, then call it?
(defun hello () (format t "Hello, World!"))
Of course, Lisp can be easily implemented in itself, but in the words of Sam Hughes, "somewhere down the line, abstraction runs out and a machine has to execute an instruction."
Lisp machines are general-purpose computers designed to efficiently run Lisp as their main software and programming language, usually via hardware support. They are an example of a high-level language computer architecture, and in a sense, they were the first commercial single-user workstations.
LISP, an acronym for list processing, is a programming language that was designed for easy manipulation of data strings. Developed in 1959 by John McCarthy, it is a commonly used language for artificial intelligence (AI) programming. It is one of the oldest programming languages still in relatively wide use.
LISP machines died because they came out near the end of the "expert systems" false boom, and the beginning of the "AI Winter". In the mid-1980s, the expert systems crowd were claiming strong AI was coming Real Soon Now.
The Lisp machine was a computer hardware with a CPU just like modern machines today, only the CPU had some special instructions that mapped better to LISP. It still was a stack machine and it compiled it's source to CPU instructions just as modern Common Lisp implementations do today on more general CPUs.
In the Lisp machines wikipedia page you can see how a function gets compiled:
(defun example-count (predicate list)
(let ((count 0))
(dolist (i list count)
(when (funcall predicate i)
(incf count)))))
(disassemble (compile #'example-count))
0 ENTRY: 2 REQUIRED, 0 OPTIONAL ;Creating PREDICATE and LIST
2 PUSH 0 ;Creating COUNT
3 PUSH FP|3 ;LIST
4 PUSH NIL ;Creating I
5 BRANCH 15
6 SET-TO-CDR-PUSH-CAR FP|5
7 SET-SP-TO-ADDRESS-SAVE-TOS SP|-1
10 START-CALL FP|2 ;PREDICATE
11 PUSH FP|6 ;I
12 FINISH-CALL-1-VALUE
13 BRANCH-FALSE 15
14 INCREMENT FP|4 ;COUNT
15 ENDP FP|5
16 BRANCH-FALSE 6
17 SET-SP-TO-ADDRESS SP|-2
20 RETURN-SINGLE-STACK
This is then stored in some memory place and when running this function it just jumps or calls to this. As with any assembly code the CPU gets instructed to continue running some other code when it's done running this and it may be the Lisp main loop itself (REPL).
The same code compiled with SBCL:
; Size: 203 bytes
; 02CB9181: 48C745E800000000 MOV QWORD PTR [RBP-24], 0 ; no-arg-parsing entry point
; 189: 488B4DF0 MOV RCX, [RBP-16]
; 18D: 48894DE0 MOV [RBP-32], RCX
; 191: 660F1F840000000000 NOP
; 19A: 660F1F440000 NOP
; 1A0: L0: 488B4DE0 MOV RCX, [RBP-32]
; 1A4: 8D41F9 LEA EAX, [RCX-7]
; 1A7: A80F TEST AL, 15
; 1A9: 0F8598000000 JNE L2
; 1AF: 4881F917001020 CMP RCX, 537919511
; 1B6: 750A JNE L1
; 1B8: 488B55E8 MOV RDX, [RBP-24]
; 1BC: 488BE5 MOV RSP, RBP
; 1BF: F8 CLC
; 1C0: 5D POP RBP
; 1C1: C3 RET
; 1C2: L1: 488B45E0 MOV RAX, [RBP-32]
; 1C6: 488B40F9 MOV RAX, [RAX-7]
; 1CA: 488945D8 MOV [RBP-40], RAX
; 1CE: 488B45E0 MOV RAX, [RBP-32]
; 1D2: 488B4801 MOV RCX, [RAX+1]
; 1D6: 48894DE0 MOV [RBP-32], RCX
; 1DA: 488B55F8 MOV RDX, [RBP-8]
; 1DE: 4883EC18 SUB RSP, 24
; 1E2: 48896C2408 MOV [RSP+8], RBP
; 1E7: 488D6C2408 LEA RBP, [RSP+8]
; 1EC: B902000000 MOV ECX, 2
; 1F1: FF1425B80F1020 CALL QWORD PTR [#x20100FB8] ; %COERCE-CALLABLE-TO-FUN
; 1F8: 488BC2 MOV RAX, RDX
; 1FB: 488D5C24F0 LEA RBX, [RSP-16]
; 200: 4883EC18 SUB RSP, 24
; 204: 488B55D8 MOV RDX, [RBP-40]
; 208: B902000000 MOV ECX, 2
; 20D: 48892B MOV [RBX], RBP
; 210: 488BEB MOV RBP, RBX
; 213: FF50FD CALL QWORD PTR [RAX-3]
; 216: 480F42E3 CMOVB RSP, RBX
; 21A: 4881FA17001020 CMP RDX, 537919511
; 221: 0F8479FFFFFF JEQ L0
; 227: 488B55E8 MOV RDX, [RBP-24]
; 22B: BF02000000 MOV EDI, 2
; 230: 41BBF0010020 MOV R11D, 536871408 ; GENERIC-+
; 236: 41FFD3 CALL R11
; 239: 488955E8 MOV [RBP-24], RDX
; 23D: E95EFFFFFF JMP L0
; 242: CC0A BREAK 10 ; error trap
; 244: 02 BYTE #X02
; 245: 19 BYTE #X19 ; INVALID-ARG-COUNT-ERROR
; 246: 9A BYTE #X9A ; RCX
; 247: L2: CC0A BREAK 10 ; error trap
; 249: 02 BYTE #X02
; 24A: 02 BYTE #X02 ; OBJECT-NOT-LIST-ERROR
; 24B: 9B BYTE #X9B ; RCX
NIL
Not quite as few instructions is it. When running this function that is the machine code that gets control and it gives the control back to the system since the return address is perhaps the REPL or next instruction just like with compiled C.
A special thing about lisps in general is that lexical closures need to be handled. In C when a call is done the variables don't exist anymore, but in Lisps it may return or store a function that use those variables at a later time and that is no longer in scope. This means variables need to be handled almost as inefficient as in interpreted code in compiled code, especially with a old compiler that doesn't do much optimization.
A C compiler does it fare translating as well or else what would be the reason for programming C than in assembly? The Intel x86 processors doesn't have support for arguments in procedure calls. It is emulated by the C compiler. The caller sets values on the stack and it has a cleanup where it undoes it afterward. looping constructs such as for and while doesn't exist. Only branch/jmp. Yes, in C you get a more feel for the underlying hardware but it really isn't the same as machine code. It only leaks more.
A Lisp implementation as OS can have features such as low level assembly instructions as lisp opcodes. Compilation would then be to translate everything to low level lisp, then it's a 1:1 from those to machince bytes.
An operating system with a c library and a c compiler together does the very same thing. It runs translation to machine code and can then run the code in itself. This is how Lisp systems are meant to work too so the only thing you need is the API to hardware that can be as low level as memory mapping I/O etc.
Even without abstraction lisp can emit assembler. See
But it can also be used to create a thin but powerful abstraction over machine code. See Henry Bakers's Comfy Compiler
Finally check SBCL VOP's (example), they allow you to control what assembly code. Altough with virtual registers, as this happens before register allocation.
You may find this post interesting, as it deals with how to emit assembly from SBCL.
Btw, even though you can write drivers and such in lisp, it is not a good idea to needlessly duplicate the effort, so even Lisp implementations in Lisp, like SBCL, have some C parts to allow interfacing with the OS.
These C header files, along with the C source and assembly files, are then used (figure 2) to produce the sbcl executable itself. The executable is as yet not useful; while it provides an interface to the operating system services, and a garbage collector
Taken from seccion 3.2 from SBCL: a Sanely-Bootstrappable Common Lisp
I haven't checked out how Mezzano works, feel free to dig into it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With