Suppose I want to access the VGA text buffer on x86, which is located at address 0xb8000:
uint16_t *VGA_buffer = (uint16_t*)0xb8000;
Then I index the variable VGA_buffer as a normal array, i.e., VGA_buffer[0], VGA_buffer[1], etc.
However, I read about the x86 memory map, and the addresses listed there are physical addresses.
My question is:
How does the CPU access this address? Does the CPU know that any address written explicitly in the code is a physical address, so that it should bypass the address translation mechanisms (logical address --> virtual address --> physical address)?
Thanks in advance.
If you want to access a specific physical address while paging is enabled, map that physical address into virtual memory somewhere. If you're running under an existing OS, this is something you have to ask the OS to do for you.
How you ask the OS to do this for you is of course OS-specific.
For example, on Linux you could do this with an mmap() system call on /dev/mem, which is a special device file that gives access to the entire physical address space. See the mem(4) man page. Anything you do with /dev/mem is actually handled by the kernel device driver functions; it's just an API for letting you map physical memory. See also "How does mmap'ing /dev/mem work despite being from unprivileged mode?" (You need to be root, and even then it's just mapping memory, not running in kernel mode where you could run instructions like lidt.)
A Super User answer mentions that Linux's CONFIG_STRICT_DEVMEM restricts /dev/mem to only actual device memory, and is often enabled in real kernels.
So for example:
int fd = open("/dev/mem", O_RDWR);
volatile uint16_t *vgabase = mmap(NULL, 256 * 1024,
                                  PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0xb8000);
close(fd);
// TODO: error checking on system-call return values.
// Run strace ./a.out to see what happens (not recommended with an X server running...)
vgabase[1] = 'a' + (0x07<<8); // lightgrey-on-black
http://wiki.osdev.org/VGA_Hardware#Video_Memory_Layout says that VGA memory is up to 256 KiB, so I mapped it all. Note that the 0xb8000 is used as an offset into /dev/mem. That's how you tell the kernel which physical memory you want to map. You can also use /dev/mem with read/write or pread/pwrite system calls, e.g. to blit a buffer into physical memory at a given position.
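As a rough sketch of the error checking left as a TODO above, the mapping could be wrapped in a small helper. The names map_phys and blit_phys are made up for this illustration and are not part of any API; the path is a parameter only so the same code can be tried on an ordinary file before pointing it at /dev/mem. The offset has to be a multiple of the page size, which 0xb8000 is for 4 KiB pages.

```c
#define _POSIX_C_SOURCE 200809L   /* for pwrite() under strict -std= modes */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the answer above): map `len` bytes of the
 * file at `path`, starting at the page-aligned offset `phys`.
 * With path = "/dev/mem" the offset is a physical address.
 * Returns NULL on failure, after printing the reason. */
static volatile uint16_t *map_phys(const char *path, off_t phys, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd == -1) {
        perror("open");
        return NULL;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, phys);
    if (p == MAP_FAILED)
        perror("mmap");

    close(fd);   /* an established mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : (volatile uint16_t *)p;
}

/* Hypothetical helper: copy `len` bytes from `buf` to offset `phys` of
 * `path` with pwrite() instead of a mapping. Returns 0 on success. */
static int blit_phys(const char *path, off_t phys, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    ssize_t n = pwrite(fd, buf, len, phys);
    if (n == -1)
        perror("pwrite");

    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```

A call such as map_phys("/dev/mem", 0xb8000, 4096) would then cover an 80x25 text screen (4000 bytes), assuming you are root and the kernel's /dev/mem restrictions allow that range.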
Instead of just uint16_t*, you might define a struct for text mode:
struct vgatext_char {
    char c;
    union {   // anonymous union so you can do .fg or .color
        struct { uint8_t fg:4,
                         bg:4;
        };
        uint8_t color;
    };
};
// you might want to use this instead of uint16_t,
// or with an anonymous union of this and uint16_t.
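The union-with-uint16_t variant mentioned in that comment might look like the following sketch. The names vga_cell and make_cell are invented here. It assumes a C11 compiler (anonymous structs and unions, static_assert), a little-endian target such as x86, and GCC/Clang-style bit-field layout where fg takes the low 4 bits; bit-field order is implementation-defined, so that last point is an assumption rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* One text-mode cell: the character byte sits at the lower address,
 * the attribute byte right after it. `raw` views both as one 16-bit value. */
union vga_cell {
    uint16_t raw;
    struct {
        char c;
        union {   /* anonymous, so .fg, .bg and .color all work directly */
            struct { uint8_t fg : 4, bg : 4; };
            uint8_t color;
        };
    };
};

/* Catch unexpected padding at compile time. */
static_assert(sizeof(union vga_cell) == sizeof(uint16_t),
              "a VGA text cell must be exactly 2 bytes");

/* Hypothetical convenience constructor for illustration. */
static inline union vga_cell make_cell(char c, uint8_t color)
{
    union vga_cell cell;
    cell.c = c;
    cell.color = color;
    return cell;
}
```

With this, make_cell('a', 0x07).raw holds the same bits as the 'a' + (0x07<<8) expression used earlier, and a single 16-bit store of .raw updates character and attribute together.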
Does the CPU know that any address written explicitly in the code is a physical address and shall not pass by address translation mechanisms
All load/store instructions will treat addresses as virtual. Even if the compiler wanted to do something different, it couldn't. x86 has no "store-physical" instruction that bypasses address-translation and paging permission checks.
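To make "treated as virtual" a little more concrete, here is a small sketch of how a virtual address is split into table indices on the way to a physical address. It assumes x86-64 4-level paging with 4 KiB pages (other paging modes split the bits differently), and the names va_parts and split_va are made up for this example.

```c
#include <stdint.h>

/* Index fields of a virtual address under x86-64 4-level paging
 * with 4 KiB pages (assumed here). */
struct va_parts {
    unsigned pml4;    /* bits 47..39: index into the PML4 table  */
    unsigned pdpt;    /* bits 38..30: index into the PDPT        */
    unsigned pd;      /* bits 29..21: index into the page dir    */
    unsigned pt;      /* bits 20..12: index into the page table  */
    unsigned offset;  /* bits 11..0:  byte offset within the page */
};

static struct va_parts split_va(uint64_t va)
{
    struct va_parts p;
    p.pml4   = (unsigned)((va >> 39) & 0x1ff);
    p.pdpt   = (unsigned)((va >> 30) & 0x1ff);
    p.pd     = (unsigned)((va >> 21) & 0x1ff);
    p.pt     = (unsigned)((va >> 12) & 0x1ff);
    p.offset = (unsigned)(va & 0xfff);
    return p;
}
```

For the virtual address 0xb8000 every index is 0 except pt, which is 0xb8, and the page offset is 0. Whether that page-table entry points at physical 0xb8000 depends entirely on what the OS put in the page tables.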
Remember that the CPU runs machine code produced by the compiler. At that point, there's no distinction left between addresses that appeared as integer constants in the C source vs. addresses of string constants. (e.g. puts("Hello World"); might compile to mov edi,0x4005c4 / call puts).
For example, look at how this function compiles:
#include <stdio.h>
int foo() {
    puts("hello world");
    char *p = (char*)0xb8000;   // explicit cast: integer-to-pointer needs one in C
    puts(p);
    return 0;
}
In the compiler's asm output (from gcc -O3 for x86-64 Linux, on Godbolt), we see this:
sub rsp, 8
mov edi, OFFSET FLAT:.LC0 # address of the string constant
call puts
mov edi, 753664 # 0xB8000
call puts
xor eax, eax # return 0;
add rsp, 8
ret
I passed it to puts just to illustrate that absolutely nothing is different in how a pointer that comes from an integer constant is handled. By the time we get to machine code (linker output), the label referring to the string constant's address has been compiled to an immediate constant, just like the 0xB8000. Disassembly output from the same compiler-explorer link:
sub rsp,0x8
mov edi,0x4005d4 # address of the string constant
call 400410 <puts@plt>
mov edi,0xb8000
call 400410 <puts@plt>
xor eax,eax
add rsp,0x8
ret
It's only after addresses are translated to physical that the hardware checks to see whether it's regular DRAM, MMIO, or device memory. (This happens in the system agent on Intel CPUs, on chip in the CPU, but outside of an individual core.)
The memory type in effect for the address also matters: WB (write-back), USWC (uncacheable speculative write-combining), UC (uncacheable), or others. VGA memory is normally USWC, so writing to it one char at a time is slow, and so is reading it. Use movnt stores and movntdqa loads to efficiently access whole blocks.
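A minimal sketch of the store side of that advice, using the SSE2 intrinsic _mm_stream_si128 (which compiles to movntdq). The helper name copy_nt is made up for this example, it assumes a 16-byte-aligned destination and a length that is a multiple of 16, and it falls back to memcpy when SSE2 is not available. The movntdqa load side needs SSE4.1 (_mm_stream_load_si128) and is left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copy `len` bytes (a multiple of 16) from `src` to a
 * 16-byte-aligned `dst` using non-temporal stores where SSE2 is available. */
static void copy_nt(void *dst, const void *src, size_t len)
{
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const uint8_t *s = (const uint8_t *)src;

    for (size_t i = 0; i < len / 16; i++) {
        /* Ordinary unaligned load from the source buffer... */
        __m128i v = _mm_loadu_si128((const __m128i *)(s + 16 * i));
        /* ...and a non-temporal 16-byte store to the destination. */
        _mm_stream_si128(d + i, v);
    }
    _mm_sfence();   /* order the NT stores before any later stores */
#else
    memcpy(dst, src, len);   /* portable fallback, no NT hint */
#endif
}
```

The idea is to build a whole row or screen in a normal cached buffer first, then push it to the mapped VGA memory in one pass instead of storing one character at a time.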