First of all, I am doing this for fun so don't judge me.
What I did was pass a function pointer from user space to the kernel, copy the function body with copy_from_user into a static array in the kernel, and then jump into that array to execute it.
In the kernel:
static char handler_text[PAGE_SIZE] __page_aligned_data;
copy_from_user((void *)handler_text , (const void __user *)my_handler , PAGE_SIZE);
((void (*)())(handler_text))();
In user space, the function is very simple:
void my_handler(){
volatile unsigned long * p = (volatile unsigned long *)0xF0000c10;
*p = 0x0000000;
}
10000938 <my_handler>:
10000938: 3d 20 f0 00 lis r9,-4096
1000093c: 39 40 00 00 li r10,0
10000940: 61 29 0c 10 ori r9,r9,3088
10000944: 91 49 00 00 stw r10,0(r9)
10000948: 4e 80 00 20 blr
1000094c: 00 01 88 08 .long 0x18808
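For readers unfamiliar with PowerPC, the lis/ori pair in the disassembly above builds the 32-bit store address from two 16-bit immediates. Here is a minimal sketch of that arithmetic; the helper name lis_ori_address is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combine the immediates of a PowerPC lis/ori pair
 * into the 32-bit value that ends up in the register. */
static uint32_t lis_ori_address(int16_t lis_imm, uint16_t ori_imm)
{
    /* lis places its 16-bit immediate in the upper half of the register. */
    uint32_t high = (uint32_t)(uint16_t)lis_imm << 16;

    /* ori ORs a zero-extended 16-bit immediate into the lower half. */
    return high | ori_imm;
}
```

Calling lis_ori_address(-4096, 3088) gives 0xF0000C10, the address used in my_handler.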
The problem is that the first time I do this, it always generates an Oops. From the second time onward, the problem is gone and there is no Oops any more. I can clearly see that the function is executed by the kernel by reading the memory. I am running on a PowerPC target, and the Oops shows exception 700, which is a program exception. In the Oops I can see the instruction dump, where the instruction at the nip (the one in angle brackets) is exactly the first instruction of my_handler.
Instruction dump:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 <3d20f000> 39400000 61290c10 91490000
I can't make any sense of it. Can anyone? Thanks.
You can use the copy_from_user() and copy_to_user() functions to move data between kernel space and user space.
You cannot call a kernel function from user space, you need to go through one of the existing mechanisms. You'll need to write at least some glue code in the kernel, to provide a way to trigger the execution of the kernel code, and pass parameters and results around.
The system call interface (SCI) is the standard way for a program to transition from user space to kernel space.
User space programs cannot access system resources directly so access is handled on the program's behalf by the operating system kernel. The user space programs typically make such requests of the operating system through system calls. Kernel threads, processes, stack do not mean the same thing.
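As a small illustration of a user space program making a request through a system call, here is a sketch for Linux that uses the generic syscall() wrapper from the C library. The helper name raw_getpid is hypothetical; SYS_getpid is chosen only because it takes no arguments and is easy to check.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for our process ID through the
 * generic syscall() entry point instead of the getpid() wrapper. */
static pid_t raw_getpid(void)
{
    /* The C library traps into the kernel; the kernel runs its own
     * handler and returns the result to user space. */
    return (pid_t)syscall(SYS_getpid);
}
```

The value returned by raw_getpid() should match getpid(), since both end up in the same kernel handler.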
I hate to discourage an admirable notion, but what you're trying to do is difficult, if not impossible, without some serious extra work.
Your function is linked at location F in user space. You're copying it to kernel space at the location of the static array: A. A is probably in the kernel's data section, so execution may not be possible. Also, your function is linked at the wrong location (e.g. F != A).
Further, even if your function could link to the correct location A, how are you handling the relocation of symbols within it (e.g. If it calls printk, how are you relinking the address inside the function to match the actual printk address)?
It is much easier to create a kernel module and load that (via modprobe) and you can do whatever you want.
Side note: This is a huge security vulnerability. A similar one was used by the "Stuxnet" worm to penetrate Windows.
UPDATE:
The dump occurs [in time] long after the exception event. By that time, it has the correct data, so the dump shows the current state, so to speak, but not what happened on the exact cycle in question [due to the nature of this "self-modifying" code].
But when it was initially executed, the instruction fetch may have returned garbage (hence the 700 program exception). I'm not sure about PPC, but many arches have separate inst and data caches, along with out-of-order execution. The data would be in the data cache, but not necessarily in the inst cache [or queue]. And, they tend to operate independently for speed ["Harvard" architecture].
(e.g.) x86 largely handles this in hardware, but on many other arches, after setting the static area, you must flush/synchronize so that the exec unit refetches the area. Otherwise, it may have already speculatively prefetched the instruction data (e.g. it isn't expecting it to be "self-modifying") with data that isn't what is expected [probably 0x00000000].
Consider: After the copy_from_user, the desired data is in the data cache, but has not yet been flushed to RAM. The execution unit [and inst cache], not having any data from the static area, will fetch from RAM. Because self-modifying code is rare, the inst and data caches do not snoop each other [it would slow things down].
So, the execution unit got its data from RAM (e.g. 0x00000000) instead of the loaded data [which is only in the data cache].
The second time works because the data fetched by the execution unit comes from the data written during the first attempt [which has had time to flush to RAM]. That is, the static area has now been populated and the second copy_from_user is, effectively, a NOP.
A "post-mortem" dump of the area [as mentioned] would not be able to show this discrepancy.
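The copy, synchronize, then jump sequence described above can be sketched in user space. This is only an analogue of the kernel case, not the kernel fix itself: it assumes Linux, GCC or Clang (for the __builtin___clear_cache builtin), and an x86 or AArch64 machine, because the hand-encoded instruction bytes are architecture specific. The name run_copied_code is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hand-encoded machine code for "return 42" (assumption: only these
 * architectures are covered; others fall back to returning -1). */
#if defined(__x86_64__) || defined(__i386__)
static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 }; /* mov eax,42; ret */
#define HAVE_CODE 1
#elif defined(__aarch64__)
static const uint32_t code[] = { 0x52800540u, 0xD65F03C0u }; /* mov w0,#42; ret */
#define HAVE_CODE 1
#else
#define HAVE_CODE 0
#endif

/* Hypothetical helper: copy code into a fresh page, synchronize the
 * caches, then jump to it. Returns 42 on success, -1 on failure. */
static int run_copied_code(void)
{
#if HAVE_CODE
    size_t len = 4096;
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Step 1: the copy goes through the data side of the CPU. */
    memcpy(buf, code, sizeof code);

    /* Step 2: make instruction fetch observe the new bytes. The builtin
     * performs whatever cache maintenance the target requires. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof code);

    /* Step 3: make the page executable and jump into it. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }
    int (*fn)(void) = (int (*)(void))(uintptr_t)buf;
    int result = fn();

    munmap(buf, len);
    return result;
#else
    return -1;
#endif
}
```

The ordering is the point: skipping step 2 on a machine whose instruction and data caches are not kept coherent would risk exactly the stale fetch described above.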
Figured it out. It turned out to be the cache thing. Thanks both Ctx and Craig, I added a
flush_dcache_icache_page(virt_to_page((unsigned long)(handler_text)));
after
copy_from_user((void *)handler_text , (const void __user *)my_handler , PAGE_SIZE);
And it is all good now. Before I asked the question, I tried just flush_dcache_page and it didn't work. So I have to flush both dcache and icache to make this work. Thanks again.