Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What can be the cause of "jal" to the middle of another function in MIPS

I am looking at a very suspicious disassembled MIPS code of a C application

80019B90                 jal     loc_80032EB4

loc_80032EB4 is in the middle of another function's body, I've specially checked that no other code is loaded at this address in runtime and calling that function this way(passing some code in the beginning) can be useful. But how is it possible to do in C? It's not a goto as you can't goto to another function and normal function call will always "jal" to the beginning. Can this be some hand optinmimzation?

Update:

Simplified layout of both functions, callee:

sub_80032E88 (lz77_decode)
... save registers ...
80032E90                 addiu   $sp, -8
... allocate memory for decompressed data ...
80032EB0                 move    DECOMPRESSED_DATA_POINTER_A1, $v0
loc_80032EB4:
80032EB4                 lw      $t7, 0(PACKED_DATA_POINTER_A0)
... actual data decompression ...
80032F4C                 jr      $ra

caller:

80019ACC                 addiu   $sp, -0x30
... some not related code ...
80019B88                 lw      $a1, off_80018084   // A predefined buffer is used instead of allocating it for decompressed data
80019B90                 jal     loc_80032EB4
80019B94                 move    $a0, $s0
... some other code and function epilogue ...

Update 2: I've checked if this can be a case of setjmp/longjmp usage, but in my tests I can always see calls to setjmp and longjmp functions in disassembled code, not a direct jump.

Update 3: I've tried using GCC-specific ability to get label pointers and casted this pointer to function, result is close to what I want but disassembled code is still different as instead of using jal with exaxct address it calculating it runtime, maybe I am just unable to force compiler to see this value as constant, becouse of scope issues.

like image 657
Riz Avatar asked Nov 04 '22 17:11

Riz


1 Answers

Since it is a data decompression function from a game system, it is very likely that this function is hand optimized assembly with multiple entry points. Multiple entry points aren't commonly used, so it is difficult to find a publicly available example, but here is an old thread from the gcc mailing list that suggests a possible use for this technique.

The gist is that if you have two functions where one function F1 has code that is a subset of the other function, F2's code, then the code for F2 can fall through into the code for F1. In your case, F2 allocates memory for the decompressed data, and F1 assumes that the memory allocation has already been done. I'm pretty sure that GCC 2.9x cannot generate code like this.

It is not possible to directly translate this construct from assembler into standard C, because you cannot goto another function in C, but this is perfectly legal in assembler code. The gcc mailing list thread suggests a couple of work-arounds to express the same idea in C.

If you look at the dis-assembled code for the decompression it will likely have a different style than compiler generated code. There may even be some use of opcodes, like find first set bit that the compiler cannot generate from C.

like image 60
markgz Avatar answered Dec 18 '22 08:12

markgz