I found this recently: https://github.com/xoreaxeaxeax/movfuscator It seems to be contingent on the fact that <code>mov</code> is turing-complete. Is that true, and why?

Systems rarely come to be known as Turing Complete through direct proof of their capacity for all Turing-computable functions. Instead, they often are "analogized" to an existing system already known to be Turing Complete. The paper by Stephen Dolan which shows <code>mov</code> to be TC does this by constructing a simulation of a TM system in terms of <code>mov</code>. It follows that any question which could be posed to the TC system could, at worst, be virtualized within the <code>mov</code>-constructed simulation.

Why is mov turing complete?

Video Answer

2 Answers

Yes, x86's mov is Turing complete. I added that tag to your question because it may not be true for other ISAs with an instruction called mov, and the movfuscator compiler only targets x86.

It's not "mov" itself doing computation, it's x86 addressing modes which can do addition (and bit-shift). I haven't looked in detail at how it works, but it relies a lot on lookup tables, and stuff like mov eax, [base + eax*4] loading one of two possible values depending on EAX being 0 or 1.

Also remember that x86 mov has several forms: between memory and register (load, store, or reg<-reg), or immediate to memory or register. And with addressing modes including absolute and register indirect.

I don't think it relies on self-modifying code, but correct me if I'm wrong. (And if it does, I think/hope movfuscator at least won't create instructions other than mov. That would be cheating.)

Also, it's only sort of true; you need some way to loop the main program, assuming the original source isn't straight-line with no loops - the Movfuscator github readme talks about this:

While Dolan's paper required a jmp instruction, the M/o/Vfuscator does not - it uses a faulting mov instruction to achieve the infinite execution loop. If you're worried that this is still "jumping", the same effect could be achieved through pages aliased to the same address, wrapping execution around the upper range of memory, ring 0 exception handling, or simply repeating the mov loop indefinitely. A jmp is currently used to dispatch external functions - if this is a problem, avoid using external functions, or compile libraries with the M/o/Vfuscator as well.

When creating a user-space environment for mov-only code to run, I assume it creates a SIGSEGV handler (on POSIX OSes) that restarts execution from the top. So any faulting load can restart the main loop.

Letting execution wrap around is also mentioned as a possibility:

Wraparound of IP can work well in 16-bit mode where IP is 16-bit but CS:IP form a 20-bit linear address (real mode) or in 16-bit protected mode a 64k window somewhere in linear address space. i.e. you can have a 64k block of instructions in only part of your address space, with other space left for data. The DS segment can use a different base. (32-bit addressing modes in 16-bit mode are possible so you still have the full power of any register and scaled-index addressing mode.) Note that the mnemonic for reading and writing segment registers is also mov.

But harder in 32-bit mode where EIP is 32-bit and so are linear addresses after seg+off calculation. Unless there's some other trick, wraparound can only happen across the entire address space, regardless of what you do with segmentation. This leaves no non-code space for data. Setting a lower segment limit could make code-fetch fault, but that doesn't cause wraparound (unless you set a signal handler, or on bare metal set an interrupt handler address).

And either way, even x86-64 only has 64-bit pointers (in theory; 48 or 57-bit in practice), so space is finite, unlike a real Turing machine with infinite tape.

196

answered Sep 24 '22 08:09

Peter Cordes

The paper by Stephen Dolan which shows mov to be TC does this by constructing a simulation of a TM system in terms of mov. It follows that any question which could be posed to the TC system could, at worst, be virtualized within the mov-constructed simulation.

answered Sep 24 '22 08:09

ŹV -

Related questions
                            
                                assembly registers esp and ebp
                            
                                c library x86/x64 assembler
                            
                                What are the differences comparing PIE, PIC code and executable on 64-bit x86 platform?
                            
                                Why might one use the xzr register instead of the literal 0 on ARMv8?
                            
                                Why is it not possible to push a byte onto a stack on Pentium IA-32?
                            
                                How do I translate assembly to binary?
                            
                                How to write a hello world kernel?
                            
                                difference between conditional instructions (cmov) and jump instructions [duplicate]
                            
                                Clarification on Binary file (PE/COFF & ELF) formats & terminology
                            
                                Is there an x86 opcode for moving an immediate byte to a direct memory location (without using registers)?
                            
                                How do I find how C++ compiler implements something except inspecting emitted machine code?
                            
                                Size of store buffers on Intel hardware? What exactly is a store buffer?
                            
                                Is it possible to use LLVM-assembly directly?
                            
                                What is the best way to go about writing a simple x86 assembler?
                            
                                Are ARM instructuons SWI and SVC exactly same thing?
                            
                                How does including assembly inline with C code work?
                            
                                Push XMM register to the stack
                            
                                Why does the x86-64 GCC function prologue allocate less stack than the local variables?
                            
                                Assembling 32-bit binaries on a 64-bit system (GNU toolchain)
                            
                                How does $ work in NASM, exactly?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why is mov turing complete?

Tags:

computer-science

x86

assembly

turing-complete

mov

schuelermine

People also ask

Video Answer

2 Answers

Peter Cordes

ŹV -

Recent Activity

Donate For Us