
Reference on Dalvik or Java Virtual Machines?

I am currently looking into Dalvik bytecode, but since I lack a compiler background, I am finding the design a little hard to grasp. I am pretty sure no one has written a book on Dalvik (though I could be wrong), so can someone suggest a reference on the Java VM that contains some hands-on examples? Specifically, I am interested in:

  • Understanding how to interpret the generated bytecode
  • Using the VM specifications (Dalvik or Java) to decompile the bytecode into an intermediate representation and then compile it back

In short, what I am probably looking for is a way to learn to reverse engineer bytecode so that I can analyze it for vulnerabilities. Any suggestions?

Legend asked Jan 29 '11 20:01



2 Answers

For reference material, nothing beats the Dalvik docs. You can find them in the dalvik sub-project in AOSP, and they are now also available online at http://s.android.com/tech/dalvik/index.html

Bytecode format (dalvik-bytecode.html in the dalvik project) is probably the one you would be most interested in. .Dex Format (dex-format.html) is also useful, as is Instruction Formats (instruction-formats.html).
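To make the .Dex Format document a bit more concrete, here is a minimal Python sketch that reads the fixed 0x70-byte header at the start of a .dex file and checks its Adler-32 checksum and SHA-1 signature. The function names (parse_dex_header, verify_dex) are my own for illustration, and the sketch assumes a standard little-endian file; it is not taken from any of the tools mentioned in this answer.

```python
import hashlib
import struct
import zlib

HEADER_SIZE = 0x70  # fixed size of header_item in the .dex format

# The 20 unsigned 32-bit fields that follow the signature, in file order.
_UINT_FIELDS = (
    "file_size", "header_size", "endian_tag",
    "link_size", "link_off", "map_off",
    "string_ids_size", "string_ids_off",
    "type_ids_size", "type_ids_off",
    "proto_ids_size", "proto_ids_off",
    "field_ids_size", "field_ids_off",
    "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off",
    "data_size", "data_off",
)


def parse_dex_header(data: bytes) -> dict:
    """Decode header_item from the first 0x70 bytes (little-endian assumed)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to contain a dex header")
    magic = data[0:8]  # b"dex\n" + three version digits + NUL
    if magic[:4] != b"dex\n" or magic[7:8] != b"\x00":
        raise ValueError("not a dex file")
    header = {"version": magic[4:7].decode("ascii")}
    (header["checksum"],) = struct.unpack_from("<I", data, 8)
    header["signature"] = data[12:32]
    header.update(zip(_UINT_FIELDS, struct.unpack_from("<20I", data, 32)))
    return header


def verify_dex(data: bytes) -> bool:
    """Check the two integrity fields stored in the header."""
    header = parse_dex_header(data)
    # The checksum covers everything after the magic and the checksum itself.
    checksum_ok = (zlib.adler32(data[12:]) & 0xFFFFFFFF) == header["checksum"]
    # The signature covers everything after magic, checksum and signature.
    signature_ok = hashlib.sha1(data[32:]).digest() == header["signature"]
    return checksum_ok and signature_ok
```

Something like parse_dex_header(open("classes.dex", "rb").read()) gives you the table sizes and offsets that the rest of the format document refers to. Keep in mind that optimized (odex) files have a different layout, so treat this only as a starting point.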

For some more general information about the bytecode, check out http://code.google.com/p/smali/wiki/Registers and http://code.google.com/p/smali/wiki/TypesMethodsAndFields
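As a rough illustration of what those two wiki pages describe, the Python sketch below splits a smali-style method reference into its parts and works out which v registers the p (parameter) registers alias. It assumes the usual conventions: parameters sit in the last registers of the method, a non-static method gets this in p0, and long (J) and double (D) values take two registers each. The helper names are made up for this example.

```python
import re

_PRIMITIVES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "I": "int", "J": "long", "F": "float", "D": "double",
}
_TYPE_RE = re.compile(r"\[*(?:[VZBSCIJFD]|L[^;]+;)")
_METHOD_RE = re.compile(r"(L[^;]+;)->([^(]+)\((.*)\)(.+)")


def readable(desc: str) -> str:
    """Turn a type descriptor such as '[Ljava/lang/String;' into Java-like text."""
    dims = len(desc) - len(desc.lstrip("["))
    base = desc[dims:]
    name = _PRIMITIVES.get(base) or base[1:-1].replace("/", ".")
    return name + "[]" * dims


def parse_method(ref: str) -> tuple:
    """Split 'Lpkg/Cls;->name(params)ret' into (class, name, [params], ret)."""
    match = _METHOD_RE.fullmatch(ref)
    if match is None:
        raise ValueError("not a method reference: " + ref)
    cls, name, params, ret = match.groups()
    param_list = _TYPE_RE.findall(params)
    if "".join(param_list) != params:
        raise ValueError("bad parameter list: " + params)
    return cls, name, param_list, ret


def _width(type_desc: str) -> int:
    # 64-bit values occupy a pair of adjacent registers.
    return 2 if type_desc in ("J", "D") else 1


def param_registers(registers: int, params: list, is_static: bool) -> dict:
    """Map each pN name to the vN register it aliases."""
    types = ([] if is_static else ["this"]) + list(params)
    total = sum(_width(t) for t in types)
    if total > registers:
        raise ValueError("method declares too few registers")
    mapping = {}
    p, v = 0, registers - total  # parameters occupy the last registers
    for t in types:
        mapping[f"p{p}"] = f"v{v}"
        p += _width(t)
        v += _width(t)
    return mapping
```

For example, a non-static method (IJZ)V declared with 7 registers would have this in p0/v2, the int in p1/v3, the long in p2/v4 (and p3/v5), and the boolean in p4/v6.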

You'll definitely want a few tools. I'm naturally quite partial to smali/baksmali, which is the only assembler/disassembler pair currently available. There is also a disassembler called dedexer (but no assembler), and dexdump, which comes with the AOSP codebase and provides a low-level dump of dex files - not just the bytecode, but all the dex structures as well (baksmali has similar output, with the -D option).

You might also be interested in apktool, which uses smali/baksmali, but also has the ability to reverse the "compiled" xml files in an apk.

There are a couple of tools out there that convert Dalvik bytecode back to Java bytecode, undx and dex2jar, although I don't think they're 100% yet.

JesusFreke answered Oct 16 '22 06:10


There are already tools for reverse engineering .dex files that generate a human-readable representation of the bytecode. One of the most popular is baksmali, which you can find here: http://code.google.com/p/smali/.

A description of the bytecodes themselves can easily be found by Googling. Here was the third result: http://www.netmite.com/android/mydroid/dalvik/docs/dalvik-bytecode.html.
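To give a feel for how that document is read, here is a small Python sketch that decodes a handful of one-unit instructions by hand. Dalvik code is a sequence of 16-bit little-endian code units with the opcode in the low byte; the table below covers only five opcodes and the disassemble function is purely illustrative, so use baksmali for anything real.

```python
import struct

# opcode -> (mnemonic, instruction format); a deliberately tiny subset
OPCODES = {
    0x00: ("nop", "10x"),
    0x01: ("move", "12x"),
    0x0E: ("return-void", "10x"),
    0x0F: ("return", "11x"),
    0x12: ("const/4", "11n"),
}


def disassemble(code: bytes) -> list:
    """Decode a run of single-unit instructions into text lines."""
    lines = []
    for (unit,) in struct.iter_unpack("<H", code):
        opcode, high = unit & 0xFF, unit >> 8
        if opcode not in OPCODES:
            raise ValueError(f"opcode 0x{opcode:02x} is not in this sketch")
        name, fmt = OPCODES[opcode]
        if fmt == "10x":    # no operands (payload pseudo-ops are ignored here)
            lines.append(name)
        elif fmt == "11x":  # AA|op: one 8-bit register
            lines.append(f"{name} v{high}")
        elif fmt == "12x":  # B|A|op: two 4-bit registers
            lines.append(f"{name} v{high & 0xF}, v{high >> 4}")
        elif fmt == "11n":  # B|A|op: 4-bit register, signed 4-bit literal
            literal = high >> 4
            if literal & 0x8:
                literal -= 16  # sign-extend the nibble
            lines.append(f"{name} v{high & 0xF}, #{literal}")
    return lines
```

Feeding it bytes.fromhex("12f001010f01") yields const/4 v0, #-1, then move v1, v0, then return v1. Most real instructions span more than one code unit, so a full decoder has to look up each opcode's format to know how many units to consume.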

If you are reverse engineering layouts, you'll need a binary-XML-to-XML converter as well. There's another Stack Overflow question that mentions a few tools for doing that: Parse versionCode from android apk files
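Before reaching for a converter, it can help to see which files in an APK are actually in the compiled form. An APK is a zip archive, and as far as I know a compiled XML file begins with a chunk header whose type is 0x0003 and whose header size is 8. The Python sketch below relies on that assumption; the function names are invented for this example and it only detects the format, it does not decode it.

```python
import struct
import zipfile

RES_XML_TYPE = 0x0003  # chunk type assumed for a compiled XML document


def is_binary_xml(data: bytes) -> bool:
    """Return True if data starts with a compiled-XML chunk header."""
    if len(data) < 8:
        return False
    # Chunk header: uint16 type, uint16 header size, uint32 chunk size.
    chunk_type, header_size, _chunk_size = struct.unpack_from("<HHI", data, 0)
    return chunk_type == RES_XML_TYPE and header_size == 8


def list_binary_xml(apk) -> list:
    """List the .xml entries of an APK (path or file object) that look compiled."""
    with zipfile.ZipFile(apk) as archive:
        return [
            name for name in archive.namelist()
            if name.endswith(".xml") and is_binary_xml(archive.read(name))
        ]
```

Calling list_binary_xml("app.apk") on a typical APK should list AndroidManifest.xml and the files under res/layout, which are the ones a converter needs to handle.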

Jason LeBrun answered Oct 16 '22 06:10