Bytecode manipulation patterns

What legitimate uses are there for bytecode manipulation, and how do people implement solutions based on it in practice?

Update: I should have made it clearer that this question is really about the patterns and techniques people use to make their code fly with the help of bytecode manipulation.

Examples would be aspect-oriented programming, which was already mentioned, building proxy objects on the fly, and similar techniques.

ahe asked Apr 23 '10 10:04



2 Answers

Bytecode manipulation lets you implement arbitrarily complex (and interesting) program transformations, such as:

  • entry/exit logging code for selected functions
  • security transformations that stub out access to certain APIs
  • API substitution, e.g. for running code in a test harness

The scope is endless; this is just a small sampling.
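As a hedged illustration of the first bullet, here is a minimal sketch of entry/exit logging built on the JDK's `java.lang.reflect.Proxy`. Note the assumption: this does not rewrite existing bytecode. Instead the JDK generates a proxy class at run time, a closely related technique that only covers calls made through an interface. The `Greeter` interface and the `LoggingProxy` class are hypothetical names invented for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface, used only for illustration.
interface Greeter {
    String greet(String name);
}

// Wraps an implementation of an interface in a run-time generated proxy
// that records a line on entry to and exit from every method call.
final class LoggingProxy {
    private LoggingProxy() {}

    // `log` collects the trace lines so that callers can inspect them.
    static <T> T wrap(T target, Class<T> iface, List<String> log) {
        InvocationHandler handler = (instance, method, args) -> {
            log.add("enter " + method.getName());
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the target itself threw
            } finally {
                log.add("exit " + method.getName());
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Greeter real = name -> "Hello, " + name;
        Greeter logged = wrap(real, Greeter.class, log);
        System.out.println(logged.greet("world"));
        System.out.println(log);
    }
}
```

The same wrapper shape can be used for the API substitution case: the handler simply returns a canned value instead of delegating to `target`. Logging calls on concrete classes, rather than interfaces, is where real bytecode rewriting becomes necessary.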

As for how this is typically done, start here.

Marcelo Cantos answered Sep 30 '22 08:09


One can read bytecode to implement an interpreter or a JVM. One can write or generate bytecode when implementing a Java compiler, or a compiler for another language that targets the JVM (e.g. Scala and Jython).

You might manipulate bytecode to optimize it, whether you want to produce and market a bytecode optimizer or need one as an internal tool to give your company's code an edge over the competition. In a similar vein, you might manipulate bytecode to obfuscate it prior to distribution.

You might also manipulate bytecode for aspect-oriented programming, for example to insert hooks for timing, logging, or some other purpose. This makes sense when modifying the final bytecode is simpler or less expensive than editing all the source files. That can be the case when the source code is unavailable, or when it comes from many different sources that are not all under your control, so that convincing each team to add the hooks would be expensive and time-consuming. The alternatives to patching the bytecode would be upstreaming the change, maintaining a separate fork, or purchasing the source code from a third party that supplies only the bytecode.
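Reading bytecode, the first use mentioned above, starts with the fixed prefix of a class file. The following is a minimal sketch that parses only that prefix (magic number, version, constant pool count), using nothing beyond the JDK. `ClassFileHeader` is a hypothetical helper name, and a real interpreter or analysis tool would go on to parse the constant pool, fields, methods, and attributes.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: reads the fixed-size prefix of a class file.
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount; // one more than the number of pool slots

    private ClassFileHeader(int minor, int major, int cpCount) {
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = cpCount;
    }

    // Parses the header from raw class bytes; all values are big-endian.
    static ClassFileHeader parse(byte[] classBytes) {
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file: bad magic number");
            }
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            int cpCount = in.readUnsignedShort();
            return new ClassFileHeader(minor, major, cpCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // e.g. truncated input
        }
    }

    // Locates the class file of an already loaded class and parses its header.
    static ClassFileHeader of(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("class file not found: " + resource);
            }
            return parse(in.readAllBytes());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader h = of(String.class);
        System.out.println("major=" + h.majorVersion + " minor=" + h.minorVersion
                + " constant_pool_count=" + h.constantPoolCount);
    }
}
```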

You can manipulate bytecode yourself, but there are many existing open source libraries and frameworks for doing it, including BCEL and ASM.
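To give a feel for what doing it yourself involves, the sketch below writes a tiny class file byte by byte and loads it, using only the JDK. It is an illustration under stated assumptions, not how ASM or BCEL work internally: the names `HandRolledClass`, `Generated`, and `answer` are made up for the example, and the class file is deliberately emitted with major version 49 so that no `StackMapTable` attribute is needed. Libraries exist largely to hide this kind of bookkeeping.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical example: emits a class equivalent to
//   public class <className> { public static int answer() { return <value>; } }
// (without a constructor, which a static-only class does not need).
final class HandRolledClass {
    private HandRolledClass() {}

    static byte[] generate(String className, int value) {
        if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("value must fit the one-byte operand of bipush");
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE); // magic
            out.writeShort(0);        // minor version
            out.writeShort(49);       // major version 49: no StackMapTable required
            out.writeShort(8);        // constant_pool_count = 7 entries + 1
            out.writeByte(1); out.writeUTF(className.replace('.', '/')); // #1 Utf8
            out.writeByte(7); out.writeShort(1);                         // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");          // #3 Utf8
            out.writeByte(7); out.writeShort(3);                         // #4 Class -> #3
            out.writeByte(1); out.writeUTF("answer");                    // #5 Utf8
            out.writeByte(1); out.writeUTF("()I");                       // #6 Utf8
            out.writeByte(1); out.writeUTF("Code");                      // #7 Utf8
            out.writeShort(0x0021);   // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);        // this_class
            out.writeShort(4);        // super_class
            out.writeShort(0);        // interfaces_count
            out.writeShort(0);        // fields_count
            out.writeShort(1);        // methods_count
            out.writeShort(0x0009);   // method flags: ACC_PUBLIC | ACC_STATIC
            out.writeShort(5);        // method name: "answer"
            out.writeShort(6);        // method descriptor: "()I"
            out.writeShort(1);        // method attributes_count
            out.writeShort(7);        // attribute name: "Code"
            out.writeInt(15);         // attribute_length of everything below
            out.writeShort(1);        // max_stack
            out.writeShort(0);        // max_locals
            out.writeInt(3);          // code_length
            out.writeByte(0x10);      // bipush
            out.writeByte(value);     //   <value>
            out.writeByte(0xAC);      // ireturn
            out.writeShort(0);        // exception_table_length
            out.writeShort(0);        // attributes_count of Code
            out.writeShort(0);        // attributes_count of the class
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Minimal class loader that exposes defineClass for raw bytes.
    private static final class ByteLoader extends ClassLoader {
        ByteLoader(ClassLoader parent) { super(parent); }
        Class<?> define(String name, byte[] b) { return defineClass(name, b, 0, b.length); }
    }

    static Class<?> load(String className, byte[] classBytes) {
        return new ByteLoader(HandRolledClass.class.getClassLoader())
                .define(className, classBytes);
    }

    static int invokeAnswer(Class<?> generated) {
        try {
            return (Integer) generated.getMethod("answer").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> generated = load("Generated", generate("Generated", 42));
        System.out.println(invokeAnswer(generated));
    }
}
```

Even this trivial method requires tracking constant pool indices and attribute lengths by hand, which is the main practical argument for using a library instead.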

Michael Aaron Safyan answered Sep 30 '22 07:09