What are the tradeoffs of performing static analysis on source code, byte code, machine code, etc?

Tags: java, bytecode, static-code-analysis

What are the various tradeoffs for performing static analysis on various levels of code? For instance for Java, why would someone perform static analysis on Java source code vs. Jasmin code vs. Java bytecode? Does the choice restrict or expand the various types of analyses able to be done? Does the choice influence the correctness of the analyses? Thanks.

516

asked Oct 26 '11 10:10

ChaimKut

2 Answers

What are the various tradeoffs for performing static analysis on various levels of code? For instance for Java, why would someone perform static analysis on Java source code vs. Java bytecode?

From a user perspective, I'd say that unless you have very specific, easy-to-formalize properties to analyze (such as pure safety properties), you should go with a tool that supports Java source code.

From a tool-developer perspective, it may be easier to work with one level or another. Here are the differences that come to mind. (Note that with a compiler and/or a decent decompiler, a tool can, for instance, operate on one layer and present the results on another.)

Pros for Java source code:

Structured language: loops and other structured constructs instead of arbitrary jumps. (This makes it a lot easier to create a weakest-precondition calculus, for instance.)
You can make more assumptions about the code. Bytecode is more expressive than Java source, so a bytecode program may do things that no Java source compiles to.
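To make the point about structured loops concrete, here is a small sketch. The class `LoopShape` is a made-up example, and the listing in the comment is what `javap -c` typically prints for it; the exact output may vary between compilers and versions.

```java
// Hypothetical example class, used only to compare the two representations.
class LoopShape {
    // At source level the loop is one structured construct with a single
    // entry, a single exit test, and an obvious place to attach an invariant.
    static int sum(int n) {
        int s = 0;
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // Typical `javap -c` listing for sum (may differ between compilers):
    //    0: iconst_0
    //    1: istore_1
    //    2: iconst_0
    //    3: istore_2
    //    4: iload_2
    //    5: iload_0
    //    6: if_icmpge 19   // conditional jump out of the loop
    //    9: iload_1
    //   10: iload_2
    //   11: iadd
    //   12: istore_1
    //   13: iinc 2, 1
    //   16: goto 4         // backward jump; the "loop" is only implied
    //   19: iload_1
    //   20: ireturn

    public static void main(String[] args) {
        System.out.println(sum(4)); // 0 + 1 + 2 + 3
    }
}
```

A bytecode-level tool has to rebuild the loop from the `if_icmpge`/`goto` pair (for example via a control-flow graph) before it can reason about it as a loop, whereas a source-level tool gets that structure from the syntax tree.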

Pros for Bytecode:

The language specification (the semantics of the bytecode instructions) is a lot simpler.
A more "pinned down" specification of the machine (the VM).
You can extend the analysis to legacy code and libraries.
The analysis also covers other languages targeting the JVM (Clojure, Scala, JRuby...).
No need for a possibly complex parser.
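As a rough illustration of the "no complex parser" and "works without source" points, the sketch below reads the fixed header of a class file: the magic number `0xCAFEBABE` followed by the minor and major version. The class name `ClassFileHeader` and the method `readVersion` are made up for this example; a real analysis would normally use a bytecode library rather than hand-rolled parsing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of any library.
class ClassFileHeader {
    /** Returns {minor, major} read from the first 8 bytes of a class file. */
    static int[] readVersion(byte[] classFile) {
        if (classFile.length < 8) {
            throw new IllegalArgumentException("too short to be a class file");
        }
        // Class files are big-endian, which is ByteBuffer's default order.
        ByteBuffer buf = ByteBuffer.wrap(classFile);
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;
        int major = buf.getShort() & 0xFFFF;
        return new int[] { minor, major };
    }

    public static void main(String[] args) throws IOException {
        // Inspect this class's own compiled form; no source file is consulted.
        // readAllBytes requires Java 9 or later.
        try (InputStream in =
                 ClassFileHeader.class.getResourceAsStream("ClassFileHeader.class")) {
            if (in == null) {
                System.out.println("class file not available as a resource");
                return;
            }
            int[] version = readVersion(in.readAllBytes());
            System.out.println("major=" + version[1] + " minor=" + version[0]);
        }
    }
}
```

The same code works on any `.class` file, including ones from a library jar or ones produced by a non-Java compiler, which is the practical upside of analysing at this level.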

Pros for machine code:

You verify what you actually feed the CPU with. (No need to use a verified compiler or verified VM if you want a fully verified chain.)

State-of-the-art tools such as Spec# (a formal-methods dialect of C#) usually go through an intermediate language specifically designed for formal analysis. In the Spec# case this is BoogiePL, which is neither MSIL nor C#.

Does the choice restrict or expand the various types of analyses able to be done?

In the end... no, not really. You face the same fundamental problems regardless of which (Turing complete) language you choose to analyze. Depending on what properties you analyze, YMMV though.

If you're into formal methods and thinking about implementing an analysis yourself, I suspect you'll find better tool support for bytecode. If you're a user or developer and want to analyze your own code base, I suspect you'll benefit more from tools that operate at the Java source level.

Does the choice influence the correctness of the analyses?

Depends on what you mean by correctness. A static analysis is most often "defensive" in the sense that you don't assume anything that you don't know is true. If you restrict your attention to sound verification systems, all of them will be "equally correct".
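One way to picture the "defensive" behaviour is a tiny sign analysis, sketched below. Everything here (`SignDomain`, `Sign`, `add`, `join`) is invented for illustration, and the sketch assumes mathematical integers: it ignores `int` overflow, which a sound analysis of real Java arithmetic would have to account for.

```java
// Hypothetical abstract domain for illustration only.
class SignDomain {
    enum Sign { NEG, ZERO, POS, UNKNOWN }

    // Abstraction of a concrete constant.
    static Sign of(int c) {
        return c < 0 ? Sign.NEG : c == 0 ? Sign.ZERO : Sign.POS;
    }

    // Abstract addition over mathematical integers (overflow is ignored,
    // which is an assumption of this sketch).
    static Sign add(Sign a, Sign b) {
        if (a == Sign.UNKNOWN || b == Sign.UNKNOWN) return Sign.UNKNOWN;
        if (a == Sign.ZERO) return b;
        if (b == Sign.ZERO) return a;
        // POS + NEG may be negative, zero, or positive: claim nothing.
        return a == b ? a : Sign.UNKNOWN;
    }

    // Merge of two control-flow paths: keep only what holds on both.
    static Sign join(Sign a, Sign b) {
        return a == b ? a : Sign.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(add(of(3), of(5)));   // POS
        System.out.println(add(of(3), of(-5)));  // UNKNOWN
        System.out.println(join(of(7), of(0)));  // UNKNOWN
    }
}
```

Whenever the analysis cannot be sure, it falls back to `UNKNOWN` instead of guessing. The same domain could be run over source-level expressions or over bytecode instructions such as `iadd`; the answers would be equally trustworthy, only the front end differs.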

117

answered Oct 22 '22 05:10

aioobe

IntelliJ has static analysis for things that are not available in the byte code, such as comments (e.g. Javadoc) and parameter names. This lets it catch spelling mistakes and name inconsistencies. Analysing source code also ensures you have the line number and the position within the line for any issue.

The benefit of analysing byte code is that it's much simpler and may be all you need. You might have line numbers, but you won't have the position within the line. And you can analyse compiled code which you don't have the source for, e.g. libraries.

answered Oct 22 '22 03:10

Peter Lawrey
