Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are the gaps between symbolic execution and taint analysis?

I recently read a paper titling "All You Ever Wanted to Know about Dynamic Taint Analysis and Forward Symbolic Execution (but Might Have Been Afraid to Ask)" by Dr. EJ Schwartz. In the paper, he mainly talked about their applications in binary level security context.

I'm curious about the exact differences between dynamic taint analysis and forward symbolic execution.

From what I can see, taint analysis tracks information flows from object x(source) to object y(sink), whenever information stored in x is transferred to object y. So the major concern is what object can be transitively affected by the source. While symbolic execution treats some inputs as symbolic values and tries to express other variables with symbolic ones; thereby it answers on what conditions the symbolic input affects the succeeding programs.

I can see that at the binary level, taint analysis is often mentioned with the vulnerability caused by return address overwritten; while symbolic execution can deal with more types of vulnerable issues such as integer overflow, runtime assertion errors, resource leak (e.g., memory leak, file open/close), buffer overflow.

However it seems that modern taint analysis does not only involves data flow analysis, most of them will track the control flow conditions; and in several vulnerability detection scenarios the tainted input is also represented as the symbolic value and is propagated like the way symbolic execution does. On the other side, symbolic execution engines cannot fully uses symbolic values separated by different path conditions due to the limitations of the underlying constraint solvers and the execution/interpretation runtime; thereby they cannot achieve the high branch or path coverage as expected.

So in general cases, can we say that taint analysis is a kind of coarse symbolic execution, or symbolic execution is a kind of precise taint analysis?

like image 364
Hongxu Chen Avatar asked Mar 02 '15 14:03

Hongxu Chen


People also ask

What is taint analysis used for?

Taint analysis is a process used in information security to identify the flow of user input through a system to understand the security implications of the system design.

Is Symbolic execution dynamic analysis?

Symbolic execution is an interesting technique that falls somewhere in between static and dynamic analysis and is generally applied as a fully automatic approach.

What is static taint analysis?

A static taint analysis uses a parser to traverse abstract syntax trees of the source code. A web PHP parser has 140 grammar combinations in an abstract syntax tree, which has to be traversed to recognize the tainted flow pattern.


1 Answers

Interesting question! Here's my 2 cents: symbolic execution uses a kind of taint analysis to construct path constraints. Symbolic execution also employs an SMT/SAT solver to generate concrete values for variables and/or inputs, such that a certain path constraint is satisfied.

Since taint analysis does not employ an SMT/SAT solver, I would say it is not a kind of symbolic execution. Maybe one could say taint analysis is a part of symbolic execution.

This is just an opinion. Please feel free to challenge it.

like image 196
Benny Avatar answered Oct 08 '22 15:10

Benny