Expression trees represent code in a tree-like data structure, where each node is an expression, for example, a method call or a binary operation such as x < y. You can compile and run code represented by expression trees.
A binary expression tree is a binary tree whose leaves are operands, such as constants or variable names, and whose other nodes contain operators. For example, the postfix notation a b + c d e + * * corresponds to the expression tree for (a + b) * (c * (d + e)), with the outer * at the root.
Construction of an expression tree: Read the postfix expression one symbol at a time and check whether each symbol is an operand or an operator. If the symbol is an operand, create a one-node tree and push a pointer to it onto a stack. If the symbol is an operator, pop two trees from the stack, make them the children of a new operator node (the first tree popped becomes the right child), and push the new tree back onto the stack. When the input is exhausted, the single tree left on the stack is the expression tree.
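The stack-based construction described above can be sketched in C# roughly as follows. This is only an illustration under some assumptions: FromPostfix is a hypothetical helper, tokens are separated by spaces, every operand is treated as an int variable, and only + and * are supported. It reuses the node types from System.Linq.Expressions rather than defining its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical helper: builds an expression tree from a space-separated
// postfix string. Operands become int parameters; only + and * are handled.
static Expression FromPostfix(string postfix)
{
    var stack = new Stack<Expression>();
    foreach (var token in postfix.Split(' ', StringSplitOptions.RemoveEmptyEntries))
    {
        if (token is "+" or "*")
        {
            // The first pop yields the right operand, the second the left.
            var right = stack.Pop();
            var left = stack.Pop();
            stack.Push(token == "+"
                ? Expression.Add(left, right)
                : Expression.Multiply(left, right));
        }
        else
        {
            // An operand becomes a one-node tree.
            stack.Push(Expression.Parameter(typeof(int), token));
        }
    }
    // The single remaining entry is the root of the finished tree.
    return stack.Pop();
}

var tree = FromPostfix("a b + c d e + * *");
Console.WriteLine(tree); // prints ((a + b) * (c * (d + e)))
```

A more careful version would validate the input, for example by checking that the stack holds exactly one tree at the end.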
The best explanation about expression trees I ever read is this article by Charlie Calvert.
To sum it up:
An expression tree represents what you want to do, not how you want to do it.
Consider the following very simple lambda expression:
Func<int, int, int> function = (a, b) => a + b;
This statement consists of three sections:
- A declaration:
Func<int, int, int> function
- An equals operator:
=
- A lambda expression:
(a, b) => a + b;
The variable function points at raw executable code that knows how to add two numbers.
This is the most important difference between delegates and expressions. You call function (a Func<int, int, int>) without ever knowing what it will do with the two integers you passed. It takes two and returns one; that's the most your code can know.
In the previous section, you saw how to declare a variable that points at raw executable code. Expression trees are not executable code; they are a form of data structure.
Now, unlike delegates, your code can know what an expression tree is meant to do.
LINQ provides a simple syntax for translating code into a data structure called an expression tree. The first step is to add a using directive to introduce the System.Linq.Expressions namespace:
using System.Linq.Expressions;
Now we can create an expression tree:
Expression<Func<int, int, int>> expression = (a, b) => a + b;
The identical lambda expression shown in the previous example is converted into an expression tree declared to be of type Expression<T>. The identifier expression is not executable code; it is a data structure called an expression tree.
That means you can't just invoke an expression tree like you could invoke a delegate, but you can analyze it. So what can your code understand by analyzing the variable expression?
// `expression.NodeType` returns ExpressionType.Lambda.
// `expression.Type` returns Func<int, int, int>.
// `expression.ReturnType` returns Int32.
var body = expression.Body;
// `body.NodeType` returns ExpressionType.Add.
// `body.Type` returns System.Int32.
var parameters = expression.Parameters;
// `parameters.Count` returns 2.
var firstParam = parameters[0];
// `firstParam.Name` returns "a".
// `firstParam.Type` returns System.Int32.
var secondParam = parameters[1];
// `secondParam.Name` returns "b".
// `secondParam.Type` returns System.Int32.
Here we see that there is a great deal of information we can get from an expression.
But why would we need that?
You have learned that an expression tree is a data structure that represents executable code. But so far we have not answered the central question of why one would want to make such a conversion. This is the question we asked at the beginning of this post, and it is now time to answer it.
A LINQ to SQL query is not executed inside your C# program. Instead, it is translated into SQL, sent across a wire, and executed on a database server. In other words, the following code is never actually executed inside your program:
var query = from c in db.Customers where c.City == "Nantes" select new { c.City, c.CompanyName };
It is first translated into the following SQL statement and then executed on a server:
SELECT [t0].[City], [t0].[CompanyName] FROM [dbo].[Customers] AS [t0] WHERE [t0].[City] = @p0
The code found in a query expression has to be translated into a SQL query that can be sent to another process as a string. In this case that process happens to be a SQL Server database. It is obviously going to be much easier to translate a data structure such as an expression tree into SQL than it is to translate raw IL or executable code into SQL. To exaggerate the difficulty of the problem somewhat, just imagine trying to translate a series of zeros and ones into SQL!
When it is time to translate your query expression into SQL, the expression tree representing your query is taken apart and analyzed, just as we took apart our simple lambda expression tree in the previous section. Granted, the algorithm for parsing the LINQ to SQL expression tree is much more sophisticated than the one we used, but the principle is the same. Once it has analyzed the parts of the expression tree, then LINQ mulls them over and decides the best way to write a SQL statement that will return the requested data.
Expression trees were created to ease the task of converting code such as a query expression into a string that can be passed to some other process and executed there. It is that simple. There is no great mystery here, no magic wand that needs to be waved. One simply takes code, converts it into data, and then analyzes the data to find the constituent parts that will be translated into a string that can be passed to another process.
Because the query comes to the compiler encapsulated in such an abstract data structure, the compiler is free to interpret it in almost any way it wants. It is not forced to execute the query in a particular order, or in a particular way. Instead, it can analyze the expression tree, discover what you want done, and then decide how to do it. At least in theory, it has the freedom to consider any number of factors, such as the current network traffic, the load on the database, the current result sets it has available, etc. In practice LINQ to SQL does not consider all these factors, but it is free in theory to do pretty much what it wants. Furthermore, you could pass this expression tree to custom code that you write by hand, which could analyze it and translate it into something very different from what is produced by LINQ to SQL.
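To make the idea of a hand-written translator concrete, here is a minimal sketch that walks a small predicate tree and emits a SQL-like condition. It is not how LINQ to SQL is implemented: ToSql, Operator and the Customer record are hypothetical names invented for this illustration, only three operators are handled, and literals are inlined where a real provider would emit parameters such as @p0.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical translator for a tiny subset of expression trees.
static string ToSql(Expression node) => node switch
{
    BinaryExpression b => $"({ToSql(b.Left)} {Operator(b.NodeType)} {ToSql(b.Right)})",
    // A property access such as c.City becomes a column name.
    MemberExpression m => "[" + m.Member.Name + "]",
    // String literals are quoted, with embedded quotes doubled.
    ConstantExpression { Value: string s } => "'" + s.Replace("'", "''") + "'",
    ConstantExpression c => $"{c.Value}",
    _ => throw new NotSupportedException(node.NodeType.ToString())
};

// Maps the node types supported by this sketch to SQL operators.
static string Operator(ExpressionType type) => type switch
{
    ExpressionType.Equal => "=",
    ExpressionType.GreaterThan => ">",
    ExpressionType.AndAlso => "AND",
    _ => throw new NotSupportedException(type.ToString())
};

// The lambda is never executed; only its tree is inspected.
Expression<Func<Customer, bool>> filter = c => c.City == "Nantes" && c.Orders > 10;
Console.WriteLine(ToSql(filter.Body));
// prints (([City] = 'Nantes') AND ([Orders] > 10))

// Hypothetical entity type used only for this example.
record Customer(string City, int Orders);
```

The same tree could just as well be handed to a different translator that targets another query language, which is the point the paragraph above makes.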
Once again, we see that the expression trees allow us to represent (express?) what we want to do. And we use translators that decide how our expressions are getting used.
An expression tree is a mechanism to translate executable code into data. Using an expression tree, you can produce a data structure that represents your program.
In C#, you can work with the expression tree produced by lambda expressions by using the Expression<T> class.
In a traditional program, you write code like this:
double hypotenuse = Math.Sqrt(a*a + b*b);
This code causes the compiler to generate an assignment, and that's it. In most cases, that's all you care about.
With conventional code, your application can't retroactively go back and look at hypotenuse to determine that it was produced by performing a Math.Sqrt() call; this information is simply not part of what is included.
Now, consider a lambda expression like the following:
Func<int, int, double> hypotenuse = (a, b) => Math.Sqrt(a*a + b*b);
This is a little different than before. Now hypotenuse is actually a reference to a block of executable code. If you call
hypotenuse(3, 4);
you will get the value 5 returned.
We can use expression trees to explore the block of executable code that was produced. Try this instead:
Expression<Func<int, int, int>> addTwoNumbersExpression = (x, y) => x + y;
BinaryExpression body = (BinaryExpression) addTwoNumbersExpression.Body;
Console.WriteLine(body);
This produces:
(x + y)
More advanced techniques and manipulations are possible with expression trees.
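As one example of such a technique, a tree can be assembled node by node with the factory methods of the Expression class and then compiled into a delegate. The sketch below rebuilds the hypotenuse lambda by hand; it uses double parameters instead of int to avoid conversion nodes, and the variable names are chosen only for this illustration.

```csharp
using System;
using System.Linq.Expressions;

// Build (a, b) => Math.Sqrt(a * a + b * b) one node at a time.
var a = Expression.Parameter(typeof(double), "a");
var b = Expression.Parameter(typeof(double), "b");
var sumOfSquares = Expression.Add(
    Expression.Multiply(a, a),
    Expression.Multiply(b, b));

// Look up Math.Sqrt(double) via reflection so it can be used in a call node.
var sqrt = typeof(Math).GetMethod(nameof(Math.Sqrt), new[] { typeof(double) })
           ?? throw new InvalidOperationException("Math.Sqrt not found");
var body = Expression.Call(sqrt, sumOfSquares);
var lambda = Expression.Lambda<Func<double, double, double>>(body, a, b);

// Compile() turns the data structure into an ordinary delegate.
Func<double, double, double> hypotenuse = lambda.Compile();
Console.WriteLine(hypotenuse(3, 4)); // prints 5
```

Until Compile() is called, lambda is only data, so it can still be inspected or rewritten in the same way as the compiler-generated trees shown earlier.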
Expression trees are an in-memory representation of an expression, e.g. an arithmetic or boolean expression. For example, consider the arithmetic expression
a + b*2
Since * has a higher operator precedence than +, the expression tree is built like this:
    [+]
   /   \
  a    [*]
      /   \
     b     2
Given this tree, the expression can be evaluated for any values of a and b. You can also transform it into other expression trees, for example to compute the derivative of the expression.
When you implement an expression tree, I would suggest creating a base class Expression. Derived from that, the class BinaryExpression would be used for all binary expressions, such as + and *. Then you could introduce a VariableReferenceExpression to reference variables (such as a and b), and another class ConstantExpression (for the 2 from the example).
The expression tree is in many cases built as the result of parsing an input (from the user directly, or from a file). For evaluating the expression tree, I would suggest using the Visitor pattern.
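A minimal sketch of that class layout, with a visitor that evaluates a + b*2, might look like this. The class names follow the suggestion above; IVisitor<T>, Evaluator and the Accept method are assumptions made for the example, and records are used only to keep it short. These types are unrelated to the ones in System.Linq.Expressions.

```csharp
using System;
using System.Collections.Generic;

// Build the tree for a + b*2 and evaluate it for a = 1, b = 3.
Expression tree = new BinaryExpression('+',
    new VariableReferenceExpression("a"),
    new BinaryExpression('*',
        new VariableReferenceExpression("b"),
        new ConstantExpression(2)));
var evaluator = new Evaluator(new Dictionary<string, double> { ["a"] = 1, ["b"] = 3 });
Console.WriteLine(tree.Accept(evaluator)); // prints 7

// One Visit overload per node type.
interface IVisitor<T>
{
    T Visit(BinaryExpression node);
    T Visit(VariableReferenceExpression node);
    T Visit(ConstantExpression node);
}

// Base class for all nodes; Accept dispatches to the matching Visit overload.
abstract record Expression
{
    public abstract T Accept<T>(IVisitor<T> visitor);
}

sealed record BinaryExpression(char Operator, Expression Left, Expression Right) : Expression
{
    public override T Accept<T>(IVisitor<T> visitor) => visitor.Visit(this);
}

sealed record VariableReferenceExpression(string Name) : Expression
{
    public override T Accept<T>(IVisitor<T> visitor) => visitor.Visit(this);
}

sealed record ConstantExpression(double Value) : Expression
{
    public override T Accept<T>(IVisitor<T> visitor) => visitor.Visit(this);
}

// Visitor that computes the numeric value of a tree for given variable values.
sealed class Evaluator : IVisitor<double>
{
    private readonly IReadOnlyDictionary<string, double> _variables;

    public Evaluator(IReadOnlyDictionary<string, double> variables) => _variables = variables;

    public double Visit(BinaryExpression node)
    {
        double left = node.Left.Accept(this);
        double right = node.Right.Accept(this);
        return node.Operator switch
        {
            '+' => left + right,
            '*' => left * right,
            _ => throw new NotSupportedException(node.Operator.ToString())
        };
    }

    public double Visit(VariableReferenceExpression node) => _variables[node.Name];

    public double Visit(ConstantExpression node) => node.Value;
}
```

Other operations, such as printing the tree or transforming it, can then be added as further visitors without touching the node classes.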
Short answer: It's nice to be able to write the same kind of LINQ query and point it at any data source. You couldn't have a "Language Integrated" query without it.
Long answer: As you probably know, when you compile source code, you're transforming it from one language to another, usually from a high-level language (C#) to a lower-level one (IL).
There are basically two ways you can do this:
The latter is what all the programs we know as 'compilers' do.
Once you have a parse tree you can easily translate it into any other language and this is what expression trees allow us to do. Since the code is stored as data you can do anything you want to it but probably you'll just want to translate it into some other language.
Now, in LINQ to SQL the expression trees get turned into a SQL command and then are sent over the wire to the database server. As far as I know they don't do anything really fancy when translating the code but they could. For instance, the query provider could create different SQL code depending on the network conditions.