Expression Trees-Lambdas to CodeDom Conversion

Introduction

Some people are working to make meta-programming possible. Some call it language-oriented programming or domain-specific languages, but I prefer the general term meta-programming. For years, programming languages have supported code generation through powerful libraries, or developers have simply worked with string concatenation and external linkers.

Nowadays meta-programming is becoming more and more important as more domain expertise is required. So languages are making meta-programming possible at the compiler level with compiler directives.

Indeed, a lot of the ideas come from the functional programming world, where everything is treated as an expression. Code becomes data, and data is used in the code. This should sound familiar from the LINQ to SQL effort to make this possible.

Libraries

The .NET Framework has had code generators since the beginning. CodeDom is probably the best known for tree-based code generation. CodeDom made it possible to develop the ASP.NET engine, the Windows Forms designer, the Web Forms designer, web service wrappers, LINQ entity objects and more. The framework uses it extensively for its key technologies.

Although there are other APIs in the .NET Framework, such as System.Reflection and System.Reflection.Emit, in this post we will focus on CodeDom and the newcomer, Expression Trees.

Expression Trees are the key API behind LINQ to SQL, and the IQueryable interface in general. Every query is expressed as a typed tree that the library later parses and converts to SQL.
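To make the "query as a typed tree" idea concrete, here is a minimal sketch of inspecting a tree that the compiler builds from lambda syntax. The class and method names (TreeInspection, Describe, Demo) are made up for illustration and are not part of the converter described in this post.

```csharp
using System;
using System.Linq.Expressions;

public static class TreeInspection
{
    // Summarises the root node of a lambda body, e.g. "LessThan(Parameter, Constant)".
    public static string Describe(LambdaExpression lambda)
    {
        if (lambda.Body is BinaryExpression binary)
        {
            return $"{binary.NodeType}({binary.Left.NodeType}, {binary.Right.NodeType})";
        }
        return lambda.Body.NodeType.ToString();
    }

    // Builds a tree from lambda syntax and summarises it.
    public static string Demo()
    {
        // Because the target type is Expression<...>, the compiler emits a tree, not a delegate.
        Expression<Func<int, bool>> isSmall = n => n < 4;
        return Describe(isSmall);
    }
}
```

For the lambda in Demo, the summary should read LessThan(Parameter, Constant): the comparison is the root node, with the parameter n and the constant 4 as its children.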

The query comprehension syntax makes queries very readable. However, sometimes I want to know more about the generated tree, such as which functions are actually involved in the query. I have used the Expression Tree Debugger Visualizer to draw the tree. It is a pretty handy tool, but for big trees it is difficult to see what is going on. This was actually my main motivation: although we have the code, we don’t see the magic that goes on behind query comprehension.

Implementation

So the idea is to have the code regenerated from the tree. In the real world this would involve a parser, an interpreter and some more compiler theory, which requires a lot of research. Because this is just for fun, and since we have a powerful CodeDom library to generate code, I tried to convert the expression tree to a CodeDom tree and then used CodeDom to generate code in any language. Finally, I wrote extension methods so that debuggers and my own code can use it directly from the type.

The compiler automatically generates the expression tree if we use the proper syntax, so we have the tree from the beginning. To convert it to CodeDom objects, we need to traverse the tree and generate the necessary CodeDom objects. So I wrote a tree walker that, while descending to the last children, generates a CodeDom object for each node and attaches it to its parent. I didn’t realise how far it would go, but that was it. Once the tree walker was finished, with a few more lines of code the converter was just working.
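The tree walker idea can be sketched in a few lines. This is not the actual converter, only a minimal illustration that assumes the tree contains nothing but binary operators, constants and parameters; the names MiniConverter, ToCodeDom and Render are hypothetical. On .NET Core and later it also assumes the System.CodeDom package is referenced.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using System.Linq.Expressions;

public static class MiniConverter
{
    // Recursively maps an expression tree node to a CodeDom expression.
    // Each child is converted first and then attached to the parent's CodeDom object.
    public static CodeExpression ToCodeDom(Expression node)
    {
        switch (node)
        {
            case LambdaExpression lambda:
                return ToCodeDom(lambda.Body);
            case BinaryExpression binary:
                return new CodeBinaryOperatorExpression(
                    ToCodeDom(binary.Left), MapOperator(binary.NodeType), ToCodeDom(binary.Right));
            case ConstantExpression constant:
                return new CodePrimitiveExpression(constant.Value);
            case ParameterExpression parameter:
                return new CodeArgumentReferenceExpression(parameter.Name);
            default:
                // Captured variables, method calls, member access etc. are out of scope here.
                throw new NotSupportedException("Node not covered by this sketch: " + node.NodeType);
        }
    }

    private static CodeBinaryOperatorType MapOperator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.LessThan: return CodeBinaryOperatorType.LessThan;
            case ExpressionType.GreaterThan: return CodeBinaryOperatorType.GreaterThan;
            case ExpressionType.Equal: return CodeBinaryOperatorType.ValueEquality;
            case ExpressionType.AndAlso: return CodeBinaryOperatorType.BooleanAnd;
            case ExpressionType.OrElse: return CodeBinaryOperatorType.BooleanOr;
            case ExpressionType.Add: return CodeBinaryOperatorType.Add;
            default: throw new NotSupportedException("Operator not covered by this sketch: " + type);
        }
    }

    // Renders the converted tree as source text; language is e.g. "cs" or "vb".
    public static string Render(Expression node, string language)
    {
        using (var provider = CodeDomProvider.CreateProvider(language))
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromExpression(ToCodeDom(node), writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }
}
```

Given a variable declared as Expression<Func<int, bool>> e = n => n < 4, calling MiniConverter.Render(e, "cs") should produce something like (n < 4). A real converter has to cover many more node types, in particular the member accesses the compiler generates for captured local variables.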

I would like to put the code here as well but unfortunately it is too long for a blog post, so here are some snippets. Feel free to provide suggestions or bug reports.


Example

The extension methods make it possible to see the source code of any IQueryable and any Expression. Each of them has a GenerateSourceCode method that returns a string.

Expression Tree to CodeDom Visualizer

GenerateSourceCode(); // default C#

GenerateSourceCode(string language); // either "cs" or "vb", or the fully qualified name of a CodeDomProvider (like Microsoft.FSharp.Compiler.CodeDom.FSharpCodeProvider). The provider’s assembly must be added as a reference to the project if you’re going to use it.
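One way such a language argument could be resolved is sketched below. This is an assumption about how the lookup might work, not the library's actual code; ProviderLookup and Resolve are hypothetical names. It tries the registered language names first and falls back to treating the string as a type name.

```csharp
using System;
using System.CodeDom.Compiler;

public static class ProviderLookup
{
    // Accepts a registered language name ("cs", "vb", ...) or the type name of a CodeDomProvider.
    public static CodeDomProvider Resolve(string language)
    {
        if (CodeDomProvider.IsDefinedLanguage(language))
        {
            return CodeDomProvider.CreateProvider(language);
        }

        // Type.GetType generally needs an assembly-qualified name for types that live
        // outside the calling assembly, and the assembly must be loadable (hence the reference).
        Type providerType = Type.GetType(language, throwOnError: true);
        return (CodeDomProvider)Activator.CreateInstance(providerType);
    }
}
```

With this sketch, ProviderLookup.Resolve("vb") returns the Visual Basic provider, while a third-party provider such as the F# one would be passed by its type name.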

A sample program that manipulates expression trees and uses the CodeDom converter with item.GetCodeDomSource("vb"):

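// Requires: using System; using System.Linq.Expressions; using System.Text;
// RecordName is assumed to be a class with a string LastName member.
int a = 3, b = 1, c = 2, d = 0;

// Trees built by hand through the factory methods
var e1 = Expression.Constant(5);
var e2 = Expression.And(e1, e1);

// Trees built by the compiler from lambda syntax
Expression<Func<string, Func<bool>>> e3 = tbool => () => a < b && 8 > d || c == d;
Expression<Func<bool>> e4 = () => b < 4;
Expression<Func<RecordName, bool>> e5 = rn => rn.LastName == "ALFKI";
Expression<Func<StringBuilder>> e6 = () => new StringBuilder { Capacity = 20 };
Expression<Func<string, string>> e7 = word => word == "hello" ? "yes" : "no";

// Print the regenerated source of each tree as Visual Basic
foreach (var item in new Expression[] { e1, e2, e3, e4, e5, e6, e7 })
{
    Console.WriteLine(item.GetCodeDomSource("vb"));
}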

Visual Basic Output

Namespace Runtime
 
    Public Class LambdaExpression
 
        Private LastName As String
 
        Private Sub New()
            MyBase.New
            Me.Func`2
        End Sub
 
        Private Function Func`2(ByVal rn As Demo.Program.RecordName) As Boolean
            Return (LastName Is "ALFKI")
        End Function
    End Class
End Namespace

C# Output

namespace Runtime {
    using System;
 
 
    public class LambdaExpression {
 
        private string LastName;
 
        private LambdaExpression() {
            this.Func`2;
        }
 
        private bool Func`2(Demo.Program.RecordName rn) {
            return (LastName == "ALFKI");
        }
    }
}

Conclusion

CodeDom is too C#-centric, so it is hard to make it work for every language. The difference between CodeDom statements (CodeStatement) and expressions (CodeExpression) sometimes makes it hard to convert from expression trees.
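The conditional operator is a good example of this gap. As far as I know, CodeDom has no conditional (ternary) expression node, only CodeConditionStatement, so a tree like word == "hello" ? "yes" : "no" has to become an if/else statement. The sketch below shows one possible mapping under that assumption; ConditionalSketch and its members are hypothetical names, and it only handles ==, parameters and constants.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using System.Linq.Expressions;

public static class ConditionalSketch
{
    // Converts only what this example needs: ==, parameters and constants.
    private static CodeExpression ToCodeDom(Expression node)
    {
        switch (node)
        {
            case BinaryExpression b when b.NodeType == ExpressionType.Equal:
                return new CodeBinaryOperatorExpression(
                    ToCodeDom(b.Left), CodeBinaryOperatorType.ValueEquality, ToCodeDom(b.Right));
            case ConstantExpression c:
                return new CodePrimitiveExpression(c.Value);
            case ParameterExpression p:
                return new CodeArgumentReferenceExpression(p.Name);
            default:
                throw new NotSupportedException("Node not covered by this sketch: " + node.NodeType);
        }
    }

    // A conditional *expression* becomes an if/else *statement* that returns from each branch.
    public static CodeStatement ToStatement(ConditionalExpression conditional)
    {
        return new CodeConditionStatement(
            ToCodeDom(conditional.Test),
            new CodeStatement[] { new CodeMethodReturnStatement(ToCodeDom(conditional.IfTrue)) },
            new CodeStatement[] { new CodeMethodReturnStatement(ToCodeDom(conditional.IfFalse)) });
    }

    // Renders the statement as source text; the lambda body must be a conditional expression.
    public static string Render(Expression<Func<string, string>> lambda, string language)
    {
        var conditional = (ConditionalExpression)lambda.Body;
        using (var provider = CodeDomProvider.CreateProvider(language))
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromStatement(ToStatement(conditional), writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }
}
```

Calling ConditionalSketch.Render(word => word == "hello" ? "yes" : "no", "cs") should yield an if/else block with one return per branch. This only works when the conditional is the whole body of the lambda; a conditional nested inside a larger expression has no such direct statement equivalent.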

On the other hand, expression trees are too LINQ-oriented. They are less powerful than CodeDom but easier to write. In expression trees everything is an expression, unlike in CodeDom. Some constructs, such as assignment, are missing from expression trees, but we will probably see improvements in the future. So it might not be a true DSL or language generator, but it is certainly enough to get the most out of databases.

There are other, more powerful meta-programming tools and libraries. The F# quotations library can express the full set of language features as quotations. The Dynamic Language Runtime is another expression-tree-like library, aimed more at compiler developers.

Finally, this library is not built for runtime code conversion from expression trees to CodeDom, although that is possible. The CodeDom-generated code is mainly for debugging, to print the source code of a query. It might also be helpful for seeing what is going on under the hood.
