Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating classes automatically from unit tests?

I am looking for a tool that can take a unit test, like

IPerson p = new Person();
p.Name = "Sklivvz";
Assert.AreEqual("Sklivvz", p.Name);

and generate, automatically, the corresponding stub class and interface

interface IPerson         // inferred from IPerson p = new Person();
{
    string Name 
    { 
        get;              // inferred from Assert.AreEqual("Sklivvz", p.Name);
        set;              // inferred from p.Name = "Sklivvz";
    }
}

class Person: IPerson     // inferred from IPerson p = new Person();
{
    private string name;  // inferred from p.Name = "Sklivvz";

    public string Name    // inferred from p.Name = "Sklivvz";
    {
        get
        {
            return name;  // inferred from Assert.AreEqual("Sklivvz", p.Name);
        }
        set
        {
            name = value; // inferred from p.Name = "Sklivvz";
        }
    }

    public Person()       // inferred from IPerson p = new Person();
    {
    }
}

I know ReSharper and Visual Studio do some of these, but I need a complete tool -- command line or whatnot -- that automatically infers what needs to be done. If there is no such tool, how would you write it (e.g. extending ReSharper, from scratch, using which libraries)?

like image 735
Sklivvz Avatar asked Sep 18 '08 10:09

Sklivvz


2 Answers

What you appear to need is a parser for your language (Java), and a name and type resolver. ("Symbol table builder").

After parsing the source text, a compiler usually has a name resolver, that tries to record the definition of names and their corresponding types, and a type checker, that verifies that each expression has a valid type.

Normally the name/type resolver complains when it can't find a definition. What you want it to do is to find the "undefined" thing that is causing the problem, and infer a type for it.

For

 IPerson p = new Person();

the name resolver knows that "Person" and "IPerson" aren't defined. If it were

 Foo  p =  new Bar();

there would be no clue that you wanted an interface, just that Foo is some kind of abstract parent of Bar (e.g., a class or an interface). So the decision as which is it must be known to the tool ("whenever you find such a construct, assume Foo is an interface ..."). You could use a heuristic: IFoo and Foo means IFoo should be an interface, and somewhere somebody has to define Foo as a class realizing that interface. Once the tool has made this decision, it would need to update its symbol tables so that it can move on to other statements:

For

 p.Name = "Sklivvz";

given that p must be an Interface (by the previous inference), then Name must be a field member, and it appears its type is String from the assignment.

With that, the statement:

 Assert.AreEqual("Sklivvz", p.Name);

names and types resolve without further issue.

The content of the IFoo and Foo entities is sort of up to you; you didn't have to use get and set but that's personal taste.

This won't work so well when you have multiple entities in the same statement:

 x = p.a + p.b ;

We know a and b are likely fields, but you can't guess what numeric type if indeed they are numeric, or if they are strings (this is legal for strings in Java, dunno about C#). For C++ you don't even know what "+" means; it might be an operator on the Bar class. So what you have to do is collect constraints, e.g., "a is some indefinite number or string", etc. and as the tool collects evidence, it narrows the set of possible constraints. (This works like those word problems: "Joe has seven sons. Jeff is taller than Sam. Harry can't hide behind Sam. ... who is Jeff's twin?" where you have to collect the evidence and remove the impossibilities). You also have to worry about the case where you end up with a contradiction.

You could rule out p.a+p.b case, but then you can't write your unit tests with impunity. There are standard constraint solvers out there if you want impunity. (What a concept).

OK, we have the ideas, now, can this be done in a practical way?

The first part of this requires a parser and a bendable name and type resolver. You need a constraint solver or at least a "defined value flows to undefined value" operation (trivial constraint solver).

Our DMS Software Reengineering Toolkit with its Java Front End could probably do this. DMS is a tool builder's tool, for people that want to build tools that process computer langauges in arbitrary ways. (Think of "computing with program fragments rather than numbers").

DMS provides general purpose parsing machinery, and can build an tree for whatever front end it is given (e.g., Java, and there's a C# front end). The reason I chose Java is that our Java front end has all that name and type resolution machinery, and it is provided in source form so it can be bent. If you stuck to the trivial constraint solver, you could probably bend the Java name resolver to figure out the types. DMS will let you assemble trees that correspond to code fragments, and coalesce them into larger ones; as your tool collected facts for the symbol table, it could build the primitive trees.

Somewhere, you have to decide you are done. How many unit tests the tool have to see before it knows the entire interface? (I guess it eats all the ones you provide?). Once complete, it assembles the fragments for the various members and build an AST for an interface; DMS can use its prettyprinter to convert that AST back into source code like you've shown.

I suggest Java here because our Java front end has name and type resolution. Our C# front end does not. This is a "mere" matter of ambition; somebody has to write one, but that's quite a lot of work (at least it was for Java and I can't imagine C# is really different).

But the idea works fine in principle using DMS.

You could do this with some other infrastructure that gave you access to a parser and an a bendable name and type resolver. That might not be so easy to get for C#; I suspect MS may give you a parser, and access to name and type resolution, but not any way to change that. Maybe Mono is the answer?

You still need a was to generate code fragments and assemble them. You might try to do this by string hacking; my (long) experience with gluing program bits together is that if you do it with strings you eventually make a mess of it. You really want pieces that represent code fragments of known type, that can only be combined in ways the grammar allows; DMS does that thus no mess.

like image 83
Ira Baxter Avatar answered Sep 22 '22 23:09

Ira Baxter


Its amazing how no one really gave anything towards what you were asking.

I dont know the answer, but I will give my thoughts on it.

If I were to attempt to write something like this myself I would probably see about a resharper plugin. The reason I say that is because as you stated, resharper can do it, but in individual steps. So I would write something that went line by line and applied the appropriate resharper creation methods chained together.

Now by no means do I even know how to do this, as I have never built anything for resharper, but that is what I would try to do. It makes logical sense that it could be done.

And if you do write up some code, PLEASE post it, as I could find that usefull as well, being able to generate the entire skeleton in one step. Very useful.

like image 45
mattlant Avatar answered Sep 22 '22 23:09

mattlant