 

Programming Language Properties that facilitate refactoring?

What are common traits/properties of programming languages that facilitate (simplify) the development of widely automated source code analysis and re-engineering (transformation) tools?

I am mostly thinking in terms of programming language features that make it easier to develop static analysis and refactoring tools (e.g., compare Java with C++; the former has better support for refactoring).

In other words, if a programming language were explicitly designed from the beginning to support automated static analysis and refactoring, what characteristics should it have?

For example, for Ada, there's the ASIS:

The Ada Semantic Interface Specification (ASIS) is a layered, open architecture providing vendor-independent access to the Ada Library Environment. It allows for the static analysis of Ada programs and libraries. ASIS, the Ada Semantic Interface Specification, is a library that gives applications access to the complete syntactic and semantic structure of an Ada compilation unit. This library is typically used by tools that need to perform some sort of static analysis on an Ada program.

ASIS information: ASIS provides a standard way for tools to extract data that are best collected by an Ada compiler or other source code analyzer. Tools which use ASIS are themselves written in Ada, and can be very easily ported between Ada compilers which support ASIS. Using ASIS, developers can produce powerful code analysis tools with a high degree of portability. They can also save the considerable expense of implementing the algorithms that extract semantic information from the source program. For example, ASIS tools already exist that generate source-code metrics, check a program's conformance to coding styles or restrictions, make cross-references, and globally analyze programs for validation and verification.

See also the ASIS FAQ.

Can you think of other programming languages that provide a similarly comprehensive and complete interface to working with source code specifically for analysis/transformation purposes?

I am thinking about specific implementation techniques that provide the low-level hooks, for example core library functions that offer a way to inspect an AST or ASG at runtime.
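To make the kind of hook I mean concrete, here is a minimal sketch using Python's standard `ast` module, which exposes the parser's syntax tree to ordinary programs. Python is chosen only for illustration; the `summarize` helper and the `SAMPLE` source are hypothetical names made up for this example.

```python
import ast

# Hypothetical sample program to analyze.
SAMPLE = (
    "class Shape:\n"
    "    def area(self):\n"
    "        return 0\n"
    "\n"
    "def main():\n"
    "    return Shape().area()\n"
)

def summarize(source):
    """Return (node type, name) pairs for every class and function definition."""
    tree = ast.parse(source)  # build the AST from source text
    found = []
    for node in ast.walk(tree):  # visit every node in the tree
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            found.append((type(node).__name__, node.name))
    return found

definitions = summarize(SAMPLE)
```

This only covers syntax. A semantic interface such as ASIS goes further, for example by resolving names to their declarations.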

none asked Jun 10 '09 18:06



4 Answers

The biggest has to be static typing. It gives tools much more insight into what the code is doing. Without it, refactoring becomes many times more difficult.
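A rough sketch of why, assuming a toy tool that has to decide which `close` method a call refers to before renaming it. It uses Python's `ast` module with type annotations standing in for static types; `resolve_calls` and the `SOURCE` snippet are hypothetical and only handle simple parameter annotations.

```python
import ast

# Hypothetical program: two unrelated classes share a method name.
SOURCE = '''
class File:
    def close(self): ...

class Door:
    def close(self): ...

def shutdown(f: File, d):
    f.close()
    d.close()
'''

def resolve_calls(source):
    """Map each `name.method()` call to the declared class of `name`, or None."""
    results = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        # Parameter name -> annotated class name (None when unannotated).
        declared = {
            arg.arg: arg.annotation.id if isinstance(arg.annotation, ast.Name) else None
            for arg in func.args.args
        }
        for node in ast.walk(func):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Attribute)
                    and isinstance(node.func.value, ast.Name)):
                receiver = node.func.value.id
                results[f"{receiver}.{node.func.attr}"] = declared.get(receiver)
    return results

calls = resolve_calls(SOURCE)
```

The typed receiver `f` resolves to `File`, so a rename of `File.close` can safely update `f.close()`. The untyped `d` resolves to nothing, so the tool has to guess or ask.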

Bill K answered Nov 11 '22 00:11



I think this is still a largely unexplored problem. The notion of "language design for tooling" seems to only have entered the fringes of the mainstream recently, though I think research in this area is more than two decades old. I agree with two of the other answers, namely that "static typing" and "self-similarity" are useful properties of a language to make refactoring support easier.

Brian answered Nov 11 '22 00:11



It is true that the particular programming language can make analysis easier. If you want the easiest-to-analyze languages, pick a purely functional one.

But nobody in practice programs in purely functional languages. (The Haskell guys are going to jump up and down when they see this, but seriously, Haskell is used only extremely rarely.)

What makes a programming language analyzable is infrastructure designed to support analysis. Ada's ASIS, above, is a great example. Don't be distracted by the fact that ASIS was written for Ada, or is written in Ada; what counts is that somebody serious wanted to analyze Ada and invested the effort to build Ada analysis machinery.

I believe that the right cure is to build general analysis infrastructure and amortize it across lots of languages. While we're at it, we should build general transformation infrastructure, too, because once you have an analysis, you'll want to use it to effect change. (Doctor visits don't end with diagnosis; they end with cures.) And I've bet my career on it.

The result is an engine I think ideal for analysis, refactoring, reengineering, etc: our DMS Software Engineering Toolkit.

It has generic parsing, tree building, prettyprinting, tree manipulation, source-to-source rewriting, attribute grammar evaluation, and control and data flow analysis. It has production-quality front ends for a number of widely used dialects of C and C++, for Java, C#, COBOL, and PHP, and even for Verilog and VHDL (many other languages too, but not quite at that level).
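For readers who haven't seen a parse / manipulate / prettyprint pipeline, here is a tiny sketch of the idea using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). This is not DMS and says nothing about how DMS is implemented; `RenameFunction`, `rename_function`, and `SAMPLE` are hypothetical names for illustration.

```python
import ast

# Hypothetical input program.
SAMPLE = "def old_name(x):\n    return x + 1\n\nresult = old_name(2)\n"

class RenameFunction(ast.NodeTransformer):
    """Tree manipulation: rename a function and the references to it."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_FunctionDef(self, node):
        if node.name == self.old:
            node.name = self.new
        self.generic_visit(node)  # keep rewriting inside the body
        return node

    def visit_Name(self, node):
        # Naive: renames every identifier spelled `old`, ignoring scoping.
        if node.id == self.old:
            node.id = self.new
        return node

def rename_function(source, old, new):
    tree = ast.parse(source)                    # parsing and tree building
    tree = RenameFunction(old, new).visit(tree)  # tree manipulation
    return ast.unparse(tree)                    # prettyprinting back to source

rewritten = rename_function(SAMPLE, "old_name", "new_name")
```

The naive name match in `visit_Name` is exactly where a real tool needs analysis: without scope and type information, it cannot tell an unrelated variable called `old_name` from a reference to the function.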

To give you some sense of its utility, it was used to convert JOVIAL code for the B-2 bomber into C... without us ever having seen the source code. See http://www.semdesigns.com/Products/Services/NorthropGrummanB2.html

Now, assuming one has analysis infrastructure, what language features help?

Static types help by limiting the set of possible values a variable can take, but only by adding a limited single-argument predicate, e.g., "X is an integer". I think what helps more are assertions in the code, because they capture predicates with more than one argument. These establish relationships between state variables that often cannot be found by inspecting the code (problem- or domain-specific information such as "X > Y+3"). The analysis infrastructure (and frankly, the programmers who read the code) can ideally take advantage of such additional facts to provide a more effective analysis.

Such assertions are commonly coded with special keywords such as "assert", "pre(condition)" and "post(condition)", which are borrowed, with good reason, from the theorem-proving literature.

But even if you don't have assertions in your language, they are easy to encode anyway: just write an if statement whose condition is the denial of the assertion and whose body either calls an idiom indicating impossibility or violates the language semantics (e.g., dereferencing an obviously null pointer), such as "if (x>0) fail();", which encodes the assertion x <= 0.
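A minimal sketch of that encoding in Python, using the two-variable relationship "X > Y+3" as the asserted fact. The `fail` and `reserve` functions are hypothetical names for illustration.

```python
def fail():
    """Idiom marking a state the programmer believes is impossible."""
    raise AssertionError("impossible state reached")

def reserve(x, y):
    # Asserted fact: x > y + 3.
    # Encoded as an if statement on the denial of that fact.
    if not (x > y + 3):
        fail()
    # From here on, a reader or an analyzer may assume x > y + 3,
    # so the difference below is known to be greater than 3.
    return x - y

margin = reserve(10, 2)
```

An analyzer that recognizes `fail()` as never returning can treat everything after the `if` as code in which the relationship holds.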

So what's really needed isn't assertions in the language, but programmers who are willing to write them. Alas, that willingness seems to be lacking.

Ira Baxter answered Nov 11 '22 02:11



Reflection built into the language/type system. This makes static analysis and refactoring much less painful.

This is part of why Java and .NET tools are so commonplace and nice. Reflection lets tools determine the dependencies of source code quickly and reliably, which helps with static analysis.

In addition, you can analyze your compiled code as well.
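As a small illustration of the idea (in Python rather than Java or .NET, purely to keep it short), the standard `inspect` module and a function's compiled code object expose both declared signatures and the names the code depends on. `circle_area` and `describe` are hypothetical names for this sketch.

```python
import inspect
import math

def circle_area(radius: float) -> float:
    return math.pi * radius * radius

def describe(fn):
    """Collect signature and dependency facts about a function via reflection."""
    sig = inspect.signature(fn)
    return {
        # Declared parameter names and their annotations.
        "params": {name: p.annotation for name, p in sig.parameters.items()},
        "returns": sig.return_annotation,
        # Global and attribute names referenced by the compiled code.
        "names_used": fn.__code__.co_names,
    }

facts = describe(circle_area)
```

Note that `names_used` comes from the compiled code object, not from the source text, which is the "analysis of your compiled code" point above.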

Reed Copsey answered Nov 11 '22 02:11
