Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Suggestions on syntax to express mathematical formula concisely

If you're going to be writing this for the ab-initio world (which I'm guessing from your MP2 equation) you want to make it very easy and clear to express things as close to the mathematical definition that you can.

For one, I wouldn't have the complicated range function. Have it define a loop, but if you want nested loops, specify them both:

So instead of

(range(i) < j < N)[T(i,j) = (T(i,j) - T(j,i))/e(i+j)];

use

loop(j,0,N)[loop(i,0,j)[T(i,j) = (T(i,j) - T(j,i))/e(i+j)]]

And for things like sum and product, make the syntax "inherit" from the fact that it's a loop.

So instead of

sum(range(i) < j < N))[(T(i,j) - T(j,i))/e(i+j)];

use

sum(j,0,n)[loop(i,0,j)[(T(i,j) - T(j,i))/e(i+j)]]

or if you need a double sum

sum(j,0,n)[sum(i,0,j)[(T(i,j) - T(j,i))/e(i+j)]]

Since it looks like you're trying to represent quantum mechanical operators, then try to make your language constructs match the operator on a 1-1 basis as closely as possible. That way it's easy to translate (and clear about what's being translated).

EDITED TO ADD

since you're doing quantum chemistry, then it's fairly easy (at least as syntax goes). You define operators that always work on what's to the right of them and then the only other thing you need are parenthesis to group where an operator stops.

Einstein notation is fun where you don't specify the indices or bounds and they're implied because of convention, however that doesn't make clear code and it's harder to think about.

For sums, even if the bounds implied, they're always easy to figure out based on the context, so you should always make people specify them.

sum(i,0,n)sum(j,0,i)sum(a,-j,j)sum(b,-i,i)....

Since each operator works to the right, its variables are known, so j can know about i, a can know about i and j and b can know about i,j, and a.

From my experience with quantum chemists (I am one too!) they don't like complicated syntax that differs much from what they write. They are happy to separate double and triple sums and integrals into a collection of singles because those are just shorthand anyway.

Symmetry isn't going to be that hard either. It's just a collection of swaps and adds or multiplies. I'd do something where you specify the operation which contains a list of the elements that are the same and can be swapped:

c2v(sigma_x,a,b)a+b

This says that a and b are can be considered identical particles under a c2v operation. That means that any equation with a and b (such as the a+b after it) should be transformed into a linear combination of the c2v transformations. the sigma_x is the operation in c2v that you want applied to your function, (a+b). If I remember correctly, that's 1/sqrt(2)((a+b)+(b+a)). But I don't have my symmetry book here, so that could be wrong.


I would prefer a more solid separation between loops. For example, I'd prefer this alternative notation to your second example:

sum(range(j) < N)[sum(range(i) < j)[(T(i,j) - T(j,i))/e(i+j)]]

I also find the range syntax difficult. I think a range should be a single component. I'd prefer something like so:

j = range_iter(0,N)

That would open for range x(0,N); j = range.begin(); or alternatives I can't think up right now.

You could even:

j = range_iter(inc(0) => exc(N)); for j iterates over [0,N).

Anyway, interesting idea. Can your resulting functions be composed? Can you request domain information?


If you're looking for simplicity you should take the implicitness of the loops even further. E.g., something like this

T( i < j , j < N ) = ( T(i,j) - T(j,i) )/e(i+j)

will work if you rewrite the assignment operator = to behave normally for something like a(i) = b(i) + c(i) but behave like a summation for a(i<5) = b(i) + c(i). Assume summation starts from 0 unless the lower limit is specified, e.g. a(3<i<5), check for symbolic upper/lower limits that appear as summation indices and make double sums as necessary. If you want the syntax to force explicitness you could define a separate sum operator s=

T( i < j , j < N ) s= ( T(i,j) - T(j,i) )/e(i+j)

I don't think you can get any cleaner than this and still have some general purpose usability. As for your short term goal, using the notion of specifying the summation index at the same time that the index first appears you could write.

E_MP2 s= EV( i < n1 , j < n2 , a < n3 , b < n4 ) *
         2 ( EV(a,b,i,j) - EV(a,b,j,i) ) / ( e(i)+e(j)-e(a)-e(b) )

where you explicitly state that this is a sum (using s=) making that operator then take the summation indices and limiting values from the first instance an index appears. Specifically you could also use a syntax like (assuming now a,b fixed and i,j as per your example)

E_MP2 s=(i<j,j<N) EV(i,j,a,b) *
                  2 ( EV(a,b,i,j) - EV(a,b,j,i) ) / ( e(i)+e(j)-e(a)-e(b) )

which is quite clear notationally.

You could then go on and take this concept even further by, e.g., defining an integration operator i= that does the same thing. I.e. it looks for instances of variables that are marked down with limits and then proceeds to integrate the expression with respect to those variables

F i=(0<x<Pi) x^-1 * exp(-I x^2)

similarly to the summation you could specify the limit when x first occurs

F i= x[0,Pi]^-1 * exp(-I x^2)

where the square brackets serve to differentiate the notation from summation, so that you don't have to use i= or s= and can use both summation and integration at the same time:

F(i) = G(i,j<10) * x[0,inf]^-1 * H(i,j,x)

I'm not familiar with Phoenix and can only make assumptions about it's capabilities.

I've always liked the way Haskell allows me to express ranges as a list comprehension. It translates nicely from the actual mathematical notation into the programming language construct.

[ i + j | i <- [1..10], j <- [1..10] ]

would end up as something like:

[ i + j | i = range(1,10), j = range(1,10) ]

Unfortunately I really don't know if something like this is even possible with Phoenix.