In numerical oriented languages (Matlab, Fortran) range operator and semantics is very handy when working with multidimensional data. For example:
A(i:j,k,:n) // represents two-dimensional slice B(i:j,0:n) of A at index k
unfortunately C++ does not have range operator (:). of course it can be emulated using range/slice functor, but semantics is less clean than Matlab. I am prototyping matrix/tensor domain language in C++ and am wondering if there any options to reproduce range operator. I still would like to rely on C++/prprocessor framework exclusively.
So far I have looked through boost wave which might be an suitable option.
is there any other means to introduce new non-native operators to C++ DSL?
I know you cannot add new operators.am specifically looking for workaround. One thing I came up (very ugly hack and I do not intend to use):
#define A(r) A[range(((1)?r), ((0)?r))] // assume A overloads []
A(i:j); // abuse ternary operator
A solution that I've used before is to write an external preprocessor that parses the source and replaces any uses of your custom operator with vanilla C++. For your purposes, a : b
uses would be replaced with something like a.operator_range_(b)
, and operator:()
declarations with declarations of range_ operator_range_()
. In your makefile you then add a rule that preprocesses source files before compiling them. This can be done with relative ease in Perl.
However, having worked with a similar solution in the past, I do not recommend it. It has the potential to create maintainability and portability issues if you do not remain vigilant of how source is processed and generated.
No -- you can't define your own operators in C++. Bjarne Stroustrup details why..
As Billy said, you cannot overload operators. However, you can come very close yo what you want with "regular" operator overloading (and maybe some template metaprogramming). It would be quite easy to allow for something like this:
#include <iostream>
class FakeNumber {
int n;
public:
FakeNumber(int nn) : n(nn) {}
operator int() const { return n; }
};
class Range {
int f, t;
public:
Range(const int& ff, const int& tt) : f(ff), t(tt) {};
int from() const { return f; }
int to() const { return t; }
};
Range operator-(const FakeNumber& a, const int b) {
return Range(a,b);
}
class Matrix {
public:
void operator()(const Range& a, const Range& b) {
std::cout << "(" << a.from() << ":" << a.to() << "," << b.from() << ":" << b.to() << ")" << std::endl;
}
};
int main() {
FakeNumber a=1,b=2,c=3,d=4;
Matrix m;
m(a-b,c-d);
return 0;
}
The downside is that This solution doesn't support all-literal expressions. Either from or to have to be user-defined classes, since we can't overload operator- for two primitive types.
You can also overload operator*
to allow specifying stepping, like so:
m(a-b*3,c-d); // equivalent to m[a:b:3,c:d]
And overload both versions of operator--
to allow ignoring one of the bounds:
m(a--,--d); // equivalent to m[a:,:d]
Another option is to define two objects, named something like Matrix::start and Matrix::end, or whatever you like, and then instead of using operator--
, you could use them, and then the other bound wouldn't have to be a variable and could be a literal:
m(start-15,38-end); // This clutters the syntax however
And you could of course use both ways.
I think it's pretty much the best you can get without resorting to bizarre solutions, such as custom prebuild tools or macro abuse (of the sort Matthieu presented and suggested against using them:)).
An alternative is to build a C++ variant dialect using a program transformation tool.
The DMS Software Reengineering Toolkit is a program transformation engine, with an industrial strength C++ Front End. DMS, using this front end, can parse full C++ (it even has a preprocessor and can retain most preprocessor directives unexpanded), automatically build ASTs and complete symbol tables.
The C++ front end comes in source, using a grammar derived directly from the standard. It is technically straightforward to add new grammar rules including those that would allow ":" syntax as array subscripts as you have described, and as Fortran90+ has implemented. One can then use the program transformation capability of DMS to transform the "new" syntax into "vanilla" C++ for use in conventional C++ compilers. (This scheme is a generalization of the Intentional Programming model of "add DSL concepts to your language").
We in fact did a concept demonstration of "Vector C++" using this approach.
We added a multidimensional Vector datatype, whose storage semantics are only that array elements are distinct. This is different than C++'s model of sequential locations, but you need this different semantic if you want the compiler/transformer to have freedom to lay out memory arbitrarily, and this is fundamental if you want to use SIMD machine instructions and/or efficient cache accesses along different axes.
We added Fortran-90 style scalar and subarray range accesses, added virtually all of F90's array-processing operations, added a good fraction of APL's matrix operations, all by adjusting the DMS C++ grammar.
Finally, we built two translators using DMS transformational capability: one mapping a significant part of this (remember, this was a concept demo) to vanilla C++ so you could compile and run Vector C++ applications on a typical workstation, and the other mapping C++ to a PowerPC C++ dialect with SIMD instruction extensions, and we generated SIMD code that was pretty reasonable we thought. Took us about 6 man-months to do all this.
The customer for this ultimately bailed out (his business model didn't include supporting a custom compiler in spite of his severe need for parallel/SIMD based operations), and it has been languishing on the shelf. We've chosen not to pursue this in the broader market because it isn't clear what the market really is. I'm pretty sure there are organizations for which this would be valuable.
Point is, you really can do this. It is almost impossible using ad hoc methods. It is technically quite straightforward with a strong enough program transformation system. It isn't a walk in the park.
The easiest solution is to use a method on matrix instead of an operator.
A.range(i, j, k, n);
Note that typically you do not use ,
in a subscript operator []
, eg A[i][j]
instead of A[i,j]
. The second form could be possible by overloading the comma operator but then you force i
and j
to be objects not numbers.
You could define a range
class that could be used as a subscript for your matrix class.
class RealMatrix
{
public:
MatrixRowRangeProxy operator[] (int i) {
return operator[](range(i, 1));
}
MatrixRowRangeProxy operator[] (range r);
// ...
RealMatrix(const MatrixRangeProxy proxy);
};
// A generic view on a matrix
class MatrixProxy
{
protected:
RealMatrix * matrix;
};
// A view on a matrix of a range of rows
class MatrixRowRangeProxy : public MatrixProxy
{
public:
MatrixColRangeProxy operator[] (int i) {
return operator[](range(i, 1));
}
MatrixColRangeProxy operator[] (const range & r);
// ...
};
// A view on a matrix of a range of columns
class MatrixColRangeProxy : public MatrixProxy
{
public:
MatrixRangeProxy operator[] (int i) {
return operator[](range(i, 1));
}
MatrixRangeProxy operator[] (const range & r);
// ...
};
Then you can copy a range from one matrix into another.
RealMatrix A = ...
RealMatrix B = A[range(i,j)][range(k,n)];
Finally by creating a Matrix
class that can hold either a RealMatrix
or a MatrixProxy
you can make a RealMatrix
and a MatrixProxy
appear the same from the outside.
Note the operator[]
on the proxies are not and cannot be virtual.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With