The answer to the question "Is it possible to create custom operators in JavaScript?" is "not yet", but @Benjamin suggested that it would be possible to add a new operator using third party tools:
It is possible to use third party tools like sweet.js to add custom operators though that'd require an extra compilation step.
I will take the same example, like in the previous question:
For any two real numbers x and y: x ∘ y is x + 2y that is also a real number. How can I add this operator in my extended JavaScript language?
After the following code is run:
var x = 2
, y = 3
, z = x ∘ y;
console.log(z);
The output will contain
8
(because 8 is 2 + 2 * 3)
How would I extend the JavaScript language to support a new operator?
Yes, it's possible and not even very hard :)
We'll need to discuss a few things:
If you're lazy and just want to see it in action - I put the working code on GitHub
Very generally - a language is composed of two things.
Syntax - these are the symbols in the language, such as unary operators like ++, as well as Expressions like a FunctionExpression that represents an "inline" function. The syntax represents just the symbols used and not their meaning. In short, the syntax is just the drawings of letters and symbols - it holds no inherent meaning.
Semantics ties meaning to these symbols. Semantics is what says ++ means "increment by one"; in fact, here is the exact definition. It ties meaning to our syntax, and without it the syntax is just a list of symbols with an order.
At some point, when something executes your code in JavaScript or any other programming language, it needs to understand that code. A part of this, called lexing (or tokenizing, let's not go into subtle differences here), means breaking up code like:
function foo(){ return 5;}
Into its meaningful parts - that is, saying that there is a function keyword here, followed by an identifier, an empty arguments list, then a block opening { containing a return keyword with the literal 5, then a semicolon, then an end block }.
This part is entirely in the syntax; all it does is break the code up into parts like function,foo,(,),{,return,5,;,}. It still has no understanding of the code.
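To make the lexing step concrete, here is a minimal sketch of a tokenizer. It is only an illustration: the tokenize name and the regular expression are made up for this example, and a real lexer (such as the one inside Esprima) also handles strings, comments, multi-character punctuators and much more.

```javascript
// Hypothetical toy tokenizer, for illustration only.
// It knows nothing about grammar; it just cuts the text into pieces.
function tokenize(source) {
  // \w+      matches identifiers, keywords and integer literals
  // [^\s\w]  matches a single punctuator character
  return source.match(/\w+|[^\s\w]/g) || [];
}

var tokens = tokenize("function foo(){ return 5;}");
console.log(tokens.join(","));
```

Note that the tokenizer happily returns a list of pieces without knowing that they form a function declaration; that knowledge only appears in the next step.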
After that - a Syntax Tree is built. A syntax tree is more aware of the grammar but is still entirely syntactic. For example, a syntax tree would see the tokens of:
function foo(){ return 5;}
And figure out "Hey! There is a function declaration here!".
It's called a tree because it's just that - trees allow nesting.
For example, the code above can produce something like:
Program
  FunctionDeclaration (identifier = 'foo')
    BlockStatement
      ReturnStatement
        Literal (5)
This is rather simple. Just to show you it isn't always so linear, let's check 5 + 5:
Program
  ExpressionStatement
    BinaryExpression (operator +)
      Literal (5)
      Literal (5) // notice the split here
Such splits can occur.
Basically, a syntax tree allows us to express the syntax.
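As a rough sketch, the tree for 5 + 5 can be written down as a plain JavaScript object. The shape below follows the usual ESTree-style layout that parsers such as Esprima produce, with location and raw-text fields left out; the outline helper is hypothetical and exists only to print the nesting.

```javascript
// Approximate ESTree-style tree for "5 + 5" (extra fields omitted).
var tree = {
  type: "Program",
  body: [{
    type: "ExpressionStatement",
    expression: {
      type: "BinaryExpression",
      operator: "+",
      left:  { type: "Literal", value: 5 },
      right: { type: "Literal", value: 5 }
    }
  }]
};

// Hypothetical helper: list every node type, indented by its depth.
function outline(node, depth) {
  var lines = [];
  if (node && typeof node === "object") {
    if (typeof node.type === "string") {
      lines.push("  ".repeat(depth) + node.type);
      depth += 1;
    }
    for (var key in node) {
      lines = lines.concat(outline(node[key], depth));
    }
  }
  return lines;
}

console.log(outline(tree, 0).join("\n"));
```

The two Literal nodes end up at the same depth under the BinaryExpression, which is the "split" mentioned above.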
This is where x ∘ y fails - the parser sees ∘ and doesn't understand the syntax.
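You can watch this failure happen without any extra tools. The sketch below asks the engine to parse a function body containing ∘ by passing it to the Function constructor, which throws at parse time, before anything runs.

```javascript
// The Function constructor parses its body immediately,
// so a syntax problem shows up as an exception right here.
var failed = false;
try {
  new Function("var x = 2, y = 3; return x ∘ y;");
} catch (e) {
  failed = e instanceof SyntaxError;
}
console.log(failed);
```

The error is a SyntaxError rather than anything about meaning: the parser gives up before semantics ever come into play.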
This just requires a project that parses the syntax. What we'll do here is read the syntax of "our" language, which is not the same as JavaScript (and does not comply with the specification), and replace our operator with something the JavaScript syntax is OK with.
What we'll be making is not JavaScript. It does not follow the JavaScript specification, and a standards-compliant JS parser will throw an exception on it.
This we do all the time anyway :) All we'll do here is just define a function to call when the operator is called.
Let me just start by saying, after this preface, that we'll not be adding an operator to JS here; rather, we're defining our own language - let's call it "CakeLanguage" or something - and adding the operator to it. This is because ∘ is not a part of the JS grammar, and the JS grammar does not allow arbitrary operators like some other languages do.
We'll use two open source projects for this:
esprima, which takes code and generates the syntax tree for it.
escodegen, which goes in the other direction and generates JS code from a syntax tree.
If you paid close attention, you'd know we can't use esprima directly, since we'll be giving it a grammar it does not understand.
We'll add a # operator that does x # y === 2x + y, for the fun. We'll give it the precedence of multiplication (because operators have operator precedence).
So, after you get your copy of Esprima.js - we'll need to change the following:
To FnExprTokens - that is, expressions - we'll need to add # so it'd recognize it. Afterwards, it'd look as such:
FnExprTokens = ['(', '{', '[', 'in', 'typeof', 'instanceof', 'new',
                'return', 'case', 'delete', 'throw', 'void',
                // assignment operators
                '=', '+=', '-=', '*=', '/=', '%=', '<<=', '>>=', '>>>=',
                '&=', '|=', '^=', ',',
                // binary/unary operators
                '+', '-', '*', '/', '%','#', '++', '--', '<<', '>>', '>>>', '&',
                '|', '^', '!', '~', '&&', '||', '?', ':', '===', '==', '>=',
                '<=', '<', '>', '!=', '!=='];
To scanPunctuator we'll add it and its char code as a possible case:
case 0x23: // #
And then to the test so it looks like:
if ('<>=!+-*#%&|^/'.indexOf(ch1) >= 0) {
Instead of:
if ('<>=!+-*%&|^/'.indexOf(ch1) >= 0) {
And then to binaryPrecedence - let's give it the same precedence as multiplication:
case '*':
case '/':
case '#': // put it elsewhere if you want to give it another precedence
case '%':
    prec = 11;
    break;
That's it! We've just extended our language syntax to support the # operator.
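To see why the precedence number matters, here is a small standalone sketch of a precedence-driven expression parser. It is not Esprima's code: the PRECEDENCE table and parseExpression function are made up for this example, the value 9 for + and - is an assumption, and the input is a ready-made token list containing only numbers and binary operators.

```javascript
// Hypothetical precedence table; 11 is the multiplicative level used above,
// 9 is an assumed value for the additive level.
var PRECEDENCE = { "+": 9, "-": 9, "*": 11, "/": 11, "%": 11, "#": 11 };

// Turn a flat token list into a nested tree, letting higher precedence bind tighter.
function parseExpression(tokens, minPrec) {
  var left = { type: "Literal", value: Number(tokens.shift()) };
  while (tokens.length && PRECEDENCE[tokens[0]] >= minPrec) {
    var op = tokens.shift();
    // Ask for strictly higher precedence on the right,
    // so operators of equal precedence group to the left.
    var right = parseExpression(tokens, PRECEDENCE[op] + 1);
    left = { type: "BinaryExpression", operator: op, left: left, right: right };
  }
  return left;
}

// "1 + 2 # 3" groups as 1 + (2 # 3) because # outranks +.
var parsed = parseExpression(["1", "+", "2", "#", "3"], 0);
console.log(parsed.operator, parsed.right.operator);
```

If # were moved to the same level as +, the same input would group as (1 + 2) # 3 instead, which is why the case label's position in binaryPrecedence is worth a moment's thought.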
We're not done yet, we need to convert it back to JS.
Let's first define a short visitor function for our tree that recursively visits all its nodes.
function visitor(tree, visit) {
    for (var i in tree) {
        // only objects can be nodes; skip strings, numbers and null
        if (typeof tree[i] === "object" && tree[i] !== null) {
            visit(tree[i]);
            visitor(tree[i], visit);
        }
    }
}
This just goes through the Esprima generated tree and visits it. We pass it a function and it runs that on every node.
Now, let's treat our special new operator:
visitor(syntax, function(el) { // for every node in the syntax
    if (el.type === "BinaryExpression") { // if it's a binary expression
        if (el.operator === "#") { // with the operator #
            el.type = "CallExpression"; // it is now a call expression
            el.callee = {name: "operator_sharp", type: "Identifier"}; // for the function operator_sharp
            el.arguments = [el.left, el.right]; // with the left and right side as arguments
            delete el.operator; // remove BinaryExpression properties
            delete el.left;
            delete el.right;
        }
    }
});
So in short:
var syntax = esprima.parse("5 # 5");
visitor(syntax, function(el) { // for every node in the syntax
    if (el.type === "BinaryExpression") { // if it's a binary expression
        if (el.operator === "#") { // with the operator #
            el.type = "CallExpression"; // it is now a call expression
            el.callee = {name: "operator_sharp", type: "Identifier"}; // for the function operator_sharp
            el.arguments = [el.left, el.right]; // with the left and right side as arguments
            delete el.operator; // remove BinaryExpression properties
            delete el.left;
            delete el.right;
        }
    }
});
var asJS = escodegen.generate(syntax); // produces operator_sharp(5,5);
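If you want to try the rewriting step without patching Esprima first, the sketch below runs the same transformation on a hand-written tree. The tree is an assumption about what the modified parser would return for 5 # 5 (it follows the usual ESTree layout), and the visitor here skips non-object values so that null fields cannot trip up the callback.

```javascript
// Hand-written stand-in for the patched parser's output for "5 # 5" (assumed shape).
var syntax = {
  type: "Program",
  body: [{
    type: "ExpressionStatement",
    expression: {
      type: "BinaryExpression",
      operator: "#",
      left:  { type: "Literal", value: 5, raw: "5" },
      right: { type: "Literal", value: 5, raw: "5" }
    }
  }]
};

// Recursively call visit on every object found in the tree.
function visitor(tree, visit) {
  for (var i in tree) {
    if (typeof tree[i] === "object" && tree[i] !== null) {
      visit(tree[i]);
      visitor(tree[i], visit);
    }
  }
}

// Rewrite every "a # b" node into the call operator_sharp(a, b).
visitor(syntax, function (el) {
  if (el.type === "BinaryExpression" && el.operator === "#") {
    el.type = "CallExpression";
    el.callee = { type: "Identifier", name: "operator_sharp" };
    el.arguments = [el.left, el.right];
    delete el.operator;
    delete el.left;
    delete el.right;
  }
});

var call = syntax.body[0].expression;
console.log(call.type, call.callee.name, call.arguments.length);
```

After the visit, the tree contains only node types that standard JavaScript knows about, so a regular code generator can print it.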
The last thing we need to do is define the function itself:
function operator_sharp(x,y){ return 2*x + y; }
And include that above our code.
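As a quick check of the last step, the sketch below glues the helper in front of the generated code and runs the result. The generated string is an assumption standing in for what the code generator prints for 5 # 5, and the Function constructor is used here only to make the example runnable in one file.

```javascript
// Assumed output of the generation step for the input "5 # 5".
var generated = "operator_sharp(5, 5);";

// The helper that gives # its meaning, kept as source text so it can be prepended.
var helper = "function operator_sharp(x, y){ return 2 * x + y; }";

// Prepend the helper and evaluate; "return" is added only to read the value back.
var run = new Function(helper + "\nreturn " + generated);
var result = run();
console.log(result); // 15, because 2 * 5 + 5 is 15
```

In a real build you would write the helper and the generated code to a file instead of evaluating strings.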
That's all there is to it! If you've read this far - you deserve a cookie :)
Here is the code on GitHub so you can play with it.
As I said in the comments of your question, sweet.js doesn't support infix operators yet. You're free to fork sweet.js and add it yourself, or you're simply SOL.
Honestly, it's not worth it to implement custom infix operators yet. Sweet.js is a well supported tool, and it's the only one I know of that tries to implement macros in JS. Adding custom infix operators with a custom preprocessor is probably not worth the gain you might have.
That said, if you're working on this alone for non-professional work, do whatever you want...
EDIT
sweet.js does now support infix operators.