So I started making progress on LuvvieScript and then it all kicked off a bit on Twitter... https://twitter.com/gordonguthrie/status/389659700741943296
Anthony Ramine (https://twitter.com/nokusu) made the point that I was doing it wrong and should be compiling from Erlang to JavaScript via Core Erlang and not the Erlang AST. This is a compelling yet unattractive option for me... Twitter not being the right medium for that discussion, I thought I would write it up here and get some advice on it.
LuvvieScript has three core requirements:
The third of these requirements is kinda out of scope for this debate but the first two are core.
There is a lazy-gits corollary - I want to use as many Erlang and JavaScript syntax tools (lexers, parsers, tokenizers, AST transforms, etc, etc, etc) as possible and write the smallest amount of code.
The way the code is currently written has the following structure:
Basically I get an Erlang AST that looks something like this:
[{function,
{19,{1,9}},
atom1_fn,0,
[{clause,
{19,none},
[],
[[]],
[{match,
{20,none},
[{var,{20,{5,6}},'D'}],
[{atom,{20,{11,15}},blue}]},
{var,{21,{5,6}},'D'}]}]}]},
and I then transpose it into a Javascript JSON AST that looks like:
{
"type": "Program",
"body": [
{
"type": "VariableDeclaration",
"declarations": [
{
"type": "VariableDeclarator",
"id": {
"type": "Identifier",
"name": "answer",
"loc": {
"start": {
"line": 2,
"column": 4
},
"end": {
"line": 2,
"column": 10
}
}
},
"init": {
"type": "BinaryExpression",
"operator": "*",
"left": {
"type": "Literal",
"value": 6,
"raw": "6",
"loc": {
"start": {
"line": 2,
"column": 13
},
"end": {
"line": 2,
"column": 14
}
}
},
"right": {
"type": "Literal",
"value": 7,
"raw": "7",
"loc": {
"start": {
"line": 2,
"column": 17
},
"end": {
"line": 2,
"column": 18
}
}
},
"loc": {
"start": {
"line": 2,
"column": 13
},
"end": {
"line": 2,
"column": 18
}
}
},
"loc": {
"start": {
"line": 2,
"column": 4
},
"end": {
"line": 2,
"column": 18
}
}
}
],
"kind": "var",
"loc": {
"start": {
"line": 2,
"column": 0
},
"end": {
"line": 2,
"column": 19
}
}
}
],
"loc": {
"start": {
"line": 2,
"column": 0
},
"end": {
"line": 2,
"column": 19
}
}
}
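As a rough illustration of the position half of that transposition, here is a minimal sketch in Erlang of turning a `{Line, {StartCol, EndCol}}` position, like the ones in the Erlang AST above, into the `loc` object that the JavaScript AST carries. The module name `estree_sketch` and the function `loc/1` are made up for this example, maps stand in for the JSON objects, and the columns are copied across unchanged; whether they need shifting to match the column convention on the JavaScript side is something you would have to check.

```erlang
-module(estree_sketch).
-export([loc/1]).

%% Hypothetical helper, not part of LuvvieScript.
%% Converts a position of the shape {Line, {StartCol, EndCol}} into a map
%% shaped like the "loc" field of the JavaScript AST.
%% 'end' has to be quoted because end is a reserved word in Erlang.
loc({Line, {StartCol, EndCol}}) ->
    #{start => #{line => Line, column => StartCol},
      'end' => #{line => Line, column => EndCol}};
%% Positions such as {19, none} carry no column range, so no loc is built.
loc({_Line, none}) ->
    undefined.
```

For example, `estree_sketch:loc({20, {11, 15}})` gives a map whose start and end are both on line 20, at columns 11 and 15.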
Anthony's point is well made - Core Erlang is a simplified and more regular language than Erlang and should be more easily transpiled to JavaScript than plain Erlang, but it is not very well documented.
I can get an AST-like representation of Core Erlang easily enough:
{c_module,[],
{c_literal,[],basic_types},
[{c_var,[],{atom1_fn,0}},
{c_var,[],{atom2_fn,0}},
{c_var,[],{bish_fn,1}},
{c_var,[],{boolean_fn,0}},
{c_var,[],{float_fn,0}},
{c_var,[],{int_fn,0}},
{c_var,[],{module_info,0}},
{c_var,[],{module_info,1}},
{c_var,[],{string_fn,0}}],
[],
[{{c_var,[],{int_fn,0}},{c_fun,[],[],{c_literal,[],1}}},
{{c_var,[],{float_fn,0}},{c_fun,[],[],{c_literal,[],2.3}}},
{{c_var,[],{boolean_fn,0}},{c_fun,[],[],{c_literal,[],true}}},
{{c_var,[],{atom1_fn,0}},{c_fun,[],[],{c_literal,[],blue}}},
{{c_var,[],{atom2_fn,0}},{c_fun,[],[],{c_literal,[],'Blue 4 U'}}},
{{c_var,[],{string_fn,0}},{c_fun,[],[],{c_literal,[],"string theory"}}},
{{c_var,[],{bish_fn,1}},
{c_fun,[],
[{c_var,[],'_cor0'}],
{c_case,[],
{c_var,[],'_cor0'},
[{c_clause,[],
[{c_literal,[],bash}],
{c_literal,[],true},
{c_literal,[],berk}},
{c_clause,[],
[{c_literal,[],bosh}],
{c_literal,[],true},
{c_literal,[],bork}},
{c_clause,
[compiler_generated],
[{c_var,[],'_cor1'}],
{c_literal,[],true},
{c_primop,[],
{c_literal,[],match_fail},
[{c_tuple,[],
[{c_literal,[],case_clause},
{c_var,[],'_cor1'}]}]}}]}}},
{{c_var,[],{module_info,0}},
{c_fun,[],[],
{c_call,[],
{c_literal,[],erlang},
{c_literal,[],get_module_info},
[{c_literal,[],basic_types}]}}},
{{c_var,[],{module_info,1}},
{c_fun,[],
[{c_var,[],'_cor0'}],
{c_call,[],
{c_literal,[],erlang},
{c_literal,[],get_module_info},
[{c_literal,[],basic_types},{c_var,[],'_cor0'}]}}}]}
But there are no line or column numbers. So I can get an AST that will generate JS, but critically not Source Maps.
Question 1: How can I get the line information I need? (I can already get column information from the 'normal' Erlang tokens...)
Core Erlang also differs from normal Erlang in that the compilation process starts replacing variable names in function calls with its own internal ones, which will also cause some Source Map problems. An example would be this Erlang clause:
bish_fn(A) ->
    case A of
        bash -> berk;
        bosh -> bork
    end.
The Erlang AST preserves the names nicely:
[{function,
{31,{1,8}},
bish_fn,1,
[{clause,
{31,none},
[{var,{31,{11,12}},'A'}],
[[]],
[{'case',
{32,none},
[{var,{32,{11,12}},'A'}],
[{clause,
{33,none},
[{atom,{33,{9,13}},bash}],
[[]],
[{atom,{34,{13,17}},berk}]},
{clause,
{35,none},
[{atom,{35,{9,13}},bosh}],
[[]],
[{atom,{36,{13,17}},bork}]}]}]}]}]},
Core Erlang has already mutated away the names of the parameters called in the function:
'bish_fn'/1 =
    %% Line 30
    fun (_cor0) ->
        %% Line 31
        case _cor0 of
          %% Line 32
          <'bash'> when 'true' ->
              'berk'
          %% Line 33
          <'bosh'> when 'true' ->
              'bork'
          ( <_cor1> when 'true' ->
                primop 'match_fail'
                    ({'case_clause',_cor1})
            -| ['compiler_generated'] )
        end
Question 2: Is there anything I can do to preserve or map variable names in Core Erlang?
Question 3: I appreciate that Core Erlang is explicitly designed to make it easy to compile into Erlang and write tools that mutate Erlang code, but the question really is: will it make it easier to compile out of Erlang?
I could fork the Core Erlang code and add a source mapping option, but I play the Lazy Man card here...
In response to Eric's response, I should clarify how I am generating the Core Erlang cerl records. I first compile my plain Erlang to core erlang using:
c(some_module, to_core)
Then I use core_scan and core_parse in this function nicked from compiler.erl:
compile(File) ->
    case file:read_file(File) of
        {ok,Bin} ->
            case core_scan:string(binary_to_list(Bin)) of
                {ok,Toks,_} ->
                    case core_parse:parse(Toks) of
                        {ok, Mod} ->
                            {ok, Mod};
                        {error,E} ->
                            {error, {parse, E}}
                    end;
                {error,E,_} ->
                    {error, {scan, E}}
            end;
        {error,E} ->
            {error,{read, E}}
    end.
The question is: how can I get that toolchain to emit an annotated AST? I suspect I would need to add those options myself :(
Line numbers are provided as annotations. If you look at the cerl module, which I really recommend you use, you will see that pretty much everything takes a list of annotations. One of those annotations is an unadorned number that represents the line number. If I remember correctly, if you were building the Core AST directly and the atom1_fn var was on line 10, the AST would look as follows:
{c_var,[10],{atom1_fn,0}}
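To make that concrete, here is a minimal sketch, under a couple of assumptions, of getting annotated Core Erlang and reading the line back out. As far as I know, calling `compile:file/2` with the `to_core` and `binary` options hands back the cerl records in memory rather than writing a `.core` listing, so the annotations do not have to survive being printed as `%% Line N` comments and parsed again. The module name `core_lines_sketch` and both function names are made up for the example, and the `{Line, Column}` clause is only a guess at what a compiler that tracks columns might put in the annotation list.

```erlang
-module(core_lines_sketch).
-export([annotated_core/1, line_of/1]).

%% Compile an .erl file straight to Core Erlang records in memory.
%% Assumption: with [to_core, binary] the third element of the result is
%% the c_module record with its annotations intact.
annotated_core(File) ->
    case compile:file(File, [to_core, binary, return_errors]) of
        {ok, _ModuleName, Core} -> {ok, Core};
        {error, Errors, _Warnings} -> {error, Errors};
        error -> {error, unknown}
    end.

%% Return the line number stored in a node's annotation list, or
%% undefined if the node carries none (e.g. compiler-generated code).
line_of(Node) ->
    first_line(cerl:get_ann(Node)).

%% The line is the "unadorned number" in the list; other annotations such
%% as {file, Name} or compiler_generated are skipped.
first_line([Line | _]) when is_integer(Line) -> Line;
first_line([{Line, _Col} | _]) when is_integer(Line) -> Line;
first_line([_ | Rest]) -> first_line(Rest);
first_line([]) -> undefined.
```

With this, `core_lines_sketch:line_of({c_var,[10],{atom1_fn,0}})` returns `10`, while a node with an empty annotation list gives `undefined`.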
No, you have to do all the bookkeeping yourself. There isn't anything out there to do it for you.
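One possible shape for that bookkeeping, sketched under heavy assumptions: pair the parameter patterns from the Erlang AST clause with the variables of the matching Core Erlang `c_fun` by position, and keep a map from the generated name back to the original. The module `name_map_sketch` and the function `param_names/2` are hypothetical, and the positional pairing is only plausible for a single-clause function whose parameters are plain variables, like `bish_fn(A)` in the question; multi-clause functions or pattern parameters would need something smarter.

```erlang
-module(name_map_sketch).
-export([param_names/2]).

%% ErlParams: the parameter patterns of one Erlang AST clause,
%%            e.g. [{var, {31,{11,12}}, 'A'}].
%% CoreFun:   the c_fun node the compiler produced for that function.
%% Returns a map from compiler-generated name to original name,
%% e.g. #{'_cor0' => 'A'}. Parameters that are not plain variables are
%% silently skipped by the pattern in the comprehension generator.
param_names(ErlParams, CoreFun) ->
    CoreVars = cerl:fun_vars(CoreFun),
    maps:from_list([{cerl:var_name(CoreVar), Name}
                    || {{var, _Pos, Name}, CoreVar}
                           <- lists:zip(ErlParams, CoreVars)]).
```

The resulting map could then be consulted when emitting names for a Source Map, so that `_cor0` is reported as `A`.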
I am not sure I understand this question.
Everything Anthony said was true about Core Erlang. Those are the very same reasons I chose Core Erlang as a target language for Joxa. The lesson I learned from that is that while Core Erlang is a great, easy-to-target language, it has two major drawbacks that recommend against it.
Dialyzer only works with an Erlang AST in the abstract code block of the beam file. There is no way to get such an AST into that abstract code block when compiling to Core Erlang. So if you target Core Erlang, Dialyzer won't work for you. That is true regardless of whether or not you produce the correct spec attributes.
You lose the use of tools that work on the Erlang AST, for example the ability to compile to Erlang source, which is a major win in a lot of areas of pragmatic use. The Core Erlang to/from source compilers are very buggy and simply do not work.
I am actually in the process of retargeting Joxa to the Erlang AST for the above reasons.
Btw, you might be interested in this project: https://github.com/5HT/shen. It's a JavaScript compiler for the Erlang AST that already exists and is working, though I don't have a lot of experience with it.
** Edit: You can actually see a Core Erlang AST generated from Erlang source. This helps a ton when learning how to compile to Core. ec_compile in the erlware_commons repo has a lot of utility functions to help with that.