 

What is the "semantic model" introduced in Apocalypse #1?

In Apocalypse #1 Larry wrote, with my added emphasis:

Raku will support multiple syntaxes that map onto a single semantic model. Second, that single semantic model will in turn map to multiple platforms.

Some vague notions I have of what Larry might have meant when he wrote "single semantic model":

  • A Turing-complete language / automaton; and/or

  • What became 6model; and/or

  • What became NQP/nqp.

(I've googled around and caught some discussions, e.g. this one on Slashdot, but they were equally vague.)

Perhaps more important than an answer to what he was thinking then is what has come to pass.

His formulation sure sounds a lot like it might map to NQP and/or nqp (in the middle of the Raku / NQP/nqp / NQP backends architecture).

(If so, presumably that model is "specified" by nqp's equivalent to Raku's roast?)

Or, per Liz++, QAST, or RAST?


I know who I think would be able to best answer my main question (in the title) but perhaps someone else knows?

raiph asked Jul 09 '20 22:07





1 Answer

Apocalypse #1 represents some of the earliest thinking from Larry in the process that led us to the Raku language we know today. I wasn't around that early in the language design process, so any answer I give will naturally involve trying to imagine what was known at that point in time. With that rather significant caveat out of the way, let's take a look.

Syntax is about the words and symbols we write. Semantics is about what things mean. For example, let's assume we're in a language that has such a thing as infix operators, and there is an operator spelled +. We might write the expression a + b. Semantics tell us what it means. While many programming languages have this syntax, they differ hugely in the semantics - that is, meaning - associated with it. For example:

  • In C, it depends on the types of a and b. It may mean some kind of numeric addition (with a whole bunch of rules based on integer, integer rank, floating point, etc.). However, if a is a pointer and b is an integer, there's actually a sneaky multiplication going on in there too, based on the size of the type the pointer points to.
  • In C++, see the definitions from C, but also it could be a function call to an operator overload too, and/or any of those semantics but obtained after applying conversion rules on the operands. Please don't ask me what those rules are.
  • In Java, it also goes by type; it might mean numeric addition, or it might mean string concatenation.
  • In JavaScript, it might be numeric addition or string concatenation, but it's decided at runtime, according to rules. No, don't ask me about these ones either.
  • In Raku it's a function call to infix:<+>, and that means whatever the standard library decides it means.
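As one more runnable data point, here is a minimal sketch in Python (not one of the languages listed above; it is used here only because it is easy to try out). It shows the single syntax a + b picking up different meanings depending on the operands. The Money class is hypothetical and exists purely to illustrate a user-defined meaning for +.

```python
class Money:
    """Hypothetical type, defined only to show a user-supplied meaning for +."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called by the interpreter when evaluating `self + other`.
        return Money(self.cents + other.cents)


# Same surface syntax, different semantics chosen by operand type at runtime.
numeric = 1 + 2                   # numeric addition
text = "1" + "2"                  # string concatenation
items = [1] + [2]                 # list concatenation
total = Money(150) + Money(250)   # dispatches to Money.__add__

print(numeric, text, items, total.cents)
```

In every case the programmer wrote the same thing; only the semantics attached to it changed.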

To me, a semantic model is a systematic way of describing the semantics. That might exist as one or more of:

  • A written (natural language) specification that tries to describe what things should do
  • An executable specification that tries to describe what things do (like the Raku spectests)
  • An expression of the semantics using a mathematical formalism, such as operational semantics or denotational semantics
  • An interpreter implemented in some other language (in which case we lean on its semantic model)
  • A compiler translating into some other language (again, we lean on the semantic model of the target language)

Just as we've observed that the syntax a + b might map to many different semantics across different languages, we can also have many syntaxes mapping to the same set of semantics. That's true even in standard Raku; there's no semantic difference between writing $a + $b and infix:<+>($a, $b).
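The reverse direction can be sketched the same way. The snippet below is a Python analogy, not Raku: it assumes only the standard library's operator module, whose add function is documented as equivalent to the + operator. The variable names are made up for the illustration.

```python
import operator

a, b = 2, 3

# Two different spellings of the same operation.
infix_form = a + b                   # operator syntax
function_form = operator.add(a, b)   # ordinary function-call syntax

print(infix_form, function_form)
```

This loosely mirrors the Raku pair $a + $b and infix:<+>($a, $b): two syntaxes, one underlying behavior.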

While this perhaps provides something of an answer, it's interesting to read the paragraphs that follow the bit you quoted. Here's what follows, annotated.

Multiple syntaxes sound like an evil thing, but they're really necessary for the evolution of the language.

I think the use of "evolution" is significant here, because allowing for the syntax of the language to change in a controlled way does, in fact, allow different mutations of the language syntax to coexist. Further, survival of the fittest can apply (and what is fit may well be a function of the context in which the language is being used). Given that syntax is the interface to the language, and that what deserves huffmanization, for example, can change over time or with context, it's not unreasonable to expect that it might evolve, while still providing access to the same underlying set of behaviors.

I think we can see this as foreseeing features like user-defined operators and slangs, anyway.

To some extent we already have a multi-syntax model in Perl 5; every time you use a pragma or module, you are warping the language you're using.

I find this part a bit odd, because "the language" is not just syntax, but also semantics, and in fact a lot of pragmas change semantics rather than (just) syntax. On the other hand, it did say "to some extent", which is a pretty good hedge to hide behind. :-)

As long as it's clear from the declarations at the top of the module which version of the language you're using, this causes little problem.

This means that language mutations are scoped things. They ended up lexically scoped, not just file scoped. This isn't entirely surprising; the utility of lexical scoping seems to have been increasingly realized during the design process.

A particularly strong example of how support of multiple syntaxes will allow continued evolution is the migration from Perl 5 to Perl 6 itself.

This indicates that the thinking at the time was that Perl 5 and Perl 6 (now Raku) would have enough in common that they could share a semantic model, and run atop the same runtime. As we know, things did not pan out this way; however, at the time Apocalypse #1 was written, I can imagine that was an assumption. In fact, it probably remained one for quite a while; for example, PONIE (the project to try and run Perl 5 atop the Parrot VM) was ongoing a number of years later.

In reality, as the language design emerged, the single semantic model that would have allowed for that became unrealistic. Various efforts to take features from Perl 6 into Perl 5 struggled for this reason. Smart match is the poster child for this, and the problem was not at all because of syntax, but because of semantics: in Raku, things always know their type, whereas in Perl 5, a scalar may simultaneously hold a string and numeric representation, depending on how the value has been used up until that point. The feature was predicated on something in the Raku semantic model that had no direct equivalent in the Perl 5 semantic model.

A further point of interest is that the RakuAST work that is currently ongoing will provide a document object model form of the Raku language. We could see this as an alternative syntax for Raku expressed as an object graph. Given it will also be the representation the compiler frontend uses for Raku code, we can also see it as a kind of syntax-independent gateway to the Raku semantic model. And, once we truly have slangs, it can be expected that they will be implemented by expressing the semantics associated with the additional slang syntax as a composition of RakuAST nodes - and so ultimately, they will be delivering new syntax in terms of a single Raku semantic model.
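To make the "object graph as a gateway to the semantics" idea concrete, here is a rough analogy using Python's standard ast module. This is not RakuAST and says nothing about its actual API; it only sketches the general shape of the idea: the same expression is reached once by parsing source text and once by constructing node objects directly, and both are handed to the same compiler. The names parsed, built and env are invented for the example.

```python
import ast

# Route 1: start from surface syntax and let the parser build the tree.
parsed = ast.parse("a + b", mode="eval")

# Route 2: build the same tree directly as objects, with no source text at all.
built = ast.Expression(
    body=ast.BinOp(
        left=ast.Name(id="a", ctx=ast.Load()),
        op=ast.Add(),
        right=ast.Name(id="b", ctx=ast.Load()),
    )
)
ast.fix_missing_locations(built)  # fill in the position info compile() requires

# Ignoring source positions, both routes yield the same tree structure.
same_shape = ast.dump(parsed) == ast.dump(built)

# Both trees go through the same compiler and so get the same semantics.
env = {"a": 2, "b": 3}
from_parsed = eval(compile(parsed, "<parsed>", "eval"), dict(env))
from_built = eval(compile(built, "<built>", "eval"), dict(env))

print(same_shape, from_parsed, from_built)
```

A new surface syntax would, in this picture, be just another way of producing such a tree, with the meaning supplied by the nodes it is composed from.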

Jonathan Worthington answered Sep 29 '22 11:09
