**Solution to Issue 1 by Stephan - see Answer below**
I mark `\` as an escape character in the syntax table, but then override that designation for Mathematica syntax elements like `\[Infinity]`. Here is my `syntax-propertize-function`:
(defconst math-syntax-propertize-function
  (syntax-propertize-rules
   ("\\\\\\[\\([A-Z][A-Za-z]*\\)]" (0 "_"))))
I referenced it from the `(defun math-mode ()` function like so:
(set (make-local-variable 'syntax-propertize-function)
     math-syntax-propertize-function)
In my first attempt, I didn't use the `make-local-variable` function, so the rule was set globally for every buffer, and I was surprised when my elisp buffer highlighting went awry.
**End Solution to Issue 1**
I am implementing a major-mode in Emacs derived from cc-mode for editing Mathematica files. The goal is syntax highlighting and indentation. I will leave interfacing with the Mathematica kernel for later.
I have the basic functionality working, but there are a couple of sticking points that are giving me trouble.
**Issue 1 - The `\` character is used as an escape character and to prefix multi-character, bracketed keywords.**
Like many languages, Mathematica uses the `\` character to escape `"` and other `\` characters in strings.
Mathematica has what are called, in Mathematica-speak, Syntax Characters like `\[Times]`, `\[Element]`, `\[Infinity]`, etc., which represent Mathematica operators and constants.
And Mathematica makes heavy use of `[` and `]` instead of `(` and `)` for function definitions and calls, etc.
So, if I mark `\` as an escape character in the syntax table, then my brackets become mismatched anywhere I use a Syntax Character. E.g.,
If[x < \[Pi], True, False]
Of course, cc-mode is intent on ignoring the `[` right after the `\`. Given the functional nature of Mathematica, the mode is almost useless if it cannot match brackets. Think Lisp without paren matching.
If I don't put `\` in the syntax table as an escape character, then how do I handle escape sequences in comments and strings?
It would be great if I could put Times, Element, Infinity, etc in a keyword list and have everything work correctly.
**Issue 2 - The syntax of Mathematica is different enough from C, C++, Java, ObjC, etc. that cc-mode's built-in syntactic analysis doesn't always produce the desired result.**
Consider the following code block:
FooBar[expression1,
expression2,
expression3];
This formats beautifully because the expressions are recognized as an argument list.
However, if a list is passed as an argument,
FooBar[{expression1,
expression2,
expression3}];
the result is not pretty because the expressions are considered continuations of a single statement within the `{` and `}`. Unfortunately, the simple hack of setting `c-continuation-offset` to `0` breaks actual continuations like
addMe[x_Real, y_Real] :=
Plus[x, y];
which you want to be indented.
The issue is that in Mathematica `{` and `}` delineate lists and not code blocks.
Here is the current elisp file I am using:
(require 'cc-mode)
;; These are required at compile time to get the sources for the
;; language constants.
(eval-when-compile
  (require 'cc-langs)
  (require 'cc-fonts))
;; Add math mode to the language constant system. This needs to be
;; done at compile time because that is when the language constants
;; are evaluated.
(eval-and-compile
  (c-add-language 'math-mode 'c-mode))
;; Function names
(c-lang-defconst c-cpp-matchers
  math (append
        (c-lang-const c-cpp-matchers c)
        ;; Abc[
        '(("\\<\\([A-Z][A-Za-z0-9]*\\)\\>\\[" 1 font-lock-type-face))
        ;; abc[
        '(("\\<\\([A-Za-z][A-Za-z0-9]*\\)\\>\\[" 1 font-lock-function-name-face))
        ;; Abc
        '(("\\<\\([A-Z][A-Za-z0-9]*\\)\\>" 1 font-lock-keyword-face))
        ;; abc_
        '(("\\<\\([a-z][A-Za-z0-9]*[_]\\)\\>" 1 font-lock-variable-name-face))
        ))
;; font-lock-comment-face
;; font-lock-doc-face
;; font-lock-string-face
;; font-lock-keyword-face
;; font-lock-function-name-face
;; font-lock-constant-face
;; font-lock-type-face
;; font-lock-builtin-face
;; font-lock-reference-face
;; font-lock-warning-face
;; There is no line comment character.
(c-lang-defconst c-line-comment-starter
  math nil)
;; The block comment starter is (*.
(c-lang-defconst c-block-comment-starter
  math "(*")
;; The block comment ender is *).
(c-lang-defconst c-block-comment-ender
  math "*)")
;; The assignment operators.
(c-lang-defconst c-assignment-operators
  math '("=" ":=" "+=" "-=" "*=" "/=" "->" ":>"))
;; The operators.
(c-lang-defconst c-operators
  math `(
         ;; Unary.
         (prefix "+" "-" "!")
         ;; Multiplicative.
         (left-assoc "*" "/")
         ;; Additive.
         (left-assoc "+" "-")
         ;; Relational.
         (left-assoc "<" ">" "<=" ">=")
         ;; Equality.
         (left-assoc "==" "=!=")
         ;; Assignment.
         (right-assoc ,@(c-lang-const c-assignment-operators))
         ;; Sequence.
         (left-assoc ",")))
;; Syntax modifications necessary to recognize keywords with
;; punctuation characters.
;; (c-lang-defconst c-identifier-syntax-modifications
;;   math (append '((?\\ . "w"))
;;                (c-lang-const c-identifier-syntax-modifications)))
;; Constants.
(c-lang-defconst c-constant-kwds
  math '("False" "True"))
;; Commented out: "\\[Infinity]" "\\[Times]" "\\[Divide]" "\\[Sqrt]" "\\[Element]"
(defcustom math-font-lock-extra-types nil
  "Extra types to recognize in math mode.")
(defconst math-font-lock-keywords-1 (c-lang-const c-matchers-1 math)
  "Minimal highlighting for math mode.")
(defconst math-font-lock-keywords-2 (c-lang-const c-matchers-2 math)
  "Fast normal highlighting for math mode.")
(defconst math-font-lock-keywords-3 (c-lang-const c-matchers-3 math)
  "Accurate normal highlighting for math mode.")
(defvar math-font-lock-keywords math-font-lock-keywords-3
  "Default expressions to highlight in math mode.")
(defvar math-mode-syntax-table nil
  "Syntax table used in math mode.")
(message "Setting math-mode-syntax-table to nil to force re-initialization")
(setq math-mode-syntax-table nil)
;; If a syntax table has not yet been set, allocate a new syntax table
;; and set up the entries.
(unless math-mode-syntax-table
  (setq math-mode-syntax-table
        (funcall (c-lang-const c-make-mode-syntax-table math)))
  (message "Modifying the math-mode-syntax-table")
  ;; character (
  ;; ( - open paren class
  ;; ) - matching paren character
  ;; 1 - 1st character of comment delimiter (**)
  ;; n - nested comments allowed
  (modify-syntax-entry ?\( "()1n" math-mode-syntax-table)
  ;; character )
  ;; ) - close paren class
  ;; ( - matching paren character
  ;; 4 - 4th character of comment delimiter (**)
  ;; n - nested comments allowed
  (modify-syntax-entry ?\) ")(4n" math-mode-syntax-table)
  ;; character *
  ;; . - punctuation class
  ;; 2 - 2nd character of comment delimiter (**)
  ;; 3 - 3rd character of comment delimiter (**)
  ;; n - nested comments allowed
  (modify-syntax-entry ?\* ". 23n" math-mode-syntax-table)
  ;; character [
  ;; ( - open paren class
  ;; ] - matching paren character
  (modify-syntax-entry ?\[ "(]" math-mode-syntax-table)
  ;; character ]
  ;; ) - close paren class
  ;; [ - matching paren character
  (modify-syntax-entry ?\] ")[" math-mode-syntax-table)
  ;; character {
  ;; ( - open paren class
  ;; } - matching paren character
  (modify-syntax-entry ?\{ "(}" math-mode-syntax-table)
  ;; character }
  ;; ) - close paren class
  ;; { - matching paren character
  (modify-syntax-entry ?\} "){" math-mode-syntax-table)
  ;; The following characters are punctuation (i.e. they cannot appear
  ;; in identifiers).
  ;;
  ;; / ' % & + - ^ < > = |
  (modify-syntax-entry ?\/ "." math-mode-syntax-table)
  (modify-syntax-entry ?\' "." math-mode-syntax-table)
  (modify-syntax-entry ?% "." math-mode-syntax-table)
  (modify-syntax-entry ?& "." math-mode-syntax-table)
  (modify-syntax-entry ?+ "." math-mode-syntax-table)
  (modify-syntax-entry ?- "." math-mode-syntax-table)
  (modify-syntax-entry ?^ "." math-mode-syntax-table)
  (modify-syntax-entry ?< "." math-mode-syntax-table)
  (modify-syntax-entry ?= "." math-mode-syntax-table)
  (modify-syntax-entry ?> "." math-mode-syntax-table)
  (modify-syntax-entry ?| "." math-mode-syntax-table)
  ;; character $
  ;; _ - symbol class (since $ is allowed in identifier names)
  (modify-syntax-entry ?\$ "_" math-mode-syntax-table)
  ;; character \
  ;; . - punctuation class (for now treat \ as punctuation
  ;; until we can fix the \[word] issue).
  (modify-syntax-entry ?\\ "." math-mode-syntax-table)
  ) ;; end of math-mode-syntax-table adjustments
;;
;;
(defvar math-mode-abbrev-table nil
  "Abbreviation table used in math mode buffers.")
(defvar math-mode-map (let ((map (c-make-inherited-keymap)))
                        map)
  "Keymap used in math mode buffers.")
;; math-mode
;;
(defun math-mode ()
  "Major mode for editing Mathematica code."
  (interactive)
  (kill-all-local-variables)
  (c-initialize-cc-mode t)
  (set-syntax-table math-mode-syntax-table)
  (setq major-mode 'math-mode
        mode-name "Math"
        local-abbrev-table math-mode-abbrev-table
        abbrev-mode t)
  (use-local-map math-mode-map)
  (c-init-language-vars math-mode)
  (c-common-init 'math-mode)
  (run-hooks 'c-mode-common-hook)
  (run-hooks 'math-mode-hook)
  (c-update-modeline))
(provide 'math-mode)
And a screenshot of some .
While cc-mode is designed to be adaptable to various languages, I'm not sure it will serve you well for Mathematica, because the syntax is too far from what cc-mode supports well. I would suggest trying SMIE (an indentation engine that appeared in Emacs 23.4; it was originally built for SML but is currently used for a variety of languages). Like cc-mode, SMIE is not ideal for every language, but I wouldn't be surprised if it works better than cc-mode in your case.
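As a rough illustration of what a SMIE-based setup could look like, here is a minimal sketch. The names `math-smie-grammar`, `math-smie-rules`, and `math-smie-setup` are hypothetical, the operator table is only a guess at a small subset of Mathematica's operators, and the offsets are arbitrary starting points. Bracket nesting is left to the syntax table, so a real mode would likely need a richer grammar and more rules.

```elisp
(require 'smie)

;; Hypothetical operator-precedence table, lowest precedence first.
;; This is an assumed subset, not Mathematica's full operator set.
(defconst math-smie-grammar
  (smie-prec2->grammar
   (smie-precs->prec2
    '((assoc ";")
      (assoc ",")
      (right ":=" "=")
      (left "+" "-")
      (left "*" "/")))))

;; Hypothetical indentation rules.  Returning nil for a case means
;; "no opinion", and SMIE falls back to its defaults.
(defun math-smie-rules (kind token)
  (pcase (cons kind token)
    ;; Basic indentation step (arbitrary choice of 2 columns).
    (`(:elem . basic) 2)
    ;; Ask for extra indentation of the code following `:='.
    (`(:after . ":=") 2)))

;; Intended to be called from the major mode function.
(defun math-smie-setup ()
  (smie-setup math-smie-grammar #'math-smie-rules))
```

Calling `math-smie-setup` from the mode function installs `smie-indent-line` as the buffer's `indent-line-function`; the rules function is the place to tune cases such as lists inside `{` and `}`.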
For the backslash issue, your best bet is to use `syntax-propertize-function` to change the escaping nature of specific backslashes: either set `\` as escaping in the syntax table and then mark the `\` of `\[foo]` as non-escaping, or leave `\` as non-escaping in the syntax table and then mark the `\` of `\"` and `\\` as escaping.
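The second of the two options could be sketched roughly as follows. It assumes `\` has punctuation syntax in the mode's syntax table, and the names `math-escape-propertize-function` and `math-escape-setup` are hypothetical. For simplicity the rule applies everywhere, not only inside strings.

```elisp
(require 'syntax)

;; Assumes `\' is punctuation in the syntax table.  Give escape
;; syntax only to a backslash that is immediately followed by a
;; double quote or by another backslash.  A backslash in front of
;; `[', as in \[Pi], is left alone, so the bracket still matches.
(defconst math-escape-propertize-function
  (syntax-propertize-rules
   ("\\(\\\\\\)[\"\\\\]" (1 "\\"))))

;; Intended to be called from the major mode function.  The
;; properties only take effect while `parse-sexp-lookup-properties'
;; is non-nil.
(defun math-escape-setup ()
  (set (make-local-variable 'syntax-propertize-function)
       math-escape-propertize-function))
```

Because the match consumes both characters, the second backslash of `\\` is not itself treated as an escape for whatever follows it.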