
Emacs major-mode for Mathematica based on cc-mode

**Solution to Issue 1 by Stefan - see Answer below**

I mark \ as an escape character in the syntax table, but then override that designation for the Mathematica syntax elements like \[Infinity]. Here is my syntax-propertize-function:

(defconst math-syntax-propertize-function
  (syntax-propertize-rules
   ("\\\\\\[\\([A-Z][A-Za-z]*\\)]" (0 "_"))))

I referenced it from the math-mode function, i.e. inside (defun math-mode () ...), like so:

  (set (make-local-variable 'syntax-propertize-function)
       math-syntax-propertize-function)

In my first attempt, I didn't use make-local-variable, so the function was set globally rather than only in the math-mode buffer, and I was surprised when the highlighting in my elisp buffers went awry.
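
As a quick check of the approach above, the following sketch applies the same rule in a temporary buffer whose syntax table treats \ as an escape character, and asks Emacs where the bracket opened by If[ closes. The names prefixed with my-math-demo- are hypothetical and exist only for this illustration, and the syntax table is a bare stand-in for the one cc-mode builds.

```elisp
;; Same rule as math-syntax-propertize-function above, under a
;; hypothetical name so it can be evaluated on its own.
(defconst my-math-demo-propertize
  (syntax-propertize-rules
   ("\\\\\\[\\([A-Z][A-Za-z]*\\)]" (0 "_"))))

(defun my-math-demo-close-of-first-bracket (text propertize)
  "Insert TEXT in a temp buffer and return the position just after the
bracket that closes the first [ in it.  When PROPERTIZE is non-nil,
apply `my-math-demo-propertize' before scanning."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\\ "\\" table)  ; \ is an escape character
      (modify-syntax-entry ?\[ "(]" table)
      (modify-syntax-entry ?\] ")[" table)
      (set-syntax-table table))
    (insert text)
    (when propertize
      ;; Buffer-local on purpose: other buffers keep their own setting.
      (setq-local syntax-propertize-function my-math-demo-propertize)
      (setq-local parse-sexp-lookup-properties t)
      (syntax-propertize (point-max)))
    (goto-char (point-min))
    (search-forward "[")
    (scan-sexps (1- (point)) 1)))
```

Called with the text If[x < \[Pi], True, False] and a non-nil second argument, the scan should run to the end of the text. With nil as the second argument it should stop right after the ] of \[Pi], which is the mismatch described in Issue 1.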

**End Solution to Issue 1**

I am implementing a major-mode in Emacs derived from cc-mode for editing Mathematica files. The goal is syntax highlighting and indentation. I will leave interfacing with the Mathematica kernel for later.

I have the basic functionality working, but there are a couple of sticking points that are giving me trouble.

**Issue 1 - The \ character is used both as an escape character and as the prefix of multi-character, bracketed keywords.**

Like many languages, Mathematica uses the \ character to escape " and other \ characters in strings.

Mathematica has what it calls Syntax Characters, like \[Times], \[Element], \[Infinity], etc., that represent Mathematica operators and constants.

And, Mathematica makes heavy use of [ and ] instead of ( and ) for function definitions and calls, etc.

So, if I mark \ as an escape character in the syntax-table, then my brackets become mis-matched anywhere I use a Syntax Character. E.g.,

    If[x < \[Pi], True, False]

Of course, cc-mode is intent on ignoring the [ right after the \. Given the functional nature of Mathematica, the mode is almost useless if it cannot match brackets. Think lisp without paren matching.

If I don't put \ in the syntax-table as an escape character, then how do I handle escape sequences in comments and strings?

It would be great if I could put Times, Element, Infinity, etc in a keyword list and have everything work correctly.
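
For the highlighting half of that wish, a keyword list can be turned into a single font-lock pattern. This is only a sketch: the list contents and all my-math-demo- names are hypothetical, and it addresses fontification only, not the bracket-matching problem, which still has to be solved at the syntax-table level.

```elisp
;; Hypothetical list; extend it with whichever named characters you use.
(defconst my-math-demo-named-chars
  '("Times" "Element" "Infinity" "Pi"))

;; Matches a literal backslash, "[", one of the names above, then "]".
;; Group 1 holds the bare name.
(defconst my-math-demo-named-char-regexp
  (concat "\\\\\\[" (regexp-opt my-math-demo-named-chars t) "]"))

;; A font-lock keyword entry that highlights the whole \[Name] token.
(defconst my-math-demo-named-char-keywords
  `((,my-math-demo-named-char-regexp 0 font-lock-constant-face)))
```

One way to try it would be (font-lock-add-keywords 'math-mode my-math-demo-named-char-keywords), although with cc-mode it may fit better alongside the other matchers in c-cpp-matchers.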

**Issue 2 - The syntax of Mathematica is different enough from C, C++, Java, ObjC, etc. that cc-mode's built-in syntactic analysis doesn't always produce the desired result.**

Consider the following code block:

    FooBar[expression1,
           expression2,
           expression3];

This formats beautifully because the expressions are recognized as an argument list.

However, if a list is passed as an argument,

    FooBar[{expression1,
                expression2,
                expression3}];

the result is not pretty because the expressions are considered continuations of a single statement within the { and }. Unfortunately, the simple hack of setting c-continuation-offset to 0 breaks actual continuations like,

    addMe[x_Real, y_Real] :=
        Plus[x, y];

which you want to be indented.

The issue is that in Mathematica { and } delineate lists and not code blocks.
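
To make the desired behaviour concrete, here is a minimal sketch of an indentation rule that treats { } exactly like [ ] and ( ), while still indenting the line after :=. It is not how cc-mode works and not a drop-in fix; the function names are hypothetical, the offset of 4 is an assumption taken from the addMe example, and comments and other continuation operators are ignored.

```elisp
(defun my-math-demo-indent-column ()
  "Return a column for the current line (hypothetical helper).
Inside any of [ ], { } or ( ), align with the first token after the
innermost opener; after a line ending in :=, return 4; otherwise 0."
  (save-excursion
    (back-to-indentation)
    ;; (nth 1 (syntax-ppss)) is the position of the innermost opener.
    (let ((open (nth 1 (syntax-ppss))))
      (cond
       (open
        (goto-char (1+ open))
        (skip-chars-forward " \t")
        (current-column))
       (t
        ;; Top level: look at what ends the preceding code.
        (skip-chars-backward " \t\n")
        (if (and (>= (- (point) 2) (point-min))
                 (string= (buffer-substring (- (point) 2) (point)) ":="))
            4
          0))))))

(defun my-math-demo-column-for-line (text line)
  "Insert TEXT in a temp buffer and return the column chosen for LINE."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (forward-line (1- line))
    (my-math-demo-indent-column)))
```

With this rule the second line of FooBar[{expression1, ... lines up under expression1 because the innermost opener is the {, and the Plus[x, y]; line after := still gets an indent.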

Here is the current elisp file I am using:

(require 'cc-mode)

;; These are required at compile time to get the sources for the
;; language constants.                                                                          
(eval-when-compile
  (require 'cc-langs)
  (require 'cc-fonts))

;; Add math mode to the language constant system. This needs to be
;; done at compile time because that is when the language constants                             
;; are evaluated.                                                                               
(eval-and-compile
  (c-add-language 'math-mode 'c-mode))


;; Function names                                                                               
(c-lang-defconst c-cpp-matchers
  math (append
        (c-lang-const c-cpp-matchers c)
        ;; Abc[                                                                                 
        '(("\\<\\([A-Z][A-Za-z0-9]*\\)\\>\\[" 1 font-lock-type-face))
        ;; abc[                                                                                 
        '(("\\<\\([A-Za-z][A-Za-z0-9]*\\)\\>\\[" 1 font-lock-function-name-face))
        ;; Abc                                                                                  
        '(("\\<\\([A-Z][A-Za-z0-9]*\\)\\>" 1 font-lock-keyword-face))
        ;; abc_                                                                                 
        '(("\\<\\([a-z][A-Za-z0-9]*[_]\\)\\>" 1 font-lock-variable-name-face))
        ))

;; font-lock-comment-face                                                                       
;; font-lock-doc-face                                                                           
;; font-lock-string-face                                                                        
;; font-lock-keyword-face
;; font-lock-function-name-face                                                                 
;; font-lock-constant-face                                                                      
;; font-lock-type-face                                                                          
;; font-lock-builtin-face                                                                       
;; font-lock-reference-face                                                                     
;; font-lock-warning-face                                                                       


;; There is no line comment character.                                                          
(c-lang-defconst c-line-comment-starter
  math nil)

;; The block comment starter is (*.                                                             
(c-lang-defconst c-block-comment-starter
  math "(*")

;; The block comment ender is *).                                                               
(c-lang-defconst c-block-comment-ender
  math "*)")

;; The assignment operators.                                                                    
(c-lang-defconst c-assignment-operators
  math '("=" ":=" "+=" "-=" "*=" "/=" "->" ":>"))

;; The operators.                                                                               
(c-lang-defconst c-operators
  math `(
         ;; Unary.                                                                              
         (prefix "+" "-" "!")
         ;; Multiplicative.                                                                     
         (left-assoc "*" "/")
         ;; Additive.                                                                           
         (left-assoc "+" "-")
         ;; Relational.                                                                         
         (left-assoc "<" ">" "<=" ">=")
         ;; Equality.                                                                           
         (left-assoc "==" "=!=")  
         ;; Assignment.                                                                         
         (right-assoc ,@(c-lang-const c-assignment-operators))
         ;; Sequence.                                                                           
         (left-assoc ",")))


;; Syntax modifications necessary to recognize keywords with                                    
;; punctuation characters.                                                                      
;; (c-lang-defconst c-identifier-syntax-modifications                                           
;;   math (append '((?\\ . "w"))                                                                
;;             (c-lang-const c-identifier-syntax-modifications)))                               

;; Constants.                                                                                   
(c-lang-defconst c-constant-kwds
  math '( "False" "True" )) ;; "\\[Infinity]" "\\[Times]" "\\[Divide]" "\\[Sqrt]" "\\[Element]"))

(defcustom math-font-lock-extra-types nil
  "Extra types to recognize in math mode.")

(defconst math-font-lock-keywords-1 (c-lang-const c-matchers-1 math)
  "Minimal highlighting for math mode.")

(defconst math-font-lock-keywords-2 (c-lang-const c-matchers-2 math)
  "Fast normal highlighting for math mode.")

(defconst math-font-lock-keywords-3 (c-lang-const c-matchers-3 math)
  "Accurate normal highlighting for math mode.")

(defvar math-font-lock-keywords math-font-lock-keywords-3
  "Default expressions to highlight in math mode.")

(defvar math-mode-syntax-table nil
  "Syntax table used in math mode.")

(message "Setting math-mode-syntax-table to nil to force re-initialization")
(setq math-mode-syntax-table nil)

;; If a syntax table has not yet been set, allocate a new syntax table                          
;; and setup the entries.                                                                       
(unless math-mode-syntax-table
  (setq math-mode-syntax-table
        (funcall (c-lang-const c-make-mode-syntax-table math)))

  (message "Modifying the math-mode-syntax-table")

  ;; character (                                                                                
  ;; ( - open paren class                                                                       
  ;; ) - matching paren character                                                               
  ;; 1 - 1st character of comment delimiter (**)
  ;; n - nested comments allowed                                                                
  (modify-syntax-entry ?\( "()1n" math-mode-syntax-table)

  ;; character )                                                                                
  ;; ) - close paren class
  ;; ( - matching paren character
  ;; 4 - 4th character of comment delimiter (**)
  ;; n - nested comments allowed                                                                
  (modify-syntax-entry ?\) ")(4n" math-mode-syntax-table)

  ;; character *                                                                                
  ;; . - punctuation class                                                                      
  ;; 2 - 2nd character of comment delimiter (**)
  ;; 3 - 3rd character of comment delimiter (**)
  (modify-syntax-entry ?\* ". 23n" math-mode-syntax-table)

  ;; character [                                                                                
  ;; ( - open paren class                                                                       
  ;; ] - matching paren character                                                               
  (modify-syntax-entry ?\[ "(]" math-mode-syntax-table)

  ;; character ]                                                                                
  ;; ) - close paren class                                                                      
  ;; [ - matching paren character
  (modify-syntax-entry ?\] ")[" math-mode-syntax-table)

  ;; character {                                                                                
  ;; ( - open paren class                                                                       
  ;; } - matching paren character                                                               
  (modify-syntax-entry ?\{ "(}" math-mode-syntax-table)

  ;; character }                                                                                
  ;; ) - close paren class                                                                      
  ;; { - matching paren character                                                               
  (modify-syntax-entry ?\} "){" math-mode-syntax-table)

  ;; The following characters are punctuation (i.e. they cannot appear                          
  ;; in identifiers).                                                                           
  ;;                                                                                            
  ;; / ' % & + - ^ < > = |                                                                      
  (modify-syntax-entry ?\/ "." math-mode-syntax-table)
  (modify-syntax-entry ?\' "." math-mode-syntax-table)
  (modify-syntax-entry ?% "." math-mode-syntax-table)
  (modify-syntax-entry ?& "." math-mode-syntax-table)
  (modify-syntax-entry ?+ "." math-mode-syntax-table)
  (modify-syntax-entry ?- "." math-mode-syntax-table)
  (modify-syntax-entry ?^ "." math-mode-syntax-table)
  (modify-syntax-entry ?< "." math-mode-syntax-table)
  (modify-syntax-entry ?= "." math-mode-syntax-table)
  (modify-syntax-entry ?> "." math-mode-syntax-table)
  (modify-syntax-entry ?| "." math-mode-syntax-table)

  ;; character $                                                                                
  ;; _ - symbol constituent class (since $ is allowed in identifier names)
  (modify-syntax-entry ?\$ "_" math-mode-syntax-table)

  ;; character \                                                                                
  ;; . - punctuation class (for now treat \ as punctuation                                      
  ;;     until we can fix the \[word] issue).                                                   
  (modify-syntax-entry ?\\ "." math-mode-syntax-table)

  ) ;; end of math-mode-syntax-table adjustments                                                

;;                                                                                              
;;                                                                                              
(defvar math-mode-abbrev-table nil
  "Abbreviation table used in math mode buffers.")

(defvar math-mode-map (let ((map (c-make-inherited-keymap)))
                        map)
  "Keymap used in math mode buffers.")

;; math-mode                                                                                    
;;                                                                                              
(defun math-mode ()
  "Major mode for editing Mathematica code."

  (interactive)
  (kill-all-local-variables)

  (c-initialize-cc-mode t)

  (set-syntax-table math-mode-syntax-table)

  (setq major-mode 'math-mode
        mode-name "Math"
        local-abbrev-table math-mode-abbrev-table
        abbrev-mode t)

  (use-local-map math-mode-map)

  (c-init-language-vars math-mode)
  (c-common-init 'math-mode)

  (run-hooks 'c-mode-common-hook)
  (run-hooks 'math-mode-hook)
  (c-update-modeline))

(provide 'math-mode)                   

And a screenshot of some formatted code.

RandomBits asked Mar 11 '13 04:03


1 Answer

While cc-mode is designed to be adaptable to various languages, I'm not sure it will serve you well for Mathematica, because its syntax is too far from the ones cc-mode supports well. I would suggest trying SMIE (an indentation engine that appeared in Emacs-23.4 and that was originally built for SML but is currently used for a variety of languages). Just like cc-mode, SMIE is not ideal for all languages either, but I wouldn't be surprised if it works better than cc-mode in your case.

For the backslash issue, your best bet is to use syntax-propertize-function to change the escaping-nature of specific backslashes (either set \ as escaping in the syntax-table and then mark the \ of \[foo] as non-escaping, or leave the \ as non-escaping in the syntax-table and then mark those \ of \" and \\ as escaping).
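
The second variant (leave \ non-escaping in the syntax table and mark only the \ of \" and \\ as escaping) could look roughly like the sketch below. The my-math-demo- names are hypothetical, the syntax table is a minimal stand-in, and the rule is applied everywhere rather than only inside strings and comments, which is a simplification.

```elisp
;; Give a backslash escape syntax only when it precedes " or another \.
;; Group 1 is the backslash itself; \[Name] is left untouched.
(defconst my-math-demo-escape-propertize
  (syntax-propertize-rules
   ("\\(\\\\\\)[\\\"]" (1 "\\"))))

(defun my-math-demo-scan (text start)
  "Insert TEXT in a temp buffer and return the end of the sexp at START."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\\ "." table)   ; \ is plain punctuation
      (modify-syntax-entry ?\[ "(]" table)
      (modify-syntax-entry ?\] ")[" table)
      (set-syntax-table table))
    (setq-local syntax-propertize-function my-math-demo-escape-propertize)
    (setq-local parse-sexp-lookup-properties t)
    (insert text)
    (syntax-propertize (point-max))
    (scan-sexps start 1)))
```

With this in place a string such as "a\"b" should scan as one string, and brackets around \[Pi] should still pair up because that backslash never becomes an escape.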

Stefan answered Sep 29 '22 06:09