How do you implement syntax highlighting?

I am embarking on some learning and I want to write my own syntax highlighting for files in C++.

Can anyone give me ideas on how to go about doing this?

To me it seems that when a file is opened:

  1. The file would need to be inspected to decide what type of source file it is. Trusting the extension might not be foolproof.

  2. There needs to be a way to know which keywords/commands apply to which language.

  3. There needs to be a way to decide what color each keyword/command gets.

I want to do this on OS X, using C++ or Objective-C.

Can anyone provide pointers on how I might get started with this?

MLS asked Apr 17 '10 00:04


1 Answer

Syntax highlighters typically don't go beyond lexical analysis, which means you don't have to parse the whole language into statements and declarations and expressions and whatnot. You only have to write a lexer, which is fairly easy with regular expressions. I recommend you start by learning regular expressions, if you haven't already. It'll take all of 30 minutes.
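
To make the "lexer from regular expressions" idea concrete, here is a minimal sketch in C++ using `std::regex`. The token categories, the tiny keyword list, and the names `TokenKind`, `Token` and `tokenize` are assumptions made up for illustration; a real C++ highlighter would need many more rules (block comments, character literals, preprocessor lines and so on).

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token categories; a real highlighter would have more.
enum class TokenKind { Comment, String, Keyword, Number, Identifier, Other };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits `src` into tokens by trying each rule at the current position.
// Rules are tried in order, so keywords must come before identifiers.
std::vector<Token> tokenize(const std::string& src) {
    static const std::vector<std::pair<TokenKind, std::regex>> rules = {
        {TokenKind::Comment,    std::regex(R"re(//[^\n]*)re")},
        {TokenKind::String,     std::regex(R"re("(\\.|[^"\\\n])*")re")},
        {TokenKind::Keyword,    std::regex(R"re((int|return|if|else|while|for)\b)re")},
        {TokenKind::Number,     std::regex(R"re([0-9]+)re")},
        {TokenKind::Identifier, std::regex(R"re([A-Za-z_][A-Za-z0-9_]*)re")},
    };

    std::vector<Token> tokens;
    auto pos = src.cbegin();
    while (pos != src.cend()) {
        bool matched = false;
        for (const auto& rule : rules) {
            std::smatch m;
            // match_continuous forces the match to start exactly at `pos`.
            if (std::regex_search(pos, src.cend(), m, rule.second,
                                  std::regex_constants::match_continuous)) {
                tokens.push_back({rule.first, m.str()});
                pos += m.length();
                matched = true;
                break;
            }
        }
        if (!matched) {
            // Whitespace, punctuation, etc.: pass through one character.
            tokens.push_back({TokenKind::Other, std::string(1, *pos)});
            ++pos;
        }
    }
    return tokens;
}
```

Calling `tokenize("int x = 42; // hi")` yields a `Keyword` token `int` first and a `Comment` token `// hi` last; each token can then be wrapped in whatever markup or color the output needs.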

You may want to consider toying with Flex (the lexical analyzer generator; https://github.com/westes/flex) as a learning exercise. It should be quite easy to implement a basic syntax highlighter in Flex that outputs, say, highlighted HTML.

In short, you would give Flex a set of regular expressions and what to do with matching text, and the generated lexer will greedily match the input against your expressions (the longest match wins, and ties go to the rule listed first). You can make your lexer transition among exclusive states, which Flex calls start conditions (e.g. in and out of string literals, comments, etc.), as shown in the Flex FAQ. Here's a canonical example of a lexer for C written in Flex: http://www.lysator.liu.se/c/ANSI-C-grammar-l.html
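
The state-switching idea does not depend on Flex; it can be sketched by hand in C++ as a small state machine. The following is only an illustration and not what Flex generates: the names `State`, `Segment` and `splitByState` are hypothetical, and only `/* */` comments and double-quoted strings are handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical lexer states, analogous to exclusive states in a Flex lexer.
enum class State { Code, BlockComment, StringLiteral };

struct Segment {
    State state;       // state the text was scanned in
    std::string text;  // raw source text of the segment
};

// Walks `src` once, switching state on /*, */ and double quotes, and
// returns the consecutive segments seen in each state.
std::vector<Segment> splitByState(const std::string& src) {
    std::vector<Segment> out;
    State state = State::Code;
    std::string current;

    // Closes the segment being built (if non-empty) and enters `next`.
    auto switchTo = [&](State next) {
        if (!current.empty()) out.push_back({state, current});
        current.clear();
        state = next;
    };

    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char peek = (i + 1 < src.size()) ? src[i + 1] : '\0';
        switch (state) {
        case State::Code:
            if (c == '/' && peek == '*') {
                switchTo(State::BlockComment);
                current += "/*";
                ++i;  // consume the '*' as well
            } else if (c == '"') {
                switchTo(State::StringLiteral);
                current += c;
            } else {
                current += c;
            }
            break;
        case State::BlockComment:
            current += c;
            if (c == '*' && peek == '/') {
                current += '/';
                ++i;  // consume the '/' as well
                switchTo(State::Code);
            }
            break;
        case State::StringLiteral:
            current += c;
            if (c == '\\' && peek != '\0') {
                current += peek;  // keep the escaped character, e.g. \"
                ++i;
            } else if (c == '"') {
                switchTo(State::Code);
            }
            break;
        }
    }
    switchTo(State::Code);  // flush what is left (possibly unterminated)
    return out;
}
```

Because the string state never looks for `/*`, a literal such as `"/* no */"` comes back as a single `StringLiteral` segment, which is exactly the kind of mistake exclusive states are meant to prevent.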

Making an extensible syntax highlighter would be the next part of your journey. Although I am by no means a fan of XML, take a look at how Kate syntax highlighting files are defined, such as the one for C++. Your task would be to figure out how you want to define syntax highlighters, then make a program that uses those definitions to generate HTML or whatever you please.
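
As a rough sketch of what "driven by definitions" could look like, the C++ below keeps everything language-specific in a data structure and leaves the HTML generator generic. It is not Kate's format, which is far richer; `LanguageDefinition`, `escapeHtml`, `highlightToHtml` and the CSS class names `kw` and `cm` are all assumptions for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical, deliberately tiny language definition. It could be loaded
// from a file; here it is filled in directly for brevity.
struct LanguageDefinition {
    std::set<std::string> keywords;
    std::string lineComment;  // e.g. "//"; empty means no line comments
};

// Escapes the characters that are special in HTML text.
std::string escapeHtml(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        default:  out += c;
        }
    }
    return out;
}

// Wraps keywords and line comments of `src` in <span> elements, driven
// entirely by `lang`. The class names "kw" and "cm" are assumptions.
std::string highlightToHtml(const std::string& src,
                            const LanguageDefinition& lang) {
    std::string html;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (!lang.lineComment.empty() &&
            src.compare(i, lang.lineComment.size(), lang.lineComment) == 0) {
            // Line comment: runs to the end of the line.
            std::size_t end = src.find('\n', i);
            if (end == std::string::npos) end = src.size();
            html += "<span class=\"cm\">" +
                    escapeHtml(src.substr(i, end - i)) + "</span>";
            i = end;
        } else if (std::isalpha(c) || c == '_') {
            // Word: look it up in the definition's keyword set.
            std::size_t end = i;
            while (end < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[end])) ||
                    src[end] == '_'))
                ++end;
            const std::string word = src.substr(i, end - i);
            if (lang.keywords.count(word))
                html += "<span class=\"kw\">" + word + "</span>";
            else
                html += word;
            i = end;
        } else {
            html += escapeHtml(std::string(1, src[i]));
            ++i;
        }
    }
    return html;
}
```

With `LanguageDefinition cpp{{"int", "return"}, "//"};`, the call `highlightToHtml("int a<b; // x", cpp)` returns `<span class="kw">int</span> a&lt;b; <span class="cm">// x</span>`. Supporting another language then means supplying a different definition, not changing the generator.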

Joey Adams answered Sep 29 '22 16:09