
Parsing an OCaml file with OCaml

I want to analyze OCaml files (.ml) using OCaml, breaking each file into an abstract syntax tree (AST) for analysis. I have attempted to use camlp4 but have had no luck. Has anyone done this successfully before? Is camlp4 the best way to parse an OCaml file?

cmanning asked Apr 23 '14 02:04



2 Answers

(I assume you already know the basics of OCaml: how to write OCaml code, link modules and libraries, write build scripts, and so on. If not, learn those first.)

The best way is to use the genuine OCaml parser, the one used by the OCaml compiler itself, since it is 100% compatible by definition.

CamlP4 also implements an OCaml parser, but it is slightly incompatible with the genuine parser, and its parse tree is specialized for writing syntax extensions, which makes it a poor fit for other kinds of analysis.

You may want to parse .ml files that use syntax extensions written with P4. Even in this case, you should stick to the genuine parser: desugar the source code with P4 first, then pass the result to your analyzer, which reads it with the genuine parser.

To use the OCaml compiler's parser, the easiest approach is the compiler-libs.common OCamlFind package, which contains the parser and type checker of the OCaml compiler.

Start by modifying driver/compile.ml in the OCaml compiler source. It implements the major compilation phases: calling the preprocessor, parsing, typing, and then code generation. To parse .ml files, modify (or simplify) Compile.implementation; for .mli files, Compile.interface.
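As a rough starting point, the parsing step alone can be sketched without copying the whole driver. This is a minimal sketch, not the code from driver/compile.ml: it assumes a reasonably recent OCaml (Fun.protect needs 4.08 or later) and linking against compiler-libs.common. The names parse_string, parse_file, and parse_ml are made up for illustration; Parse.implementation, Location.init, and Printast.implementation come from compiler-libs.

```ocaml
(* Assumed build command:
   ocamlfind ocamlopt -package compiler-libs.common -linkpkg parse_ml.ml -o parse_ml *)

(* Parse implementation (.ml) source held in a string. *)
let parse_string (src : string) : Parsetree.structure =
  Parse.implementation (Lexing.from_string src)

(* Parse a .ml file from disk. Location.init records the file name in the
   lexer buffer so that locations in the tree refer to the right file. *)
let parse_file (filename : string) : Parsetree.structure =
  let ic = open_in filename in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () ->
      let lexbuf = Lexing.from_channel ic in
      Location.init lexbuf filename;
      Parse.implementation lexbuf)

let () =
  match Sys.argv with
  | [| _; filename |] ->
    let ast = parse_file filename in
    Printf.printf "%s: %d top-level items\n" filename (List.length ast);
    (* Dump the parse tree in a format similar to ocamlc -dparsetree. *)
    Printast.implementation Format.std_formatter ast;
    Format.print_flush ()
  | _ -> prerr_endline "usage: parse_ml FILE.ml"
```

For .mli files, Parse.interface plays the same role and returns a Parsetree.signature. A source file with syntax errors makes the parser raise an exception, so a real analyzer would probably want to catch and report it.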

Good luck.

camlspotter answered Sep 27 '22 23:09


Couldn't you use the -dparsetree option to the ocaml compiler?

hello.ml:

let _ = print_endline "Hello AST"

Now compile it:

$ ocamlc -dparsetree hello.ml

Which results in:

[
  structure_item (hello.ml[1,0+0]..[1,0+33])
    Pstr_eval
    expression (hello.ml[1,0+8]..[1,0+33])
      Pexp_apply
      expression (hello.ml[1,0+8]..[1,0+21])
        Pexp_ident "print_endline" (hello.ml[1,0+8]..[1,0+21])
      [
        <label> ""
          expression (hello.ml[1,0+22]..[1,0+33])
            Pexp_constant Const_string("Hello AST",None)
      ]
]

See also this blog post on -ppx extensions, which covers extension points (the new way of writing syntax extensions as of OCaml 4.02) and describes several of the AST manipulation modules.
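To give a flavour of what such AST manipulation looks like, here is a small sketch that walks a parse tree and collects the identifiers used in expressions. It is not taken from the blog post; it assumes a compiler-libs.common recent enough to ship the Ast_iterator module, and the function names idents_of_structure and idents_of_string are hypothetical.

```ocaml
(* Assumed build command:
   ocamlfind ocamlopt -package compiler-libs.common -linkpkg idents.ml -o idents *)

(* Collect the last component of every identifier that appears in an
   expression, in source order. *)
let idents_of_structure (ast : Parsetree.structure) : string list =
  let open Ast_iterator in
  let found = ref [] in
  let expr self (e : Parsetree.expression) =
    (match e.Parsetree.pexp_desc with
     | Parsetree.Pexp_ident { Location.txt = lid; _ } ->
       found := Longident.last lid :: !found
     | _ -> ());
    (* Delegate to the default iterator so sub-expressions are visited too. *)
    default_iterator.expr self e
  in
  let it = { default_iterator with expr } in
  it.structure it ast;
  List.rev !found

(* Convenience wrapper: parse .ml source from a string, then collect. *)
let idents_of_string (src : string) : string list =
  idents_of_structure (Parse.implementation (Lexing.from_string src))
```

Applied to the contents of hello.ml above, idents_of_string should find the single identifier print_endline, matching the Pexp_ident node in the dump.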

aneccodeal answered Sep 28 '22 00:09