Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PL/SQL ANTLR grammar fails for some PL/SQL files?

I'am using ANTLR4 to generate Abstract Syntax Tree (AST) for PL/SQL codes. For some queries it works fine, but for some codes it generates an AST with only one node which is not right.

For example:

DECLARE
   a   RAW;            -- migrate to BLOB
   b   LONG RAW;       -- migrate to BLOB
   c   LONG VARCHAR;   -- migrate to LOB
   d   LONG;           -- migrate to LOB
   x   VARCHAR;
   CURSOR mycur RETURN LONG;     -- should flag
   FUNCTION myfunc RETURN LONG RAW    -- should flag
   IS
      z  LONG RAW;    -- should flag
   BEGIN 
      RETURN z;
   END;  
BEGIN
   SELECT mycol, CAST(col2 AS RAW)      -- should flag
     INTO a 
     FROM mytab 
    WHERE (b IS OF TYPE(LONG RAW, RAW, VARCHAR)); -- should flag
END;
CREATE TABLE tab (
   a   RAW,            -- should flag
   b   LONG RAW,       -- should flag
   c   LONG VARCHAR,   -- should flag
   d   LONG,           -- should flag
   x   VARCHAR
);

For this code it generates this AST:

(compilation_unit DECLARE a RAW ; b LONG RAW ; c LONG VARCHAR ; d LONG ; x VARCHAR ; CURSOR mycur RETURN LONG ; FUNCTION myfunc RETURN LONG RAW IS z LONG RAW ; BEGIN RETURN z ; END ; BEGIN SELECT mycol , CAST ( col2 AS RAW ) INTO a FROM mytab WHERE ( b IS OF TYPE ( LONG RAW , RAW , VARCHAR ) ) ; END ; CREATE TABLE tab ( a RAW , b LONG RAW , c LONG VARCHAR , d LONG , x VARCHAR ) ;)

which is merely the given code within compilation_unit node.

But if i give same code without declare part it gives good AST.

For this code:

FUNCTION myfunc RETURN LONG RAW    -- should flag
   IS
      z  LONG RAW;    -- should flag
   BEGIN 
      RETURN z;
   END;  
BEGIN
   SELECT mycol, CAST(col2 AS RAW)      -- should flag
     INTO a 
     FROM mytab 
    WHERE (b IS OF TYPE(LONG RAW, RAW, VARCHAR)); -- should flag
END;
CREATE TABLE tab (
   a   RAW,            -- should flag
   b   LONG RAW,       -- should flag
   c   LONG VARCHAR,   -- should flag
   d   LONG,           -- should flag
   x   VARCHAR
);

it gives this AST:

(compilation_unit (unit_statement (create_function_body FUNCTION (function_name (id (id_expression (regular_id myfunc)))) RETURN (type_spec (datatype (native_datatype_element LONG RAW))) IS (declare_spec (variable_declaration (variable_name (id_expression (regular_id z))) (type_spec (datatype (native_datatype_element LONG RAW))) ;)) (body BEGIN (seq_of_statements (statement (return_statement RETURN (condition (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id z))))))))))))))))))))) ;) END) ;)) BEGIN (unit_statement (data_manipulation_language_statements (select_statement (subquery (subquery_basic_elements (query_block SELECT (selected_element (select_list_elements (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id mycol)))))))))))))))))))) , (selected_element (select_list_elements (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (standard_function CAST ( (concatenation_wrapper (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id col2)))))))))))) AS (type_spec (datatype (native_datatype_element RAW))) ))))))))))))))))) (into_clause INTO (variable_name (id_expression (regular_id a)))) (from_clause FROM (table_ref_list (table_ref (table_ref_aux (dml_table_expression_clause (tableview_name (id (id_expression (regular_id mytab))))))))) (where_clause WHERE (condition_wrapper (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom ( (expression_or_vector (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id b)))))))))))))) IS OF TYPE ( (type_spec (datatype (native_datatype_element LONG RAW))) , (type_spec (datatype (native_datatype_element RAW))) , (type_spec (datatype (native_datatype_element VARCHAR))) )))))) ))))) ; END ;)))))))))))))))))) unit_statement (unit_statement CREATE TABLE tab) (unit_statement (data_manipulation_language_statements (select_statement (subquery (subquery_basic_elements ( (subquery (subquery_basic_elements a RAW , b LONG RAW , c LONG VARCHAR , d LONG , x VARCHAR)) )) ;)))) <EOF>)

So it seems like some keywords(rules) like declare are missing in the PL/SQL grammar file. I used this plsql.g4 file with Antlr which is in the Antlr site.

Is there any way I can find an updated plsql.g4 file or do I need to add those rules myself?

like image 306
Chathura Wijeweera Avatar asked Dec 17 '15 12:12

Chathura Wijeweera


1 Answers

The problem seems to actually be that plsql antlr-grammar doesn't support LONG VARCHAR datatype, in this part: "c LONG VARCHAR;"

It leads to declaration being misinterpreted as a procedure call messing up rest of the parsing.

I have opened an issue in the grammar github: https://github.com/antlr/grammars-v4/issues/2158

like image 193
siggemannen Avatar answered Oct 05 '22 23:10

siggemannen