Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsing SQL Queries in C++ using Boost.Spirit

I have created a database engine in which I can create and modify tables, and add them to a database. For parsing the SQL queries, I have implemented the Boost.Spirit library using EBNF form. I have the parser setup properly and it successfully parses every rule.

My problem is I now have no idea how to integrate the two. The Boost.Spirit parser only validates that input is correct, however I need it to actually do something. I looked up semantic actions but they don't seem to handle what I'm looking for.

For example, if I have a query such as:
new_table <- SELECT (id < 5) old_table;

I want it to validate the input using the rules, then call the function
Table Database::Select(Table t , Condition c){ ... }
and pass the tokens as arguments.

How do I go about integrating the parser?

like image 733
SamTheSammich Avatar asked Sep 27 '12 18:09

SamTheSammich


People also ask

How can I speed up my query performance?

The way to make a query run faster is to reduce the number of calculations that the software (and therefore hardware) must perform.

What is parser and optimizer in SQL?

The MySQL server receives queries in the SQL format. Once a query is received, it first needs to be parsed, which involves translating it from what is essentially a textual format into a combination of internal binary structures that can be easily manipulated by the optimizer.

What is parsing of query?

Parsing of a query is the process by which this decision making is done that for a given query, calculating how many different ways there are in which the query can run. Every query must be parsed at least once. The parsing of a query is performed within the database using the Optimizer component.


1 Answers

Note: I opted to invent a sample grammar here for demonstration purposes, since your question doesn't show yours. Using the approach recommended here, it should not be hard to code a function to execute your queries after parsing.

I would really suggest building a parse tree.

I would recommend attribute propagation in preference to semantic actions. See e.g.

  • Boost Spirit: "Semantic actions are evil"?

Attribute propagation rules are very flexible in Spirit. The default exposed attributes types are well documented right with each Parser's documentation

  • http://www.boost.org/doc/libs/1_48_0/libs/spirit/doc/html/spirit/qi/reference.html

E.g. -qi::char_ would result in boost::optional<char> and qi::double_ | qi::int_ would result in boost::variant<double, int>.

You will probably want to accumulate the parsed elements in a AST datatype of your own invention, e.g.:

struct SelectStatement
{
    std::vector<std::string> columns, fromtables; 
    std::string whereclause; // TODO model as a vector<WhereCondition> :) 

    friend std::ostream& operator<<(std::ostream& os, SelectStatement const& ss)
    {
        return os << "SELECT [" << ss.columns.size() << " columns] from [" << ss.fromtables.size() << " tables]\nWHERE " + ss.whereclause;
    }
};

You could adapt this to Spirits attribute propagation machinery by adapting the struct as a Fusion sequence:

BOOST_FUSION_ADAPT_STRUCT(SelectStatement, 
        (std::vector<std::string>, columns)
        (std::vector<std::string>, fromtables)
        (std::string, whereclause)
       )

Now you could parse the following rule into that type:

sqlident = lexeme [ alpha >> *alnum ]; // table or column name

columns  = no_case [ "select" ] >> (sqlident % ',');
tables   = no_case [ "from" ]   >> (sqlident % ',');

start    = columns >> tables 
    >> no_case [ "where" ]
    >> lexeme [ +(char_ - ';') ]
    >> ';';

You can see this code running live here: http://liveworkspace.org/code/0b525234dbce22cbd8becd69f84065c1

Full demo code:

// #define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>

namespace qi    = boost::spirit::qi;

struct SelectStatement
{
    std::vector<std::string> columns, fromtables; 
    std::string whereclause; // TODO model as a vector<WhereCondition> :) 

    friend std::ostream& operator<<(std::ostream& os, SelectStatement const& ss)
    {
        return os << "SELECT [" << ss.columns.size() << " columns] from [" << ss.fromtables.size() << " tables]\nWHERE " + ss.whereclause;
    }
};

BOOST_FUSION_ADAPT_STRUCT(SelectStatement, 
        (std::vector<std::string>, columns)
        (std::vector<std::string>, fromtables)
        (std::string, whereclause)
       )

template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, SelectStatement(), Skipper>
{
    parser() : parser::base_type(start)
    {
        using namespace qi;

        sqlident = lexeme [ alpha >> *alnum ]; // table or column name

        columns  = no_case [ "select" ] >> (sqlident % ',');
        tables   = no_case [ "from" ]   >> (sqlident % ',');

        start    = columns >> tables 
            >> no_case [ "where" ]
            >> lexeme [ +(char_ - ';') ]
            >> ';';

        BOOST_SPIRIT_DEBUG_NODE(start);
        BOOST_SPIRIT_DEBUG_NODE(sqlident);
        BOOST_SPIRIT_DEBUG_NODE(columns);
        BOOST_SPIRIT_DEBUG_NODE(tables);
    }

  private:
    qi::rule<It, std::string()             , Skipper> sqlident;
    qi::rule<It, std::vector<std::string>(), Skipper> columns  , tables;
    qi::rule<It, SelectStatement()         , Skipper> start;
};

template <typename C, typename Skipper>
    bool doParse(const C& input, const Skipper& skipper)
{
    auto f(std::begin(input)), l(std::end(input));

    parser<decltype(f), Skipper> p;
    SelectStatement query;

    try
    {
        bool ok = qi::phrase_parse(f,l,p,skipper,query);
        if (ok)   
        {
            std::cout << "parse success\n";
            std::cout << "query: " << query << "\n";
        }
        else      std::cerr << "parse failed: '" << std::string(f,l) << "'\n";

        if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";
        return ok;
    } catch(const qi::expectation_failure<decltype(f)>& e)
    {
        std::string frag(e.first, e.last);
        std::cerr << e.what() << "'" << frag << "'\n";
    }

    return false;
}

int main()
{
    const std::string input = "select id, name, price from books, authors where books.author_id = authors.id;";
    bool ok = doParse(input, qi::space);

    return ok? 0 : 255;
}

Will print output:

parse success
query: SELECT [3 columns] from [2 tables]
WHERE books.author_id = authors.id
like image 189
sehe Avatar answered Sep 20 '22 21:09

sehe