I have found and used an algorithm to solve a problem I had. The current problem is that I am not sure if this is bottom-up or top-down.
I have the following grammar:
query ::= andterm
| andterm "ANDNOT" query
andterm ::= orterm
| orterm "AND" andterm
orterm ::= term
| term "OR" orterm
term ::= "(" query ")"
| <word>
And thus do I have the following code:
struct index {
hashmap *map;
char *qword;
}
void querynext(iter, index) {
if (list_hasnext(iter)) {
index->qword = list_next(iter);
}
else index->qword = "";
}
set_t *parsequery(index, iter) {
set_t *andterm;
andterm = parseand(index, iter);
if(strcmp(index->qword, "ANDNOT") == 0) {
qnext(iter, index);
return set_different(andterm, parsequery(index, iter)):
}
else return andterm;
}
set_t *parseand(index, iter) {
set_t *orterm;
orterm = parseor(index, iter);
if(strcmp(index->qword, "AND") == 0) {
qnext(iter, index);
return set_intersection(orterm, parseand(index, iter));
}
else return orterm;
}
set_t *parseor(index, iter) {
set_t *term;
term = parseterm(index, iter);
if(strcmp(index->qword, "OR") == 0) {
qnext(iter, index);
return set_difference(term, parseor(index, iter));
}
else return term;
}
set_t *parseterm(index, iter) {
if(strcmp(index->qword, "(") == 0) {
qnext(iter, index);
set_t *parseset = parsequery(index, iter);
if(strcmp(index->qword, ")") == 0) {
qnext(iter, index);
return perseset;
}
}
if(map_haskey(index->map, index->qword)) {
set_t *parsekey;
parsekey = map_get(index->map, index->qword);
qnext(iter, index);
return parsekey;
}
else {
set_t emptyset;
emptyset = set_create;
return empty;
}
}
The flow of the current algorithm will be like this: Sample input of "blue AND html".
When it is ran through parsequery it will be fed through this process: parsequery->parseand->parseor->parseterm.
In parseterm it will be found to be in the hasmap. Parseterm will return a set with "blue". Parseterm will also iterate the query list one step further using qnext.
In parseor we'll now have a new word that is being tested, "AND", it isn't strcmp so therefore parseor returns term.
In parseand it will be strcmp == 0 so it will be taken into the loop. qnext will be ran, then return the intersection of orterm set ("blue") and the recursive parseand(index, iter).
The word html will also be found in the hashmap and the set will be returned upto parseand. Parseand will then return the set_intersection of "blue" and "html".
I don't know what to call this parsing, really. I've read book up and book down, pdf after pdf on parsing, but I am not sure. We start on the top, feed in the word, but the design of the algorithm will send the word to the bottom, to parseterm and then work it's way up again.
Sorry if this was a long post. I tried to explain my problem as best as I could.
In computer science, a recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent) where each such procedure implements one of the nonterminals of the grammar.
Top-down parsing is a parsing technique that first looks at the highest level of the parse tree and works down the parse tree by using the rules of grammar while Bottom-up Parsing is a parsing technique that first looks at the lowest level of the parse tree and works up the parse tree by using the rules of grammar.
Recursive descent parsing is an example of top-down parsing.
Further Top-down parser is classified into 2 types: A recursive descent parser, and Non-recursive descent parser. Recursive descent parser is also known as the Brute force parser or the backtracking parser.
In your code, there is a procedure for each non-terminal symbol which recursively calls procedures for parsing other non-terminals according to the rules of the grammar. So it is a recursive descent parser. It constructs the parse tree(implicitly) from top to bottom, which makes it a top down parser. It doesn't really matter how the additional information is propagated.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With