Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to implement a graph-structured stack?

Ok, so I would like to make a GLR parser generator. I know there exist such programs better than what I will probably make, but I am doing this for fun/learning so that's not important.

I have been reading about GLR parsing and I think I have a decent high level understanding of it now. But now it's time to get down to business.

The graph-structured stack (GSS) is the key data structure for use in GLR parsers. Conceptually I know how GSS works, but none of the sources I looked at so far explain how to implement GSS. I don't even have an authoritative list of operations to support. Can someone point me to some good sample code/tutorial for GSS? Google didn't help so far. I hope this question is not too vague.

like image 909
Emil Avatar asked Mar 19 '10 02:03

Emil


People also ask

What is graph stack?

In computer science, a graph-structured stack (GSS) is a directed acyclic graph where each directed path represents a stack. The graph-structured stack is an essential part of Tomita's algorithm, where it replaces the usual stack of a pushdown automaton.

What is graph data structure explain with an example?

A graph is a common data structure that consists of a finite set of nodes (or vertices) and a set of edges connecting them. A pair (x,y) is referred to as an edge, which communicates that the x vertex connects to the y vertex. In the examples below, circles represent vertices, while lines represent edges.

What do you mean by graph data structure?

Graphs in data structures are non-linear data structures made up of a finite number of nodes or vertices and the edges that connect them. Graphs in data structures are used to address real-world problems in which it represents the problem area as a network like telephone networks, circuit networks, and social networks.


1 Answers

Firstly, if you haven't already, you should read McPeak's paper on GLR http://www.cs.berkeley.edu/~smcpeak/papers/elkhound_cc04.ps. It is an academic paper, but it gives good details on GSS, GLR, and the techniques used to implement them. It also explains some of the hairy issues with implementing a GLR parser.

You have three parts to implementing a graph-structured stack.

I. The graph data structure itself

II. The stacks

III. GLR's use of a GSS

You are right, google isn't much help. And unless you like reading algorithms books, they won't be much help either.

I. The graph data structure

Rob's answer about "the direct representation" would be easiest to implement. It's a lot like a linked-list, except each node has a list of next nodes instead of just one.

This data structure is a directed graph, but as the McPeak states, the GSS may have cycles for epsilon-grammars.

II. The stacks

A graph-structured stack is conceptually just a list of regular stacks. For an unambiguous grammar, you only need one stack. You need more stacks when there is a parsing conflict so that you can take both parsing actions at the same time and maintain the different state both actions create. Using a graph allows you to take advantage of the fact that these stacks share elements.

It may help to understand how to implement a single stack with a linked-list first. The head of the linked list is the top of the stack. Pushing an element onto the stack is just creating a new head and pointing it to the old head. Popping an element off the stack is just moving the pointer to head->next.

In a GSS, the principle is the same. Pushing an element is just creating a new head node and pointing it to the old head. If you have two shift operations, you will push two elements onto the old head and then have two head nodes. Conceptually this is just two different stacks that happen share every element except the top ones. Popping an element is just moving the head pointer down the stack by following each of the next nodes.

III. GLR's use of the GSS

This is where McPeak's paper is a useful read.

The GLR algorithm takes advantage of the GSS by merging stack heads that have the same state element. This means that one state element may have more than one child. When reducing, the GLR algorithm will have to explore all possible paths from the stack head.

You can optimize GLR by maintaining the deterministic depth of each node. This is just the distance from a split in the stack. This way you don't always have to search for a stack split.

This is a tough task! So good luck!

like image 79
Paul Avatar answered Nov 18 '22 09:11

Paul