
Best way to store big prefix tree

Tags: c#, .net, algorithm

I am going to write a program that acts as an AI playing a board game against a human player. I want to keep every played game in a prefix tree and search it for similar games. But I am afraid the tree can get too big to be kept in memory. So what is the best way to store it and still be able to search it fast? I don't think writing it to a file is a good solution. Maybe in some kind of DB?

IordanTanev asked May 20 '11 11:05



1 Answer

What you're looking for is an embedded DB. Most of them are written in C or C++, but some have C# wrappers. I'd recommend Berkeley DB for .NET (which is a wrapper around Oracle's Berkeley DB).

I would recommend generating a unique hash for each prefix tree, chosen so that the hashes preserve locality: hashes for two similar prefix trees should be very close to each other. The game that you're referring to is known as Tic-Tac-Toe, so hashing similar Tic-Tac-Toe games should be easy. Here are some references (I didn't really read them; I just did a quick search for "hashing Tic-Tac-Toe" and these were the results):

  • I think this might be a Java example
  • TicTacToe strategic reduction
  • http://cg.scs.carleton.ca/~luc/1997notes/topic14/
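One very simple way to get keys with that kind of locality is sketched below in Python (the idea ports directly to C#). Note that this swaps the "hash" for a plain order-preserving encoding of the move sequence, which is my own assumption and not something taken from the references above: each move is a cell number 0-8, and the key is just those numbers as bytes. Games that share their opening moves then share a key prefix and end up next to each other when the keys are sorted. The names `game_key` and `shared_prefix_len` are made up for illustration.

```python
def game_key(moves):
    """Encode a Tic-Tac-Toe game as a byte-string key.

    `moves` is the list of cells (0-8) in the order they were played.
    Games with the same opening moves get keys with the same prefix.
    """
    if len(moves) > 9:
        raise ValueError("a Tic-Tac-Toe game has at most 9 moves")
    if any(not 0 <= m <= 8 for m in moves):
        raise ValueError("cells must be numbered 0-8")
    if len(set(moves)) != len(moves):
        raise ValueError("a cell cannot be played twice")
    return bytes(moves)


def shared_prefix_len(key_a, key_b):
    """Number of leading moves two encoded games have in common."""
    count = 0
    for a, b in zip(key_a, key_b):
        if a != b:
            break
        count += 1
    return count


if __name__ == "__main__":
    games = [[4, 0, 8], [0, 1, 2], [4, 0, 2], [4, 1, 7]]
    # Sorting the keys groups games by their common openings.
    for key in sorted(game_key(g) for g in games):
        print(list(key))
```

This only captures "similar" in the sense of "same opening". If you also want rotated or mirrored boards to count as similar, you would have to normalize the moves before encoding them.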

The hash is then stored in Berkeley DB, and the prefix tree is stored either in an auxiliary file or directly in the value. Since Berkeley DB stores key-value pairs, you can use the hash as the key and anything you like as the value (for example, your prefix tree or a path to an auxiliary file containing it). Then all you do is look up similar hashes and retrieve the corresponding trees, from the value or from the auxiliary file(s).
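To make the key-value layout concrete, here is a rough Python sketch. It uses the standard library's `sqlite3` as a stand-in embedded store, not Berkeley DB, so none of the calls below are Berkeley DB's API. The table name `games`, the helper names, and the choice of JSON for serializing the tree are all assumptions for illustration.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the embedded store; pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS games ("
        "key BLOB PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def put_tree(conn, key, tree):
    """Store a prefix tree (nested dicts) under its key, replacing any old one."""
    conn.execute(
        "INSERT OR REPLACE INTO games (key, value) VALUES (?, ?)",
        (key, json.dumps(tree)),
    )
    conn.commit()


def get_tree(conn, key):
    """Return the tree stored under `key`, or None if there is none."""
    row = conn.execute(
        "SELECT value FROM games WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else json.loads(row[0])


if __name__ == "__main__":
    store = open_store()
    # Hypothetical tree: move 4, then move 0, then move 8.
    put_tree(store, b"\x04\x00\x08", {"4": {"0": {"8": {}}}})
    print(get_tree(store, b"\x04\x00\x08"))
```

Storing a file path instead of the tree itself works the same way: the value column would just hold the path, and you would read the tree from that file after the lookup.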

Berkeley DB's B-tree access method keeps keys in sorted order, so similar keys sit next to each other and you can rely on it not to move them around and break the locality of your hashes. Since the locality won't be broken, you can make an additional optimization: retrieve a large page of key-value pairs at once, which reduces the number of look-ups and disk seeks.
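The "fetch a run of neighbouring keys in one go" idea can be sketched as a range scan over a sorted key space. Again this is Python with `sqlite3` standing in for Berkeley DB (SQLite compares BLOB keys byte by byte and can walk its primary-key index in order), so treat it as an illustration of the access pattern only. The table `games` and the helpers `make_store`, `prefix_upper_bound` and `scan_prefix` are hypothetical names.

```python
import sqlite3


def make_store():
    """Create an in-memory table of (key, value) pairs ordered by key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE games (key BLOB PRIMARY KEY, value TEXT)")
    return conn


def prefix_upper_bound(prefix):
    """Smallest byte string that sorts after every key starting with `prefix`.

    Returns None when no such bound exists (empty prefix or all 0xFF bytes).
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return None
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


def scan_prefix(conn, prefix, limit=100):
    """Fetch up to `limit` neighbouring entries whose key starts with `prefix`."""
    upper = prefix_upper_bound(prefix)
    if upper is None:
        rows = conn.execute(
            "SELECT key, value FROM games WHERE key >= ? ORDER BY key LIMIT ?",
            (prefix, limit),
        )
    else:
        rows = conn.execute(
            "SELECT key, value FROM games "
            "WHERE key >= ? AND key < ? ORDER BY key LIMIT ?",
            (prefix, upper, limit),
        )
    return [(bytes(k), v) for k, v in rows]


if __name__ == "__main__":
    store = make_store()
    for k in (b"\x04\x00\x08", b"\x00\x01\x02", b"\x04\x00\x02", b"\x04\x01\x07"):
        store.execute("INSERT INTO games VALUES (?, ?)", (k, k.hex()))
    # All stored games that opened with cell 4 followed by cell 0.
    for key, value in scan_prefix(store, b"\x04\x00"):
        print(list(key), value)
```

With Berkeley DB you would do the equivalent with a cursor positioned at the first key at or after the prefix and then stepping forward, but check the documentation of the wrapper you use for the exact calls.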

Kiril answered Oct 06 '22 21:10
