X-Y Fast Trie in real world applications

I am trying to understand the X-fast and Y-fast trie data structures, and it's not clear to me why these structures are not used in large databases, given that their asymptotic complexity is lower than O(log N). For a database holding terabytes of data, wouldn't it be better to use a Y-fast trie than, for example, a B-tree?

asked Oct 13 '14 17:10

4nf3rt

1 Answer

There are a few reasons that X-fast or Y-fast tries might not be useful in practice:

X-fast tries internally require several linked structures, including a bitwise trie and a doubly-linked list of elements. These don't perform well in database environments where elements are stored on disk and following a pointer can require a disk seek. (For similar reasons, databases often use B-trees over binary search trees.) Additionally, Y-fast tries require the use of balanced binary search trees augmented with the information needed to perform a split or join, which adds extra space and introduces even more pointers to follow.
X-fast tries internally require the use of hash tables with worst-case O(1) lookups. Hash tables with this guarantee usually need to evaluate several hash functions per lookup and (generally speaking) have worse locality than, say, a linear-probing hash table, so lookups are a bit slower.
Because of the hash tables in X-fast tries and the use of splitting and joining of BSTs in Y-fast tries, these two data structures are only amortized efficient rather than worst-case efficient. In some cases, this is unacceptable - it would be bad if, periodically, a database query ends up taking 100x or 1000x normal time even if on average everything works out quite well.
The use of hash tables in X-fast tries and Y-fast tries means that there is an element of randomness involved in the runtimes of the data structures. In expectation they're efficient, but it's possible that, due to bad luck, individual operations take much longer than usual. Specifically, the cost of doing a rehash in an internal hash table or doing a split or join on a tree can be quite high. In a database implementation, reliability is important, so this randomness might hurt.
Due to all the reasons outlined above, the constant factors buried in the runtimes of X-fast and Y-fast tries are quite large. In the long run, they should end up being faster than other data structures, but "the long run" might require inputs that are vastly larger than the sorts of data sets that could feasibly fit into a database.
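To make the constant-factor point a bit more concrete, here is a rough Python sketch (not a full X-fast trie, just the part that matters for this comparison). It keeps one hash set per prefix length and does the binary search over prefix lengths that an X-fast lookup performs, counting hash probes along the way. The names `build_levels`, `longest_prefix_probes` and `btree_height` are made up for this illustration, and the B-tree fanout and key count in the usage lines are assumptions, not measurements.

```python
W = 64  # assumed key width in bits


def build_levels(keys, w=W):
    """One hash set per prefix length: levels[i] holds every i-bit prefix."""
    return [{k >> (w - i) for k in keys} for i in range(w + 1)]


def longest_prefix_probes(levels, x, w=W):
    """Binary search over prefix lengths, as in an X-fast trie lookup.

    Returns (length of the longest stored prefix of x, hash probes used).
    Each probe hits a different hash table, so on disk each one is
    potentially a separate random access.
    """
    lo, hi, probes = 0, w, 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        probes += 1
        if (x >> (w - mid)) in levels[mid]:
            lo = mid      # a prefix of this length exists, try longer ones
        else:
            hi = mid - 1  # no such prefix, try shorter ones
    return lo, probes


def btree_height(n, fanout):
    """Number of levels a B-tree with this fanout needs to index n keys."""
    height, capacity = 1, fanout
    while capacity < n:
        capacity *= fanout
        height += 1
    return height


if __name__ == "__main__":
    levels = build_levels([3, 1 << 40, (1 << 63) + 12345])
    print(longest_prefix_probes(levels, (1 << 63) + 12000))
    # Assumption: about 2**36 keys (1 TB of 16-byte records), fanout 256.
    print(btree_height(2**36, 256))
```

With 64-bit keys the binary search needs at most 7 hash probes, each in a different table, before it even starts following pointers to the actual element. Under the assumed fanout, the B-tree touches about 5 nodes for 2^36 keys, and the top levels of a B-tree tend to stay cached. So at realistic database sizes the O(log log U) bound doesn't obviously beat O(log N) once each step is charged as a disk access.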

Hope this helps!

answered Nov 15 '22 04:11

templatetypedef

Tags:

time-complexity

database

data-structures
