Auto correct algorithm

My Approach

1) I am planning to build a tree so that searching will be efficient.

2) I don't understand how to implement the auto-correction feature.

3) I can implement the auto-complete feature using trees.

My tree Image

What's the best data structure and algorithm to implement all the above features?

asked Dec 11 '13 07:12

Ankith

2 Answers

I have been working on the same problem. So far, the best solution I have come across is to use a ternary search tree for auto-completion. Ternary search trees are typically more space-efficient than standard tries. If I am unable to find the looked-up string in my ternary search tree, I fall back to a prebuilt BK-tree to find the closest match. The BK-tree uses Levenshtein distance internally. You

Metaphones are also something you might want to explore; however, I haven't gone into them in depth.

I have a solution in Java for BK TREE + TERNARY SEARCH TREE if you like.
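As a rough illustration of the BK-tree fallback described above, here is a minimal C++ sketch (C++ because of the question's tag; this is not the Java solution mentioned). The names `levenshtein`, `BKTree`, `insert`, and `search` are hypothetical, and the sketch assumes plain Levenshtein distance as the metric and C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Two-row dynamic-programming Levenshtein distance.
int levenshtein(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int sub = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, sub});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Hypothetical BK-tree: each child edge is labelled with the distance
// between the child's word and its parent's word.
class BKTree {
public:
    void insert(const std::string& word) {
        if (!root_) { root_ = std::make_unique<BKNode>(word); return; }
        BKNode* node = root_.get();
        while (true) {
            int d = levenshtein(word, node->word);
            if (d == 0) return;  // word is already stored
            auto& child = node->children[d];
            if (!child) { child = std::make_unique<BKNode>(word); return; }
            node = child.get();
        }
    }

    // Returns all stored words within maxDist edits of query.
    std::vector<std::string> search(const std::string& query, int maxDist) const {
        assert(maxDist >= 0);
        std::vector<std::string> out;
        if (root_) searchNode(*root_, query, maxDist, out);
        return out;
    }

private:
    struct BKNode {
        explicit BKNode(std::string w) : word(std::move(w)) {}
        std::string word;
        std::map<int, std::unique_ptr<BKNode>> children;
    };

    static void searchNode(const BKNode& node, const std::string& query,
                           int maxDist, std::vector<std::string>& out) {
        int d = levenshtein(query, node.word);
        if (d <= maxDist) out.push_back(node.word);
        // Triangle inequality: only edges in [d - maxDist, d + maxDist]
        // can lead to words within maxDist of the query.
        for (auto it = node.children.lower_bound(d - maxDist);
             it != node.children.end() && it->first <= d + maxDist; ++it) {
            searchNode(*it->second, query, maxDist, out);
        }
    }

    std::unique_ptr<BKNode> root_;
};
```

With a dictionary of "book", "books", "cake", "boo", "cape", and "cart", calling `search("bok", 1)` would return "book" and "boo", in an order that depends on insertion order. A real auto-corrector would probably rank the candidates, for example by word frequency.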

answered Oct 04 '22 10:10

Pawan

You can do autocomplete by looking at all the strings in a given subtree. A score of some kind may help you pick among them. It works like this: if you have "te", you go down that path in the trie and then traverse the entire subtree there to get all the possible endings.
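A minimal C++ sketch of that prefix walk followed by a subtree traversal might look like the following. The names `TrieNode`, `Trie`, `insert`, and `complete` are hypothetical, scoring is left out, and C++17 is assumed.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical trie node; std::map keeps children ordered, so
// completions come out in lexicographic order.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool isWord = false;
};

class Trie {
public:
    void insert(const std::string& word) {
        TrieNode* node = &root_;
        for (char c : word) {
            auto& child = node->children[c];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->isWord = true;
    }

    // Returns every stored word that starts with prefix.
    std::vector<std::string> complete(const std::string& prefix) const {
        const TrieNode* node = &root_;
        for (char c : prefix) {  // walk down the path spelled by the prefix
            auto it = node->children.find(c);
            if (it == node->children.end()) return {};
            node = it->second.get();
        }
        std::vector<std::string> out;
        std::string current = prefix;
        collect(node, current, out);  // gather the whole subtree
        return out;
    }

private:
    static void collect(const TrieNode* node, std::string& current,
                        std::vector<std::string>& out) {
        assert(node != nullptr);
        if (node->isWord) out.push_back(current);
        for (const auto& [c, child] : node->children) {
            current.push_back(c);
            collect(child.get(), current, out);
            current.pop_back();
        }
    }

    TrieNode root_;
};
```

After inserting "tea", "ten", "test", and "to", `complete("te")` would return "tea", "ten", and "test". To rank suggestions, one option is to store a frequency count in each word node and sort the collected results by it.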

For corrections, you need to implement something like Levenshtein distance (http://en.wikipedia.org/wiki/Levenshtein_distance) over the tree. You can use the fact that once you have processed a given path in the trie, you can reuse the result for all the strings in the subtree rooted at the end of that path.
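One common way to realize this reuse is to carry a single row of the Levenshtein table down the trie, so that each node's row is computed from its parent's row and is shared by every word below it. The sketch below is one possible reading of the idea, not necessarily what the answer had in mind. The names `FuzzyNode`, `insertWord`, `visit`, and `fuzzySearch` are hypothetical, and C++17 is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct FuzzyNode {
    std::map<char, std::unique_ptr<FuzzyNode>> children;
    bool isWord = false;
};

void insertWord(FuzzyNode& root, const std::string& word) {
    FuzzyNode* node = &root;
    for (char c : word) {
        auto& child = node->children[c];
        if (!child) child = std::make_unique<FuzzyNode>();
        node = child.get();
    }
    node->isWord = true;
}

// prevRow[j] is the edit distance between the path to the parent node
// and the first j characters of target. The row computed here is reused
// by every word in this node's subtree.
static void visit(const FuzzyNode& node, char edge, std::string& path,
                  const std::string& target, const std::vector<int>& prevRow,
                  int maxDist, std::vector<std::pair<std::string, int>>& out) {
    const std::size_t cols = target.size() + 1;
    std::vector<int> row(cols);
    row[0] = prevRow[0] + 1;
    for (std::size_t j = 1; j < cols; ++j) {
        int insertCost = row[j - 1] + 1;
        int deleteCost = prevRow[j] + 1;
        int replaceCost = prevRow[j - 1] + (target[j - 1] == edge ? 0 : 1);
        row[j] = std::min({insertCost, deleteCost, replaceCost});
    }
    path.push_back(edge);
    if (node.isWord && row.back() <= maxDist) out.emplace_back(path, row.back());
    // Prune: if every entry exceeds maxDist, no descendant can match.
    if (*std::min_element(row.begin(), row.end()) <= maxDist) {
        for (const auto& [c, child] : node.children)
            visit(*child, c, path, target, row, maxDist, out);
    }
    path.pop_back();
}

// Returns (word, distance) pairs for stored words within maxDist edits
// of target. The empty word is not handled in this sketch.
std::vector<std::pair<std::string, int>> fuzzySearch(const FuzzyNode& root,
                                                     const std::string& target,
                                                     int maxDist) {
    assert(maxDist >= 0);
    std::vector<int> firstRow(target.size() + 1);
    for (std::size_t j = 0; j < firstRow.size(); ++j) firstRow[j] = static_cast<int>(j);
    std::vector<std::pair<std::string, int>> out;
    std::string path;
    for (const auto& [c, child] : root.children)
        visit(*child, c, path, target, firstRow, maxDist, out);
    return out;
}
```

With "test", "tent", "team", and "toast" stored, `fuzzySearch(root, "tezt", 1)` would report "tent" and "test", each at distance 1. The pruning step is what keeps the search from visiting the whole trie when `maxDist` is small.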

answered Oct 04 '22 08:10

Sorin


Tags:

c++

algorithm

data-structures

tree