Algorithms and Data Structures best suited for a spell checker, dictionary and a thesaurus

2 Answers

I can see no better data structure than a trie for the dictionary and the thesaurus. If needed, both fit in one structure: each node that ends a word holds one link to the word's meaning (dictionary) and one to its synonyms (thesaurus). A trie can even offer a simple form of autocompletion (keep following the path for as long as a node has only one outgoing link).
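To make the combined structure concrete, here is a minimal sketch of such a trie in C++14. The names `TrieNode`, `Trie`, `insert`, `find` and `autocomplete` are made up for illustration, and storing the meaning and synonyms directly in the node (rather than behind links) is a simplifying assumption.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// One node per character. A stored word ends at a node where is_word is true.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool is_word = false;
    std::string meaning;                // dictionary payload (assumed inline)
    std::vector<std::string> synonyms;  // thesaurus payload (assumed inline)
};

class Trie {
public:
    // Adds a word together with its meaning and synonyms.
    void insert(const std::string& word, const std::string& meaning,
                const std::vector<std::string>& synonyms) {
        TrieNode* node = &root_;
        for (char c : word) {
            auto& child = node->children[c];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->is_word = true;
        node->meaning = meaning;
        node->synonyms = synonyms;
    }

    // Returns the node for a stored word, or nullptr if the word is unknown.
    const TrieNode* find(const std::string& word) const {
        const TrieNode* node = walk(word);
        return (node && node->is_word) ? node : nullptr;
    }

    // Extends the prefix while the path is unambiguous: exactly one child
    // and no word ending at the current node. Returns "" for unknown prefixes.
    std::string autocomplete(const std::string& prefix) const {
        const TrieNode* node = walk(prefix);
        if (!node) return "";
        std::string result = prefix;
        while (!node->is_word && node->children.size() == 1) {
            const auto& only = *node->children.begin();
            result.push_back(only.first);
            node = only.second.get();
        }
        return result;
    }

private:
    // Follows the characters of s from the root; nullptr if the path breaks.
    const TrieNode* walk(const std::string& s) const {
        const TrieNode* node = &root_;
        for (char c : s) {
            auto it = node->children.find(c);
            if (it == node->children.end()) return nullptr;
            node = it->second.get();
        }
        return node;
    }

    TrieNode root_;
};
```

With "hello", "help" and "cat" inserted, `autocomplete("c")` can run all the way to "cat", while `autocomplete("he")` stops at "hel" because the path branches there. A `std::map` per node keeps the sketch short; a fixed array or a compressed trie would likely be the better choice for a large word list.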

The spelling corrector is a bit trickier, since it has to map incorrect input to some kind of correct input. You can take this link as a start: http://en.wikipedia.org/wiki/Spell_checker. At the end you'll find links to papers about different algorithms. According to the Wikipedia article, this paper describes the most successful algorithm: Andrew Golding and Dan Roth's "Winnow-based spelling correction algorithm".
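As a much simpler baseline than the Winnow-based approach (and not a stand-in for it), mapping incorrect input to a correct word can be sketched by ranking dictionary words by Levenshtein edit distance. The function names `edit_distance` and `suggest` and the default cutoff of 2 are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Levenshtein distance via dynamic programming, keeping only two rows.
std::size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            // Cheapest of: deletion, insertion, substitution (or match).
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Returns the known word closest to the input, or "" if no word is within
// max_distance edits. Ties go to the word that appears first in the list.
std::string suggest(const std::string& input,
                    const std::vector<std::string>& words,
                    std::size_t max_distance = 2) {
    std::string best;
    std::size_t best_dist = max_distance + 1;
    for (const auto& w : words) {
        std::size_t d = edit_distance(input, w);
        if (d < best_dist) {
            best_dist = d;
            best = w;
        }
    }
    return best;
}
```

This scans the whole word list for every query, which is fine for a small dictionary but slow for a large one. It also ignores context and word frequency, which is exactly where the algorithms in the linked papers do better.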

193

answered Nov 15 '22 13:11

Tobias Langner

See this for a 21-line Python 2.5 spelling corrector and a bit of background.

answered Nov 15 '22 13:11

Anton Gogolev


Tags:

c++

c

algorithm

data-structures

Vivek Sharma
