I am looking for a relatively quick way to check whether words are misspelled, either using a gem or API.
I've tried using several gems -- raspell, ffi-aspell, hunspell-ffi, spell_cheker, and spellchecker -- and each has a different error.
I'm pretty new to Ruby and hoping for a simple solution that doesn't involve building something from scratch (I'm processing a lot of short text files and want to calculate the % of words misspelled).
When trying ffi-aspell, I get the following error:
/Users/ntaylorthompson/.rvm/gems/ruby-1.9.2-p320/gems/ffi-aspell-0.0.3/lib/ffi/aspell/speller.rb:121: [BUG] Segmentation fault
ruby 1.9.2p320 (2012-04-20 revision 35421) [x86_64-darwin11.4.0]
-- control frame ----------
c:0005 p:---- s:0019 b:0019 l:000018 d:000018 CFUNC :speller_check
c:0004 p:0113 s:0013 b:0013 l:000012 d:000012 METHOD /Users/ntaylorthompson/.rvm/gems/ruby-1.9.2-p320/gems/ffi-aspell-0.0.3/lib/ffi/aspell/speller.rb:121
c:0003 p:0049 s:0007 b:0007 l:0005a8 d:0005d0 EVAL ffi-aspell_test.rb:5
c:0002 p:---- s:0004 b:0004 l:000003 d:000003 FINISH
c:0001 p:0000 s:0002 b:0002 l:0005a8 d:0005a8 TOP
---------------------------
-- Ruby level backtrace information ----------------------------------------
ffi-aspell_test.rb:5:in `<main>'
/Users/ntaylorthompson/.rvm/gems/ruby-1.9.2-p320/gems/ffi-aspell-0.0.3/lib/ffi/aspell/speller.rb:121:in `correct?'
/Users/ntaylorthompson/.rvm/gems/ruby-1.9.2-p320/gems/ffi-aspell-0.0.3/lib/ffi/aspell/speller.rb:121:in `speller_check'
-- C level backtrace information -------------------------------------------
[NOTE]
You may have encountered a bug in the Ruby interpreter or extension libraries.
Bug reports are welcome.
For details: http://www.ruby-lang.org/bugreport.html
Abort trap: 6
I'd appreciate either (1) a suggestion of an alternative approach to those above or (2) a recommendation of which of the 5 gems above to use -- so I can at least spend time debugging the best option.
raspell is no longer maintained, so ffi-aspell is a good option if you have the libaspell headers available.
If you can't get the libraries to work, you can just shell out to the aspell
binary. The following method will do just that (unit tests included):
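require 'shellwords'

# Returns the fraction of misspelled words in the file (0.0 to 1.0);
# multiply by 100 for a percentage.
def spellcheck(filename)
  fail "File #{filename} does not exist" unless File.exist?(filename)
  # Escape the path so spaces and shell metacharacters are safe in backticks
  path  = Shellwords.escape(filename)
  words = Float(`wc -w #{path}`.split.first)
  wrong = Float(`cat #{path} | aspell --list | wc -l`.split.first)
  return 0.0 if words.zero? # empty file: avoid 0.0 / 0.0 (NaN)
  wrong / words
end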
if $0 == __FILE__
  require 'minitest/autorun'
  require 'tempfile'

  describe :spellcheck do
    # Write str to the temp file and flush it so the shell commands can see it
    def write(str)
      @file.write str
      @file.flush
    end

    before do
      @file = Tempfile.new('document')
    end

    it 'fails when given a bad path' do
      -> { spellcheck('/tmp/does/not/exist') }.must_raise RuntimeError
    end

    it 'returns 0.0 if there are no misspellings' do
      write 'The quick brown fox'
      spellcheck(@file.path).must_equal 0.0
    end

    it 'returns 0.5 if 2/4 words are misspelled' do
      write 'jumped over da lacie'
      spellcheck(@file.path).must_be_close_to 0.5, 1e-8
    end

    it 'returns 1.0 if everything is misspelled' do
      write 'Da quyck bown foxx jmped oer da lassy dogg'
      spellcheck(@file.path).must_be_close_to 1.0, 1e-8
    end

    after do
      @file.close
      @file.unlink
    end
  end
end
spellcheck() assumes you have cat, wc, and aspell on your path, and that the default dictionary is what you want to use. The unit test is for Ruby 1.9 only -- if you're running 1.8, just delete it.
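If you'd rather not build shell strings at all, something along these lines might also work. It is only a sketch and not part of the tested method above: misspelled_words and misspelled_percentage are names I made up, and it assumes that `aspell list` reads text on stdin and prints each unrecognised word on its own line. The command is a parameter so you can swap in a different checker (or other aspell options) without touching the rest.

```ruby
require 'open3'

# Hypothetical helper: returns the words that +command+ reports as
# misspelled in +text+, one entry per line of the command's output.
# The default, `aspell list`, is assumed to be on your PATH.
def misspelled_words(text, command = %w[aspell list])
  # capture2 feeds +text+ to the command's stdin and collects its stdout,
  # so no temp file or shell quoting is needed.
  output, status = Open3.capture2(*command, stdin_data: text)
  fail "#{command.join(' ')} exited with #{status.exitstatus}" unless status.success?
  output.split("\n").reject(&:empty?)
end

# Percentage (0 to 100) of misspelled words in +text+. Words are counted
# with a plain whitespace split, the same notion of "word" as `wc -w`.
def misspelled_percentage(text, command = %w[aspell list])
  total = text.split.size
  return 0.0 if total.zero? # empty input: nothing to check
  100.0 * misspelled_words(text, command).size / total
end
```

For a batch of files you could then loop over them, e.g. Dir['docs/*.txt'].each { |f| puts "#{f}: #{misspelled_percentage(File.read(f))}" } (the docs/*.txt glob is just a placeholder for wherever your files live).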