
VIM thesaurus file

Tags:

vim

thesaurus

I have been poking around for a good thesaurus solution for Vim. The capability is built in, but the file everyone seems to use is mthesaur.txt. While it "works" in the sense that the insert-mode commands bring up a list, the results seem to me programmatically correct but not very useful. The online thesaurus plugin for Vim works very well, but the network latency and the need for a split to hold the returned buffer are less than ideal. Does anyone have an opinion about this?

Thad Brown asked Oct 31 '15 15:10


3 Answers

I have written a plugin that can address the two issues you raised here.

Multi-language Thesaurus Query plugin for Vim

It improves the experience in two respects: a more sensible mechanism for choosing synonyms, and better, more flexible synonym sources.

Thesaurus_query.vim screen cast

By default, the plugin displays candidates in Vim's message box, with each synonym labeled by a number, and lets the user replace the word under the cursor by typing the number of the chosen synonym. This works much like Vim's default spelling-correction prompt and drastically reduces the time needed to pick a suitable synonym from a long list of candidates.

To improve the quality of the synonym candidates, the plugin uses multiple query backends. For English users, two are noteworthy.

  • thesaurus_com Backend using Thesaurus.com as synonym source
  • mthesaur_txt Backend using mthesaur.txt as synonym source

The thesaurus_com backend works straight away. For the local mthesaur_txt backend to work, you need to download mthesaur.txt and tell the plugin where it is located, either by setting Vim's thesaurus option or by setting the variable g:tq_mthesaur_file. Otherwise, only the online backend will be functional.

By default, the online backend is queried first. If the internet connection is unavailable or too slow, later queries in the current Vim session go to the local backend first to reduce latency. The priority of the two backends can also be changed manually (see the documentation).

To address the latency issue (which is most noticeable when the word is not found), I have introduced a timeout mechanism. You may set

let g:tq_online_backends_timeout = 0.6

if your internet connection is reasonably fast, so that the latency stays under 0.6 seconds.

The plugin is written in Python, though, so you need to use it with a Vim compiled with Python and/or Python3 support.
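
The two settings named above could be combined in a vimrc roughly as follows. This is only a sketch: the variable names come from this answer, while the file location is a placeholder chosen for illustration.

```vim
" Sketch only: the path below is a placeholder, adjust it to your system.
" Tell the plugin where the local copy of mthesaur.txt lives.
let g:tq_mthesaur_file = expand('~/.vim/thesaurus/mthesaur.txt')

" Give up on the online backends after 0.6 seconds.
let g:tq_online_backends_timeout = 0.6
```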

Chong answered Nov 08 '22 03:11


If your system is Unix-like and you have awk installed, then I have a simple solution to your problem that gives you access to thesauri in multiple languages without an internet connection and without a split window.

First, download LibreOffice thesauri, for example from:

https://cgit.freedesktop.org/libreoffice/dictionaries/tree/

(Look for the th_*.dat files; these are the ones you need, not the .aff and .dic files, which work only for spellchecking with Hunspell.) Download the *.dat thesauri of your liking and copy them to a subdirectory of the folder where you will put your plugin; this subdirectory should be called "thes".

Now create a new file in your plugin folder (the folder where you should have the "thes" subdirectory with the *.dat thesauri inside) and put the following in this file:

" offer choice among installed thesauri
" ==================================================
let s:thesaurusPath = expand("<sfile>:p:h") . "/thes"

function! s:PickThesaurus()
    " 1, 1: do not apply 'suffixes' and 'wildignore'; return a list
    let thesaurusList = glob(s:thesaurusPath . "/*", 1, 1)
    if len(thesaurusList) == 0
        echo "Nothing found in " . s:thesaurusPath
        return
    endif
    let index = 0
    let optionList = []
    for name in thesaurusList
        let index = index + 1
        let shortName = fnamemodify(name, ":t:r")
        let optionList += [index . ". " . shortName]
    endfor
    let choice = inputlist(["Select thesaurus:"] + optionList)
    let indexFromZero = choice - 1
    if (indexFromZero >= 0) && (indexFromZero < len(thesaurusList))
        let b:thesaurus = thesaurusList[indexFromZero]
    endif
endfunction

command! Thesaurus call s:PickThesaurus()

This will allow you to pick the thesaurus of your choice by typing :Thesaurus in Vim's command-line mode.

(Actually, if you plan to use only one thesaurus then you don't need any of this; just assign the full name of your thesaurus file to the buffer-local variable, b:thesaurus).
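
For the single-thesaurus case, one possible way to set the variable automatically is sketched below. The file name and location are hypothetical, and the autocommand group name is arbitrary.

```vim
" Hypothetical location of a single LibreOffice thesaurus file.
let s:singleThesaurus = expand('~/.vim/plugin/thes/th_en_US_v2.dat')

augroup SingleThesaurus
    autocmd!
    " b:thesaurus is the buffer-local variable read by the script in this
    " answer, not Vim's built-in 'thesaurus' option.
    autocmd BufEnter * if !exists('b:thesaurus') | let b:thesaurus = s:singleThesaurus | endif
augroup END
```

The exists() guard leaves any thesaurus already picked with :Thesaurus untouched.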

Finally, add the following to your plugin file:

" run awk on external thesaurus to find synonyms
" ==================================================
function! OmniComplete(findstart, base)
    if ! exists("b:thesaurus")
        " no thesaurus chosen: cancel silently (-3) or offer no matches
        return a:findstart ? -3 : []
    endif
    if a:findstart
        " first, must find word
        let line = getline('.')
        let wordStart = col('.') - 1
        " check backward, accepting only non-white space
        while wordStart > 0 && line[wordStart - 1] =~ '\S'
            let wordStart -= 1
        endwhile
        return wordStart
    else
        " a word with single quotes would produce a shell error
        if match(a:base, "'") >= 0
            return []
        endif
        let searchPattern = '/^' . tolower(a:base) . '\|/'
        " search pattern is single-quoted; shellescape() quotes the file name
        let thesaurusMatch = system('awk'
            \ . " '" . searchPattern . ' {printf "%s", NR ":" $0}' . "'"
            \ . ' ' . shellescape(b:thesaurus)
        \)
        if thesaurusMatch == ''
            return []
        endif
        " line info was returned by awk
        let matchingLine = substitute(thesaurusMatch, ':.*$', '', '')
        " entry count was in the thesaurus itself, right of |
        let entryCount = substitute(thesaurusMatch, '^.*|', '', '')
        let firstEntry = matchingLine + 1
        let lastEntry = matchingLine + entryCount
        " shellescape() protects file names that contain quotes or spaces
        let rawOutput = system('awk'
            \ . " '" . ' NR == ' . firstEntry . ', NR == ' . lastEntry
            \ . ' {printf "%s", $0}' . "'"
            \ . ' ' . shellescape(b:thesaurus)
        \)
        " remove dash tags if any
        let rawOutput = substitute(rawOutput, '^-|', '', '')
        let rawOutput = substitute(rawOutput, '-|', '|', 'g')
        " remove grammatical tags if any
        let rawOutput = substitute(rawOutput, '(.\{-})', '', 'g')
        " clean spaces left by tag removal
        let rawOutput = substitute(rawOutput, '^ *|', '', '')
        let rawOutput = substitute(rawOutput, '| *|', '|', 'g')
        let listing = split(rawOutput, '|')
        return listing
    endif
endfunction

" configure completion
" ==================================================
set omnifunc=OmniComplete
set completeopt=menuone

This will allow you to get the synonyms of any word you type in insert mode. While still in insert mode, press Ctrl-X Ctrl-O (or any key combination you mapped on omnicompletion) and a popup menu will show up with the synonym list.
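
If awk is not available, the same lookup can be sketched with Vim built-ins only. This is an illustration, not part of the script above. It assumes the .dat layout that the awk version relies on (a "word|N" line followed by N lines of "tag|synonym|synonym|..."), and the function name ThesLookup is made up for the example.

```vim
" Sketch: look up a:word in a LibreOffice th_*.dat file with Vim built-ins.
" Assumes the layout 'word|N' followed by N lines of '(tag)|syn1|syn2|...'.
" ThesLookup is a hypothetical name, not part of the script above.
function! ThesLookup(file, word) abort
    let lines = readfile(a:file)
    let prefix = tolower(a:word) . '|'
    " index 0 holds the encoding name, so entries start at index 1
    let i = 1
    while i < len(lines)
        if stridx(tolower(lines[i]), prefix) == 0
            " the entry count sits right of the | on the headword line
            let entryCount = str2nr(strpart(lines[i], strlen(prefix)))
            let synonyms = []
            for meaning in lines[i + 1 : i + entryCount]
                " drop the first field (grammatical tag or dash)
                for item in split(meaning, '|')[1:]
                    " strip inline tags in parentheses, then trim spaces
                    let clean = substitute(item, '(.\{-})', '', 'g')
                    let clean = substitute(clean, '^\s\+\|\s\+$', '', 'g')
                    if !empty(clean)
                        call add(synonyms, clean)
                    endif
                endfor
            endfor
            return synonyms
        endif
        let i += 1
    endwhile
    return []
endfunction
```

It can be tried with :echo ThesLookup(b:thesaurus, 'simple'), or called from the second phase of the completion function as return ThesLookup(b:thesaurus, a:base). Reading the whole file on every lookup may be slow for large thesauri, so caching the result of readfile() would be a natural refinement.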

This solution is very crude as compared to Chong's powerful plugin (see above), but it is lightweight and works well enough for me. I use it with thesauri in four different languages.

François Tonneau answered Nov 08 '22 02:11


Here is a script for ~/.vimrc. To search for synonyms, it needs the file thesaurii.txt (merged dictionaries from https://github.com/moshahmed/vim/blob/master/thesaurus/thesaurii.txt) and perl.exe in the path. The script was tested on Win7 with Cygwin perl.

If no synonyms are found, it calls aspell to do spell correction. See https://stackoverflow.com/a/53825144/476175 on how to call this function by pressing [tab].

set thesaurus=thesaurii.txt
let s:thesaurus_pat = "thesaurii.txt"

set completeopt+=menuone
set omnifunc=MoshThesaurusOmniCompleter
function!    MoshThesaurusOmniCompleter(findstart, base)
    " == First call: find-space-backwards, see :help omnifunc
    if a:findstart
        let s:line = getline('.')
        let s:wordStart = col('.') - 1
        " Check backward, accepting only non-white space
        while s:wordStart > 0 && s:line[s:wordStart - 1] =~ '\S'
            let s:wordStart -= 1
        endwhile
        return s:wordStart

    else
        " == Second call: perl grep thesaurus for word_before_cursor, output: comma separated wordlist
        " == Test: so % and altitude[press <C-x><C-o>]
        " a: variables are read-only, so keep the sanitized word in a script-local variable
        let s:word_before_cursor = substitute(a:base,'\W','.','g')
        let s:cmd='perl -ne ''chomp; '
                    \.'next if m/^[;#]/;'
                    \.'print qq/$_,/ if '
                      \.'/\b'.s:word_before_cursor.'\b/io; '' '
                    \.s:thesaurus_pat
        " == To: Debug perl grep cmd, redir to file and echom till redir END.
        " redir >> c:/tmp/vim.log
        " echom s:cmd
        let   s:rawOutput = substitute(system(s:cmd), '\n\+$', '', '')
        " echom s:rawOutput
        let   s:listing = split(s:rawOutput, ',')
        " echom join(s:listing,',')
        " redir END
        if len(s:listing) > 0
          return s:listing
        endif

        " Try spell correction with aspell: echo mispeltword | aspell -a
        let s:cmd2 ='echo '.s:word_before_cursor
            \.'|aspell -a'
            \.'|perl -lne ''chomp; next unless s/^[&]\s.*?:\s*//;  print '' '
        let   s:rawOutput2 = substitute(system(s:cmd2), '\n\+$', '', '')
        let   s:listing2 = split(s:rawOutput2, ',\s*')
        if len(s:listing2) > 0
          return s:listing2
        endif

        " Search dictionary without word delimiters.
        let s:cmd3='perl -ne ''chomp; '
                    \.'next if m/^[;#]/;'
                    \.'print qq/$_,/ if '
                      \.'/'.s:word_before_cursor.'/io; '' '
                    \.&dictionary
        let   s:rawOutput3 = substitute(system(s:cmd3), '\n\+$', '', '')
        let   s:listing3 = split(s:rawOutput3, ',\s*')
        if len(s:listing3) > 0
          return s:listing3
        endif

        " Don't return empty list
        return [s:word_before_cursor, '(no synonyms or spell correction)']

    endif
endfunction

7 revs answered Nov 08 '22 02:11