What Lucene analyzer can be used to handle Japanese text?

Question

Which Lucene analyzer can handle Japanese text properly? It should be able to handle Kanji, Hiragana, Katakana, Romaji, and any combination of them.

adrianbanks · Accepted Answer

Take a look at the CJK package in the contrib area of Lucene. It provides an analyzer and a tokenizer (CJKAnalyzer and CJKTokenizer) built specifically for Chinese, Japanese, and Korean text.
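To give a feel for what that kind of analysis does: Lucene's CJK analyzer indexes CJK text as overlapping two-character tokens (bigrams) rather than trying to find word boundaries. The sketch below imitates that idea in plain Java with no Lucene dependency. It is only an illustration under my own assumptions, not Lucene's actual code: the class name `CjkBigramSketch` and its methods are made up for this example, and the rules for what counts as a CJK character are deliberately simplified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
class CjkBigramSketch {

    // Simplified assumption: Kanji (Han), Hiragana, Katakana, plus the
    // prolonged sound mark U+30FC, which Unicode assigns to script COMMON.
    static boolean isCjk(int cp) {
        Character.UnicodeScript s = Character.UnicodeScript.of(cp);
        return s == Character.UnicodeScript.HAN
                || s == Character.UnicodeScript.HIRAGANA
                || s == Character.UnicodeScript.KATAKANA
                || cp == 0x30FC;
    }

    // Splits text into lowercased non-CJK words and overlapping CJK bigrams.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int[] cps = text.codePoints().toArray();
        int i = 0;
        while (i < cps.length) {
            int start = i;
            if (isCjk(cps[i])) {
                // Consume a run of CJK characters and emit its bigrams.
                while (i < cps.length && isCjk(cps[i])) {
                    i++;
                }
                emitBigrams(cps, start, i, tokens);
            } else if (Character.isLetterOrDigit(cps[i])) {
                // Consume a run of other letters/digits (e.g. Romaji) as one word.
                while (i < cps.length && !isCjk(cps[i])
                        && Character.isLetterOrDigit(cps[i])) {
                    i++;
                }
                tokens.add(new String(cps, start, i - start).toLowerCase(Locale.ROOT));
            } else {
                i++; // whitespace and punctuation act as separators
            }
        }
        return tokens;
    }

    // A run of length 1 is kept as a single-character token.
    static void emitBigrams(int[] cps, int start, int end, List<String> out) {
        if (end - start == 1) {
            out.add(new String(cps, start, 1));
            return;
        }
        for (int j = start; j + 1 < end; j++) {
            out.add(new String(cps, j, 2));
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Romaji 日本語"));
    }
}
```

With this sketch, `tokenize("日本語")` yields the two tokens 日本 and 本語. Bigrams like these are language-agnostic and need no dictionary, which is why they work for mixed Kanji/Kana input, but they do not give you real word segmentation.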

Hakanai · Answer

I found lucene-gosen while searching for my own purposes.

Their example looks fairly decent, but I guess it's the kind of thing that needs extensive testing. I'm also worried about their backwards-compatibility policy (or rather, the complete lack of one).

Tags:

java

lucene

internationalization

analyzer

Franz See
