
Get TermAttribute in TokenStream Lucene.Net

I use Lucene.NET 3.0.3. How do I get the TermAttribute from a TokenStream? I have tried my best, but I can't get it.

Here is my source:

    Analyzer analyzer = new Lucene.Net.Analysis.Snowball.SnowballAnalyzer(Lucene.Net.Util.Version.LUCENE_30, "English",stopword);

    TokenStream tokenStream = analyzer.TokenStream("English", new StringReader("How to get TermAttribute"));

    while (tokenStream.IncrementToken())
    {
         // ??? How do I get the TermAttribute here?
    }

NetS asked Apr 29 '13 08:04

2 Answers

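    // Fetch the attribute once, before the loop; its value changes on every IncrementToken() call.
    var termAttr = tokenStream.GetAttribute<Lucene.Net.Analysis.Tokenattributes.ITermAttribute>();

    while (tokenStream.IncrementToken())
    {
        string term = termAttr.Term;   // text of the current token
    }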

I4V answered Sep 29 '22 22:09


In Apache Lucene.Net 4.8 (.NET Core 2+) you can use the C# code below, where term = termAttr.ToString() holds the current token as a string. The complete method, PrintTokens(Analyzer analyzer, string fieldName, string text), is available on GitHub in msigut/LuceneNet48Demo.

    var tokenStream = analyzer.GetTokenStream(fieldName, textToAnalyze);
    var termAttr = tokenStream.GetAttribute<ICharTermAttribute>();

    tokenStream.Reset();

    while (tokenStream.IncrementToken())
    {
        string term = termAttr.ToString();
    }
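
As a fuller illustration of the loop above, here is a minimal sketch for Lucene.Net 4.8 that collects the terms into a list. The class name TokenDemo, the helper name GetTokens, the field name "content", and the choice of WhitespaceAnalyzer are assumptions made for this example and are not taken from the linked repository. It also assumes the Lucene.Net.Analysis.Common package is referenced. Compared with the snippet above, it adds an End() call and using blocks so the stream and analyzer are released when done.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;            // WhitespaceAnalyzer (Lucene.Net.Analysis.Common package)
using Lucene.Net.Analysis.TokenAttributes; // ICharTermAttribute
using Lucene.Net.Util;                     // LuceneVersion

public static class TokenDemo
{
    // Hypothetical helper: returns the terms an analyzer produces for the given text.
    public static List<string> GetTokens(Analyzer analyzer, string fieldName, string text)
    {
        var tokens = new List<string>();

        // TokenStream is IDisposable; disposing it releases the stream's resources.
        using (TokenStream tokenStream = analyzer.GetTokenStream(fieldName, text))
        {
            // AddAttribute returns the existing attribute instance, or registers one if absent.
            ICharTermAttribute termAttr = tokenStream.AddAttribute<ICharTermAttribute>();

            tokenStream.Reset();                 // must be called before the first IncrementToken()
            while (tokenStream.IncrementToken())
            {
                tokens.Add(termAttr.ToString()); // copy the current term out of the shared buffer
            }
            tokenStream.End();                   // end-of-stream bookkeeping
        }

        return tokens;
    }

    public static void Main()
    {
        // WhitespaceAnalyzer is an assumption for the demo; any Analyzer can be passed in.
        using (var analyzer = new WhitespaceAnalyzer(LuceneVersion.LUCENE_48))
        {
            foreach (string term in GetTokens(analyzer, "content", "How to get TermAttribute"))
            {
                Console.WriteLine(term);
            }
        }
    }
}
```

Copying each term with ToString() inside the loop matters because the attribute instance is reused for every token, so holding on to termAttr itself would only ever show the latest value.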

Martin answered Sep 30 '22 00:09
