
Character-wise confidence values using Tesseract 3.01

Tags:

tesseract

I executed the following code to generate character-wise confidence values:

int main(int argc, char **argv) {
    const char *lang = "eng";
    const PIX *pixs;
    if ((pixs = pixRead(argv[1])) == NULL) {
        cout << "Unsupported image type" << endl;
        exit(3);
    }
    TessBaseAPI api;
    api.SetVariable("save_blob_choices", "T");
    api.SetPageSegMode(tesseract::PSM_SINGLE_WORD);
    api.SetImage(pixs);
    int rc = api.Init(argv[0], lang);
    api.Recognize(NULL);
    ResultIterator* ri = api.GetIterator();
    if (ri != 0)
    {
        do
        {
            const char* symbol = ri->GetUTF8Text(RIL_SYMBOL);
            if (symbol != 0)
            {
                float conf = ri->Confidence(RIL_SYMBOL);
                cout << "\nnext symbol: " << symbol << " confidence: " << conf << "\n" << endl;
            }
            delete[] symbol;
        } while ((ri->Next(RIL_SYMBOL)));
    }
    return 0;
}

link to image

The output obtained for the above image was:

next symbol: N confidence: 72.3563
next symbol: B confidence: 72.3563
next symbol: E confidence: 69.9937
next symbol: T confidence: 69.9937
next symbol: R confidence: 69.9937
next symbol: A confidence: 69.9937
next symbol: N confidence: 69.9937
next symbol: G confidence: 69.9937
next symbol: - confidence: 69.9937
next symbol: I confidence: 69.9937

As is evident, the confidence values for characters belonging to the same word are identical. Is this the expected output? Shouldn't the confidence values differ for each character? I also tried running the code on a word in which each character was in a different font style, and yet the confidence value was still the same for all characters in that word.

Ekta asked Jun 19 '12 10:06


1 Answer

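The issue is that you're calling Init after the SetVariable call. SetVariable only takes effect once the API has been initialized, so save_blob_choices is never actually enabled and every symbol reports the confidence of the word it belongs to. Call Init first, then SetVariable (and likewise SetPageSegMode and SetImage).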
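For illustration, here is a minimal sketch of the question's program with the calls reordered so that Init comes first. It assumes the Tesseract 3.x and Leptonica headers are installed under the include paths shown (these vary by system) and that the "eng" traineddata can be located the same way as in the original code. The argument check, error messages and cleanup calls are additions that were not in the question.

```cpp
#include <iostream>
#include <tesseract/baseapi.h>         // include path is an assumption; adjust for your install
#include <tesseract/resultiterator.h>
#include <leptonica/allheaders.h>

int main(int argc, char** argv) {
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <image>" << std::endl;
        return 1;
    }

    Pix* pixs = pixRead(argv[1]);
    if (pixs == NULL) {
        std::cerr << "Unsupported image type" << std::endl;
        return 3;
    }

    tesseract::TessBaseAPI api;

    // Initialize first: variables set before Init do not take effect.
    if (api.Init(argv[0], "eng") != 0) {
        std::cerr << "Could not initialize tesseract" << std::endl;
        pixDestroy(&pixs);
        return 2;
    }

    // Only now configure the engine and hand it the image.
    api.SetVariable("save_blob_choices", "T");
    api.SetPageSegMode(tesseract::PSM_SINGLE_WORD);
    api.SetImage(pixs);
    api.Recognize(NULL);

    tesseract::ResultIterator* ri = api.GetIterator();
    if (ri != NULL) {
        do {
            char* symbol = ri->GetUTF8Text(tesseract::RIL_SYMBOL);
            if (symbol != NULL) {
                float conf = ri->Confidence(tesseract::RIL_SYMBOL);
                std::cout << "symbol: " << symbol
                          << " confidence: " << conf << std::endl;
                delete[] symbol;  // GetUTF8Text returns a new[]-allocated string
            }
        } while (ri->Next(tesseract::RIL_SYMBOL));
        delete ri;
    }

    api.End();
    pixDestroy(&pixs);
    return 0;
}
```

With this ordering the per-symbol confidences should no longer all collapse to the word-level value, although the exact numbers will depend on the image and the installed language data.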

Kaolin Fire answered Sep 27 '22 16:09