I executed the following code to generate character-wise confidence values:
int main(int argc, char **argv) {
    const char *lang = "eng";
    const PIX *pixs;
    if ((pixs = pixRead(argv[1])) == NULL) {
        cout << "Unsupported image type" << endl;
        exit(3);
    }
    TessBaseAPI api;
    api.SetVariable("save_blob_choices", "T");
    api.SetPageSegMode(tesseract::PSM_SINGLE_WORD);
    api.SetImage(pixs);
    int rc = api.Init(argv[0], lang);
    api.Recognize(NULL);
    ResultIterator* ri = api.GetIterator();
    if (ri != 0) {
        do {
            const char* symbol = ri->GetUTF8Text(RIL_SYMBOL);
            if (symbol != 0) {
                float conf = ri->Confidence(RIL_SYMBOL);
                cout << "\nnext symbol: " << symbol << " confidence: " << conf << "\n" << endl;
            }
            delete[] symbol;
        } while ((ri->Next(RIL_SYMBOL)));
    }
    return 0;
}
link to image
The output obtained for the above image was:
next symbol: N confidence: 72.3563
next symbol: B confidence: 72.3563
next symbol: E confidence: 69.9937
next symbol: T confidence: 69.9937
next symbol: R confidence: 69.9937
next symbol: A confidence: 69.9937
next symbol: N confidence: 69.9937
next symbol: G confidence: 69.9937
next symbol: - confidence: 69.9937
next symbol: I confidence: 69.9937
As is evident, the confidence values for characters belonging to the same word are the same. Is this the expected output? Shouldn't the confidence values be different for each character? I also tried running the code on a word in which each character was in a different font style, and yet the confidence value was still the same for all characters belonging to the same word.
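The issue is that you're calling Init after the SetVariable call, so the save_blob_choices setting is not in effect when Recognize runs. Call Init first, then SetVariable.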
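A minimal sketch of that ordering, in case it helps. The helper name initThenConfigure is hypothetical and not part of Tesseract; it is written as a template only so that it compiles without the Tesseract headers, and it assumes Api behaves like tesseract::TessBaseAPI (Init returning 0 on success, SetVariable returning false for a name it does not accept).

```cpp
#include <cstdio>

// Hypothetical helper, not part of Tesseract. Api is expected to behave like
// tesseract::TessBaseAPI and Mode like tesseract::PageSegMode.
template <typename Api, typename Mode>
bool initThenConfigure(Api& api, const char* datapath, const char* lang, Mode mode) {
    // Step 1: initialize the engine before touching any settings.
    // Assumption: Init returns 0 on success, as TessBaseAPI::Init does.
    if (api.Init(datapath, lang) != 0) {
        std::fprintf(stderr, "Init failed for language '%s'\n", lang);
        return false;
    }
    // Step 2: only now apply the settings from the question.
    // Assumption: SetVariable returns false when the name is not accepted.
    if (!api.SetVariable("save_blob_choices", "T")) {
        std::fprintf(stderr, "save_blob_choices was not accepted\n");
        return false;
    }
    api.SetPageSegMode(mode);
    return true;
}
```

In the question's main this would replace the SetVariable, SetPageSegMode and Init lines, for example initThenConfigure(api, argv[0], lang, tesseract::PSM_SINGLE_WORD), with SetImage and Recognize called afterwards. Checking the return values also shows whether Init or the variable name is rejected by the installed Tesseract version.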