I was doing some PDF text extraction. I have attached a screenshot of a scenario where I ran into a problem.


Why did the Eclipse console fail to print the word "specification"?
Instead, it is printed as "speci?cation".
In the PDF I can see that the characters "f" and "i" overlap.
But while debugging the code, the same text is shown without the question mark.
Is there any way to print the same text to the console?
Please help.
The problem is the "fi" ligature (the "overlapping letters"), which is a single character in Unicode (U+FB01). The debugging view uses the Windows methods for drawing text; these know about Unicode and can render the ligature correctly.
The console view, by contrast, uses a specific character encoding. On Windows the default is usually "Cp1252" (Windows code page 1252), which is closely related to ISO 8859-1. Neither encoding contains the ligature, so it cannot be printed and a question mark is substituted.
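To illustrate the substitution, here is a minimal sketch that encodes the word with code page 1252 and decodes it again. The class and method names (LigatureEncodingDemo, roundTrip) are made up for this example, and it assumes the JVM provides the "windows-1252" charset, which common JDKs do.

```java
import java.nio.charset.Charset;

public class LigatureEncodingDemo {
    // Encodes the text with the given charset and decodes it again.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which is '?' for windows-1252.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // \uFB01 is the single-character "fi" ligature
        String word = "speci\uFB01cation";
        Charset cp1252 = Charset.forName("windows-1252");

        // The ligature counts as one char, so the length is 12, not 13
        System.out.println(word.length());

        // false: code page 1252 has no mapping for the ligature
        System.out.println(cp1252.newEncoder().canEncode('\uFB01'));

        // prints "speci?cation"
        System.out.println(roundTrip(word, cp1252));
    }
}
```

The same round trip with StandardCharsets.UTF_8 returns the word unchanged, because UTF-8 can encode every Unicode character.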
You can set the encoding for Eclipse in general via Window > Preferences, General > Workspace, Text file encoding. While I think it is a good idea to use UTF-8 everywhere, changing this may lead to problems with existing files that were saved in another encoding.
You can set the encoding per project in the project properties, category Resource.
If you just want to set the encoding for the console view, which is the least invasive solution, the setting is not exactly easy to find. The console view encoding is a property of the run configuration you use for running your project: Run > Run Configurations..., your run configuration, Common.
When you use one of these methods to set the encoding to UTF-8, the ligature will be printed correctly in the console view.
Of course, the more general settings only take effect if they are not overridden by more specific ones (from general to specific: Workspace, Project, Run Configuration).
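As a rough way to see what the encoding setting changes, the sketch below prints the word through a PrintStream with an explicit encoding and captures the bytes instead of sending them to the console. The names Utf8ConsoleDemo and printAs are hypothetical; main also prints the JVM's default charset, which may help when checking which encoding a given launch actually ended up with.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class Utf8ConsoleDemo {
    // Prints the text through a PrintStream that uses the named encoding
    // and returns the bytes that would have been sent to the console.
    public static byte[] printAs(String text, String encoding) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(buffer, true, encoding);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Shows the charset the JVM was started with
        System.out.println("Default charset: " + Charset.defaultCharset());

        // With UTF-8 the ligature becomes the three bytes EF AC 81
        for (byte b : printAs("\uFB01", "UTF-8")) {
            System.out.printf("%02X ", b);
        }
        System.out.println();

        // With code page 1252 it becomes the single byte 3F, i.e. '?'
        for (byte b : printAs("\uFB01", "windows-1252")) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}
```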